Forschungsberichte aus der Industriellen Informationstechnik **28**

**Matthias Bächle**

Model-based Filtering of Interfering Signals

# **Model-based Filtering of Interfering Signals in Ultrasonic Time Delay Estimations**

Matthias Bächle

## **Model-based Filtering of Interfering Signals in Ultrasonic Time Delay Estimations**

#### **Forschungsberichte aus der Industriellen Informationstechnik**  Band 28

Institut für Industrielle Informationstechnik Karlsruher Institut für Technologie Hrsg. Prof. Dr.-Ing. Michael Heizmann

Eine Übersicht aller bisher in dieser Schriftenreihe erschienenen Bände finden Sie am Ende des Buchs.

# **Model-based Filtering of Interfering Signals in Ultrasonic Time Delay Estimations**

by Matthias Bächle

Karlsruher Institut für Technologie Institut für Industrielle Informationstechnik

Model-based Filtering of Interfering Signals in Ultrasonic Time Delay Estimations

Zur Erlangung des akademischen Grades eines Doktors der Ingenieurwissenschaften von der KIT-Fakultät für Elektrotechnik und Informationstechnik des Karlsruher Instituts für Technologie (KIT) genehmigte Dissertation

von Matthias Bächle, M.Sc.

Tag der mündlichen Prüfung: 2. Mai 2022 Hauptreferent: Prof. Dr.-Ing. Michael Heizmann, KIT Korreferent: Prof. Dipl.-Ing. Dr. Bernhard G. Zagar, JKU Linz

**Impressum**

Karlsruher Institut für Technologie (KIT) KIT Scientific Publishing Straße am Forum 2 D-76131 Karlsruhe

KIT Scientific Publishing is a registered trademark of Karlsruhe Institute of Technology. Reprint using the book cover is not allowed.

www.ksp.kit.edu

*This document – excluding parts marked otherwise, the cover, pictures and graphs – is licensed under a Creative Commons Attribution-Share Alike 4.0 International License (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/deed.en*

*The cover page is licensed under a Creative Commons Attribution-No Derivatives 4.0 International License (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/deed.en*

Print on Demand 2023 – Gedruckt auf FSC-zertifiziertem Papier

ISSN 2190-6629 ISBN 978-3-7315-1252-3 DOI 10.5445/KSP/1000152022

# Preface

This thesis, which was written during my time as a research assistant at the Institute of Industrial Information Technology (IIIT) of the Karlsruhe Institute of Technology (KIT), would not have been possible without the contribution of numerous people.

First of all, I would like to thank Prof. Dr.-Ing. Fernando Puente León, who laid the foundations for this work with his passionate enthusiasm for signal processing and his crucial thought-provoking ideas. I will always remember his joy about scientific progress.

Special thanks go to Prof. Dr.-Ing. Michael Heizmann for taking over the main lectureship and for his continued support during the final phase of the promotion. Furthermore, I would like to thank Prof. Dipl.-Ing. Dr. Bernhard Zagar for spontaneously taking over the co-review.

I would like to thank all my colleagues for the good cooperation and also for the time spent together at the institute. The always familiar environment and the research-oriented atmosphere have helped me in particular to find the necessary motivation to keep going even in difficult times of the doctorate. My special thanks go to Johannes Anastasiadis, Manuel Bihler, Daniel Diaz Ocampo, Muen Jin, Fabian Leven, Lanxiao Li, David Uhlig, Maximilian Schambach, Markus Schwabe, Erik Tabuchi Barczak and Hannes Weinreuter for proofreading parts of the manuscript. My further thanks go to all my students who spent many hours performing the necessary experiments to substantiate the findings of this thesis.

Last but not least, I am deeply grateful to my parents Andrea and Edgar, my sisters Ann-Kathrin and Isabell, my brother Volker and to my girlfriend Natascha for their continued support throughout the years and that they have endured the many discussions about ultrasonic systems.

Karlsruhe, November 2022 Matthias Bächle

# Contents





# Nomenclature

# Common abbreviations


Nomenclature


# Symbols

### Latin letters



Nomenclature



#### Greek letters


# Superscripts


# Subscripts


# Mathematical operators


# 1 Introduction

# 1.1 Ultrasonic time delay measurement systems

Most ultrasonic measurement technologies rely on the time delay estimation (TDE) of ultrasonic pulses. Depending on the context, the time delay is also known as time-of-flight or transit-time. In other words, an ultrasonic transducer generates a pulse, which propagates through a medium and gets either reflected back to the transducer operating in sensing mode or received by a different transducer. The resulting time delay can then be estimated and used to infer the desired measurand, e.g., the speed of sound (SoS), the distance to an object, or the velocity of a medium. Typical applications include, but are not limited to, non-destructive testing (NDT) [66], sonar [1], parking sensors [125] and ultrasonic flow measurements (UFM) [79]. As an example, figures 1.1(a) and 1.1(b) show the different principles of parking sensors and UFM, respectively. While UFM requires two transducers on opposite sides of the pipe, parking sensors can be realized using a single transducer operating alternatively as a transmitter and receiver.

Figure 1.1 Ultrasonic transit-time estimation applications.

As there are a myriad number of possible ultrasonic applications, the application-specific challenges are equally varied. However, all systems have in common that their accuracy is mainly determined by the measurement of the underlying time delay. The challenge of an accurate TDE can be approached in two ways. Either the signals are improved by adapting the design of the ultrasonic system, or the signals are subsequently enhanced by using advanced signal processing methods—a combination of both is also possible. Firstly, the possibilities by adapting the design are considered. As the design of an ultrasonic measurement system contains a lot of degrees of freedom, the following list outlines a few key points that can be varied:



Table 1.1 Typical frequency ranges for a selection of ultrasonic applications.

beamforming to reach better directional characteristics [1], or simply be connected with backing and electrodes and then placed inside a metal case. In the case of non-invasive transit-time UFM, so-called clamp-on sensors [114] that generate surface acoustic waves are often preferred.

While the development and improvement of ultrasonic systems require a lot of iterations and are, therefore, very time-consuming and expensive, the digital signal processing method can be easily adapted using a software update, even in the late stage of the development. Hence, the focus of this thesis lies on signal processing techniques for TDE in ultrasonic systems. Furthermore, this thesis places a special focus on the TDE accuracy and robustness against interfering signals and noise, as these properties are usually of high importance.

From a signal processing point of view, ultrasonic time delay measurements can be simplified into the question, how to get an accurate TDE from a 1D measurement signal <sup>d</sup> () and a reference signal <sup>r</sup> (). Besides signal distortion due to the acoustic transmission system, the most likely reasons that complicate the TDE are unwanted signals, such as noise or interfering signals, which are discussed in more detail in the next section.

## 1.2 Influence of interfering signals

Unwanted signals in ultrasonic measurements can be separated into external disturbances due to uncorrelated sound sources, interfering signals due to multipath propagation, and stochastic noise such as addi-

Figure 1.2 Interfering signals (red) due to multipath propagation in an NDT scenario.

tive white Gaussian noise (AWGN). A typical scenario with unwanted signals due to multipath propagation is depicted in Figure 1.2, where a homogeneous material is tested for cracks or other defects. Due to the impedance jump at the crack, the incident wave gets reflected towards the receiving transducer, which in turn allows locating the crack by estimating the time delay, if the SoS of the material is known. However, the signal created by the crack, namely the target signal, is superposed by a direct propagation between the transducers along the surface of the test specimen and by back wall reflections. An external disturbance could be caused by sound sources in the test environment, which makes them uncorrelated to the target signal. Additionally, as with any measurement, the recorded signals are subject to stochastic noise, caused by polarization noise or thermal noise of the piezoelectric ceramic as well as the thermal noise of the measurement electronics.

While the influence of AWGN can be reduced by averaging and uncorrelated sound sources can be suppressed by using modulated excitation signals, the separation of interfering signals is more difficult. Since they are highly correlated with the target signal, a modulated excitation signal shows the same effects on both the interfering and target signals, and therefore introduces no information to separate them. Furthermore, the interfering signals are in the same frequency range as the target signal and stationary in the sense that they do not vary over multiple measurements, which renders frequency filtering or averaging useless. If a short pulse is used as excitation and the transducer bandwidth is sufficiently high, the different signals can be separated by gating in the time domain (see Fig. 1.3(a)). Otherwise, as it can be seen in Fig. 1.3(b), the interfering

(a) Separable interfering signals. (b) Non-separable interfering signals.

Figure 1.3 Superposition scenarios of interfering signals (the measurement signals are colored in blue and the target signals in red).

signals are located in the same time and frequency range resulting in a distorted observed signal, in the literature often termed as non-separable interfering signals [69, 102].

The main problem arising from the interfering signals is their influence on the accuracy of the TDE. Since they show no measurement effect, the actual time delay of the target signal may be hidden under the time delay of the interfering signals, which induces a systematic estimation error. Depending on the estimation method, the error can be significant, if for example the target signal cannot be detected at all and the peak of the interfering signals is taken for the time delay of the target signal. Even if the level of the interfering signals is small compared to the target signal, the accuracy is considerably reduced. To illustrate this, a thought experiment of a TDE in the presence of interfering signals is conducted in the following. For simplicity, pure sine signals are used as the target and interfering signals, which is at the same time the worst case, because the bandwidth of such a measurement is nearly zero. The time delay gets estimated via the phase shift of the measurement signal since it describes the time delay for pure sinusoidal signals. To describe the problem, the measurement signal

$$s\_{\mathbf{d}}(t) = s\_{\mathbf{t}}(t - \tau) + s\_{\mathbf{i}}(t) \tag{1.1}$$

is composed of the target signal <sup>t</sup> () and additive interfering signals i (), which are defined by

$$s\_t(t) = A\_t \sin(\omega\_0 t) \, \tag{1.2}$$

$$s\_i(t) = A\_i \sin(\omega\_0 t + \varphi) \,. \tag{1.3}$$

5

Figure 1.4 Worst-case estimation error caused by the interfering signals with = /2 depending on their level (ScNR = 20 lg(t/<sup>i</sup> )dB).

Given the equations (1.1)–(1.3), the observable time delay of the measurement signal can be calculated using the harmonic addition of sine waves for arbitrary phase shifts [12, p. 84] resulting in the estimated time delay

$$\hat{\tau} = \frac{1}{\omega\_0} \arctan \left( \frac{A\_\text{t} \sin(\omega\_0 \tau) + A\_\text{i} \sin(\varphi)}{A\_\text{t} \cos(\omega\_0 \tau) + A\_\text{i} \cos(\varphi)} \right) \,. \tag{1.4}$$

The results of this equation can be seen in Figure 1.4, where the estimated time delays for different signal-to-correlated-noise ratios (ScNR) are shown. This ratio must not be confused with SNR, as ScNR describes the level of interfering signals and not the stochastic noise. It can be seen that the error, oscillating with <sup>0</sup> , is dependent on the real time delay and the ScNR. The absolute maximal deviation ̂ − is determined by the ScNR, while the phasing is determined by , which has been set to /2 in this example. Even for small levels of interfering signals, where the amplitude of the target signal <sup>t</sup> is about three times higher than the interfering signals amplitude <sup>i</sup> (ScNR = 10 dB), the estimation error can be up to 5 % of the period duration 1/<sup>0</sup> . In summary, the errors induced by the interfering signals are systematic, dependent on the phasing of the superposition and can only be mitigated by increasing the excitation frequency or damping the interfering signals mechanically.

Another source of error that occurs in the presence of interfering signals is the influence of process parameters, e.g., temperature or pressure. Since the estimation error is dependent on the superposition, any process parameter that leads to a deviation of the SoS changes the phasing and, therefore, the estimation quality.

One scenario, where interfering signals are especially problematic, is ultrasonic flow metering, e.g., using clamp-on sensors, as shown in Fig. 1.1(a). Here, the ultrasonic waves are transmitted into the fluid via the pipe wall, whereby the interfering signals account for a significant proportion of the measurement signals. The different approaches reported by the literature to address this problem are based on either mechanical damping or using flow statistics. However, their constraints limit their applicability, since mechanical damping is usually temperature- and frequency-dependent, and the assumed flow statistics are too restrictive. Therefore, extended algorithmic methods for filtering the interfering signals are proposed in this thesis, exploiting the property that the target signals exhibit different dynamic behavior compared to the interfering signals over multiple measurements. In contrast to the state-of-the-art (SotA) methods, the presented algorithms require little dynamic behavior of the target signals and also work over a wide temperature range.

# 1.3 Scope of the thesis

This thesis presents model-based algorithmic approaches for interferenceinvariant TDE, which are specifically suited for the estimation of small time-delay differences with a necessary resolution well below the sampling time. Therefore, the methods can be applied particularly well for transit-time UFM, since the problem of interfering signals is especially prominent in this application. The main focus lies on how to use multiple measurements with varying time delays or process parameters for the separation of interfering signals in UFM, while maintaining good robustness against AWGN and a high resolution. To this end, a signal model is assumed that contains stationary interfering signals, which are not dependent on changing time delays, and a target signal, which contains the measuring effect.

Firstly, the signal model of an UFM and its dynamic behavior under temperature or time delay variations is examined in Chapter 3. The aim is to generate valid simulated data sets, which are used to test the developed methods both under the premise that the data fits to the signal model perfectly and under the premise that model errors are present. In the course of this, the properties of the signal model components, such as bandwidth, stationarity and temperature dependency, are identified. For this purpose, a new method to model the temperature dependence of the interfering signals is presented. After the characterization of the total measurement system, the signal model—adapted to UFM—is used as the basis for two new methods whose objective is to reduce the impact of the interfering signals.

The first proposed technique, described in Chapter 4, extends the dynamics-based approaches of the literature by weakening the constraints on the needed variance of the time delays. To this end, a new representation of multiple measurement signals as point clouds is introduced. The point clouds are then processed using the principal component analysis (PCA) and B-splines, leading to either interference-invariant TDE or estimated interfering signals. In this context, a novel joint B-spline and misalignment approximation is developed to enhance the robustness.

The second approach consists of a regression-based estimation of the time-delay differences by learning reasonable signal subspaces. These subspaces are efficiently calculated by the analytic wavelet packet transform (AWPT) before the resulting coefficients are transformed into features that correlate well with time-delay differences. Furthermore, a novel subspace training approach that works unsupervised is proposed and compared to the conventional filter- and wrapper-based feature selection methods.

Finally, both methods are tested in an experimental UFM system with a high level of interfering signals present, where it is shown that they are in most cases superior to the literature methods. The quality of the methods is evaluated using the accuracy of the TDE, since the ground truth for the interfering signals cannot be determined reliably. Different data sets are used to analyze the dependencies on the hyperparameters, the process conditions and, in the case of the regression-based method, the training set.

# 2 State of the Art

In this chapter, an overview of the relevant literature concerning TDE, transit-time UFM, and suppression of interfering signals is given. The first section presents and categorizes SotA methods for TDE that will be used in Chapter 6 as a reference for the newly developed methods. While many of the methods are only described in a brief overview, the methods that are to be used as a reference are explained in more detail with the necessary equations. Since the focus of this thesis is the robustness against interfering signals using the transit-time UFM as an application example, sections 2.2 and 2.3 outline the SotA regarding transit-time UFM and interfering signals suppression, respectively.

## 2.1 Time delay estimation

In TDE problems, the objective is to find the time delay between a reference signal and its delayed and attenuated echo in the presence of noise. Let the general problem be modeled by

$$\begin{aligned} s\_{\mathbf{r}}(t) &= s\_{\mathbf{t}}(t) + n\_{\mathbf{r}}(t), \\ s\_{\mathbf{d}}(t) &= A\_{\mathbf{t}} \, s\_{\mathbf{t}}(t - \tau) + n\_{\mathbf{d}}(t), \end{aligned} \tag{2.1}$$

with the reference signal <sup>r</sup> (), the delayed signal <sup>d</sup> (), the target signal t (), the attenuation <sup>t</sup> , and the time delay [61]. In most cases, the AWGN signals <sup>r</sup> () and <sup>d</sup> () are assumed to be uncorrelated with each other and with <sup>t</sup> ().

The different estimation methods that are reported in the literature can be categorized into three classes according to Fang et al. [30]: time-based [17], correlation-like [5] and waveform-fitting [3, 47, 87]. While timebased methods are easier to implement and require little computational effort, correlation-like methods yield better accuracy and robustness against noise for low SNR [117]. If prior knowledge such as propagation

Figure 2.1 Zero crossing estimation via linear approximation of the Hilbert angle <sup>h</sup> . The zero crossing to be selected must be defined by another criterion.

characteristics or transducer impulse response is available, the waveform can be physically or empirically modeled to estimate the time delay via an optimization approach. However, the necessary parameters are often only known approximately which reduces the accuracy. For this reason and the significant computational effort of physically modeling the acoustic waves, waveform-fitting approaches are not widespread.

#### 2.1.1 Time-based estimation methods

Let us first consider time-based methods, which are based on the determination of prominent points in the measurement signal, e.g., intersections with thresholds [35], zero crossings [139] or envelope edges [4].

As early as 1974, Frederiksen and Howard [35] implemented a threshold-based determination of the time of arrival of wave packets on a monolithic chip. Via a specially developed analog circuit, an impulse noise rejection has even been integrated, which discriminates against received signals that are too short. Since a fixed threshold is not very robust to fluctuating signal intensities, various extensions of the threshold-based TDE to multiple thresholds [71] or variable thresholds [140] have been introduced. However, thresholds cannot only be applied directly in the time domain. For example, Duarte et al. [28] have presented the use of

thresholds in spectrograms at a given frequency as a way to perform TDE.

Alternatively to the threshold-based TDE, the zero crossing can be used to determine the time delay—an easy algorithm for the zero-crossing detection is given by Zhou et al. [139]. To get a more robust estimation of the zero crossing, Kupnik et al. [63] use the Hilbert transform in combination with a linear approximation of the phase signal. The approach is depicted in Figure 2.1. As it can easily be seen, narrow-band signals lead to a linear behavior of the phase signal <sup>h</sup> (), which results from calculating the argument of the analytic signal as(). The robust zero crossing is then extracted as the intersection of the linear phase fit ̃ h () with the time axis. The same approach is also presented by Roosnek [99], where the linear approximation ̃ h () is calculated by the weighted least squares using the signal envelope |d,as()| as weights. This specific method—in the following simply called the zero-crossing method—is one of the SotA methods that are implemented as a reference to measure the accuracy of the methods proposed in this dissertation. It has to be noted that the zero-crossing method is ambiguous since in a sinusoidal signal there are usually several zero crossings. A solution to always determine the time delay of the same zero crossing is proposed by Fang et al. [30], where finding the same zero crossing is guaranteed by first estimating the signal onset using a classification.

In addition to the methods already listed, many more elaborate methods exist in the literature. They are based on, for example, features such as the intersection of an envelope tangent with the time axis [4], or they determine the time delay by tracking multiple sample points individually [136].

#### 2.1.2 Correlation-like time delay estimation

Correlation-based methods for TDE have been published since the early 1970s, as TDE is one of the main applications for cross-correlation. The core of each method lies in the cross-correlation function

$$\begin{split} R\_{s\_{\mathrm{d}},s\_{\mathrm{r}}}^{\mathrm{CC}}(\tilde{\tau}) &= \mathrm{E}\{s\_{\mathrm{r}}(t) \, s\_{\mathrm{d}}(t+\tilde{\tau})\} \\ &\approx \int\_{-\infty}^{\infty} s\_{\mathrm{r}}(t) \, s\_{\mathrm{d}}(t+\tilde{\tau}) \mathrm{d}t \end{split} \tag{2.2}$$

which shows a peak at the real time delay under ideal conditions [61, 89, 93]. A simple application of this function is the cross-correlation flowmeter, where two axially separated transmission paths are used to measure the movement of bubbles, turbulence, or other clumps of particles [5]. However, the ideal conditions include the noise signals being independent zero-mean stationary Gaussian and the available signal sequences being ergodic and long enough to approximate the expectation value by time integration. In real applications, the observation time, the sampling rate and the bandwidth of the target signals <sup>t</sup> () are usually limited, leading to a noise polluted spread out cross-correlation. Moreover, multipath propagations are neglected in the standard approach which can further deteriorate the estimation quality. To cope with these problems, several extensions of the classical method have been developed and published over the years. These extensions range from pre-filtering the signals to advanced approximation techniques based on the calculation of weighted cross-correlations in the Fourier domain.

An important class of cross-correlation techniques that needs to be considered is the generalized cross-correlation. This class is characterized by the fact that the cross-correlation calculation is carried out in the Fourier domain with a frequency weighted spectrum. Knapp and Carter [61] compared different frequency windows <sup>r</sup> ,<sup>d</sup> () for the generalized cross-correlation method, in which the time delay is estimated by

$$\begin{split} \hat{\tau} &= \arg\max\_{\tilde{\tau}} R^{\rm GCC}\_{s\_{\rm d}, s\_{\rm r}}(\tilde{\tau}) \\ &= \arg\max\_{\tau} \int\_{-\infty}^{\infty} H\_{s\_{\rm r}, s\_{\rm d}}(f) \, S\_{\rm d}(f) \, S\_{\rm r}^\*(f) \, \mathrm{e}^{\rm j2\pi f\tau} \, \mathrm{d}f \, \end{split} \tag{2.3}$$

where <sup>d</sup> () and <sup>r</sup> () denote the Fourier transforms of <sup>d</sup> () and <sup>r</sup> (), respectively. The aim is to improve the estimation quality of the crosscorrelation by emphasizing frequencies with high SNR. This approach is equivalent to pre-filtering the signals with <sup>ℎ</sup><sup>d</sup> () and <sup>ℎ</sup><sup>r</sup> (), respectively, before the cross-correlation is calculated. Since the choice of <sup>r</sup> ,<sup>d</sup> () influences both the robustness and accuracy of the estimation, a selection of different weighting schemes, also known as processors, is listed below:

Roth's processor

$$H\_{s\_{\mathbf{r}},s\_{\mathbf{d}}}(f) = \frac{1}{S\_{\mathbf{r}}(f) \cdot S\_{\mathbf{r}}^\*(f)}\tag{2.4}$$

suppresses frequencies with low SNR but also spreads out the cross-correlation ̃ d,<sup>r</sup> () [101].

Using the phase transform processor

$$H\_{s\_\mathrm{r},s\_\mathrm{d}}(f) = \frac{1}{|S\_\mathrm{r}(f) \cdot S\_\mathrm{d}^\*(f)|} \,'\,\tag{2.5}$$

the cross power spectral density is normalized, which can sharpen the peak of the generalized cross-correlation [16].

The Eckart filter

$$H\_{s\_\mathbf{r}, s\_\mathbf{d}}(f) = \frac{S\_\mathbf{t}(f) \, S\_\mathbf{t}^\*(f)}{N\_\mathbf{r}(f) \, N\_\mathbf{r}^\*(f) \cdot N\_\mathbf{d}(f) N\_\mathbf{d}^\*(f)}\tag{2.6}$$

has been developed by Eckart [29] as a filter that maximizes the deflection. In this context, deflection represents the ratio between the filter's response to the target signal compared to the response to noise. It also has the property to suppress frequency bands with low SNR, but the spectral densities of the target signal and the noise have to be known beforehand.

The last spectral filter to be mentioned is the Hannan-Thomson processor [42], for which the approximation

$$H\_{s\_{\mathbf{r}},s\_{\mathbf{d}}}(f) = \frac{|S\_{\mathbf{r}}(f)| \, |S\_{\mathbf{d}}^{\*}(f)|}{|N\_{\mathbf{r}}(f)|^{2} \, |N\_{\mathbf{r}}^{\*}(f)|^{2} \, |N\_{\mathbf{d}}(f)|^{2} \, |N\_{\mathbf{d}}^{\*}(f)|^{2}} \tag{2.7}$$

was proposed by Brandstein et al. [10]. Note that this filter leads to the maximum likelihood estimate of the time delay, with an error variance approaching the Cramér-Rao lower bound (CRLB) [61].

All above mentioned generalized cross-correlation methods need the power spectral densities (PSD) of either the target signal, the noise signals, the measurement signals, or all together. In practice, these densities are approximated via the periodogram, Bartlett's/Welch's method, or other spectral density estimators. Alternatively, a data-driven FIR filter design approach, which does not rely on spectral densities, is presented in a newer work [6]. Nevertheless, as already stated above, all methods are still

limited by the bandwidth and the observation time, leading to a smeared cross-correlation function with additive residual noise. If the peak of the cross-correlation function is too broad or polluted by noise, Cabot [13] has shown that the Hilbert transform can then be used to convert the maximum of d,<sup>r</sup> () into a zero crossing. This technique could also be combined with the pre-filters in the generalized cross-correlation.

In contrast to using the peak of the cross-correlation, other properties such as the phase of the cross power spectrum can also be used to estimate the time delay, especially in cases where the noise signals d () and <sup>r</sup> () are correlated with each other [93]. Since correlated noise sources are a typical scenario and originate from external sources in the environment, this problem was also addressed by Nikias and Pan [89] by employing higher-order spectrum domains such as the bispectrum. Their approach is based on the property that the third moment of a zero-mean Gaussian process vanishes, which removes the influence of the correlation between the noise signals. However, the term correlated noise must not be confused with the interfering signals, the reduction of which is the objective of this thesis, as these are not only correlated with each other but also with the target signal.

While all methods based on the cross-correlation perform TDE by maximizing an adapted form of (2.2), an alternative correlation-like technique is based on the minimization of the average square difference function (ASDF)

$$R\_{s\_\mathbf{r},s\_\mathbf{d}}^{\rm ASDF}(\tilde{\tau}) = \int\_{-\infty}^{\infty} \left( s\_\mathbf{r}(t) - s\_\mathbf{d}(t + \tilde{\tau}) \right)^2 \mathrm{d}t \tag{2.8}$$

or the average magnitude difference function (AMDF)

$$R\_{s\_\mathbf{r},s\_\mathbf{d}}^{\rm ASMF}(\tilde{\tau}) = \int\_{-\infty}^{\infty} |s\_\mathbf{r}(t) - s\_\mathbf{d}(t + \tilde{\tau})| \, \mathrm{d}t \, \tag{2.9}$$

which were examined in the works of Jacovitti and Scarano [50] and Fertner and Sjolund [31]. These techniques are also known in the literature as the sum-of-squared-differences and sum-of-absolute-differences method, respectively [36, 121]. Compared to the conventional cross-correlation, ASDF and AMDF require less computational effort since there is no need for multiplications and normalization. Furthermore, the variance of the

estimation error is closer to the CRLB than the cross-correlation in a white signal scenario [50].

A special challenge that has been neglected so far in the cross-correlation methods is multipath propagation. If the difference of the time delays between the system's various propagation paths is large enough, the cross-correlation shows distinct peaks that correspond to the time delays of the different paths. The limiting factor for the necessary difference between the time delays is the available signal bandwidth which is inversely proportional to the width of the correlation peaks [55]. For high SNR, the resolution can be improved by whitening the spectrum as shown by Khyam et al. [55]. The other approaches consist of using pulsed-wave ultrasound as excitation in combination with a windowed cross-correlation, where the signals are windowed in the time domain before calculating the cross-correlation [9, 129]. While fixed rectangular windows are often used [33, 45], Xu et al. [133] presented an algorithm that calculates the cross-correlation in the wavelet domain with carefully selected scaling levels. A closer look into this algorithm leads to the conclusion that this is a special case of a time-frequency windowed cross-correlation showing similarities to a time-windowed generalized cross-correlation.

The last step of each correlation-like approach is the subsample approximation of the time delay since the signals are practically processed in the discrete time domain. There exist different ideas on how to increase the resolution beyond the sampling time, which can be separated into interpolation techniques in the time domain or the phase domain. While the former category is more widespread, it needs a well-chosen interpolation model. In contrast, phase-based approaches are built on the assumption of narrow-band signals, which means that the measurement signals approximately only contain a single frequency. This assumption allows the subsample TDE by using linear interpolation or approximation in the phase domain to get the phase shifts or the slopes of the phase.

Let us first consider the interpolation in the time domain. Viola and Walker [122] have given an overview of the possible strategies, which include interpolation of the signals or interpolation of the cross-correlation function. If the signals are interpolated before the cross-correlation, they must still be resampled at a higher sampling rate to calculate the crosscorrelation. This results in a large computational effort and the estimated time delay will still be quantized, though at a finer resolution. Alternatively, the parameters of the interpolation model can be used to directly estimate the time delay, e.g., the root of the analytic derivative of a spline interpolation directly leads to an expression for the time delay [122]. In the second strategy, the cross-correlation is directly interpolated in the immediate neighborhood of the discrete global maximum. The interpolation models depend on the used pattern-matching function such as the cross-correlation, the ASDF, or the zero crossing curve presented by Shaswary et al. [112, 113]. In most cases, the interpolations are based on parabolic [50], cosine [26], or spline fitting [122], where cosine fitting usually performs better than parabolic fitting [26]. Nandi [88] has shown that the parabolic fit can also be applied to multiple points of the crosscorrelation function which offers significant improvements at low SNR.

For the second group of subsample approximation approaches, the phase information is necessary, which can either be extracted by sine fitting [95], transformation into the Fourier domain [117] or other modelbased fittings. For example, Grennberg and Sandell [41] used the crosscorrelation between the delay signal and the Hilbert transformed reference signal to get a direct estimator for the phase. Due to the phase ambiguity, only small time delays are allowed if the result is required to be unique. Therefore, the phase-based methods are typically combined with rough estimators to obtain the integer number of period durations contained in the time delay [95, 117].

To date, the correlation-based methods and its many variants are considered by many researchers as the "gold standard" for TDE [112, 122], which is why recent publications still deal with its further improvement, such as the frequency-sliding approach presented by Cobos et al. [18].

#### 2.1.3 Model-based approaches

The last group of approaches for the estimation of time delays in acoustic measurement systems is based on physical or empirical models for the waveforms. Since the model-based approaches are the least common and depend heavily on whether a valid signal model can be established, they will only be briefly discussed here based on a few selected publications.

Yan and Gu [134] proposed a physical model based on solving the differential equation of mechanical vibrations. The rising edge envelope of the signal model that solves the differential equation is then fitted to the measurement signal to estimate the starting time of the recorded ultrasonic pulse. A simpler fit of the rising edge is published by McMullen et al. [87], where the rising edge was approximated as a parabola, which could be fit using two threshold measurements.

Apart from that, a popular signal model used for the TDE of ultrasonic signals is the Gaussian-modulated cosine signal, where the amplification, time delay, and width of the Gaussian window as well as the frequency and phase of the cosine are the free parameters of the model. Both Gholami et al. [38] and Lu et al. [75] presented methods to fit multiple overlapping Gaussian modulated cosine signals. While Gholami et al. [38] based the fitting on the expectation-maximization algorithm, Lu et al. [75] developed a modified Gauss-Newton method.

In terms of fitting methods, successful applications of the unscented Kalman filter [3], the genetic-ant colony optimization [47], or a weighted Fourier transform and relaxation-based method [52] have been reported.

# 2.2 Transit-time ultrasonic flow meter

Ultrasonic flow meters, except for some less common types, can be divided into two main classes: the transit-time UFM and the Doppler UFM. Since the latter requires inhomogeneities to reflect the ultrasonic waves and is based on estimating the Doppler frequency of the reflected signals, it is not further considered in this thesis. In contrast to this, transit-time flow meters work for homogeneous single-phase fluids. They are based on the effect that an ultrasonic wave sent in upstream direction propagates slower than in downstream direction. Other UFM variants, such as the cross-correlation flow meters, are not considered because their special signal processing requirements are only slightly related to the topic of this thesis—the accurate TDE for very small time delays in the presence of interfering signals.

The first flow meters based on the transit-time of ultrasonic pulses are in use since the early 1950s [118]. Due to the non-existence of moving parts, they are characterized by both low energy consumption and low maintenance requirements. Hence, their field of application ranges from the food to energy industry, where large throughputs and, therefore, large cross-sections are required. While the first UFMs used two paths and were built completely analog [34], the systems have been further improved to use multiple frequencies [90], phase-shift approaches [137] and single path circuits [73]. In order to apply UFM for the measurement of highly corrosive fluids or in a sanitary environment, so-called clamp-on transducers, which allow excitation of the ultrasonic waves through the pipe wall without contact to the fluid, were presented by Tobin [120] in 1972. Soon after their introduction, Lynnworth reported their accuracy limitation due to interfering signals [78]. After years of constant improvement, a new chapter was opened in the 1990s with the introduction of digital signal processing techniques to the field of TDE in ultrasonic applications [32, 92]. Different methods, such as improving the robustness of the zero crossing employing the Hilbert transform [99] or reconstructing flow profiles using a parametric model [84], were published. With the improved accuracy, the possible applications have also further expanded. For example, Schwarz et al. [105] have shown that the flow in partially filled drainage pipes can be measured with resolutions of less than 1 cm s−1. To this end, the UFM system was built to provide measurement signals with little noise and interference, and the timedelay differences were measured using a high-precision TDE chip that provides time-delay resolutions of 55 ps.

Three different aspects of UFM in which research is still being carried out can be named: signal processing, flow profile considerations and transducer technologies.


The latest research on the signal processing methods featured the TDE via improved cross-correlation methods [81], local peak fittings [80] similar to the envelope method, and cluster analysis of local peaks by processing multiple measurements [126].

## 2.3 Interfering-signals suppression

As already mentioned in the introduction, the interfering signals limit the accuracy of ultrasonic TDEs. If this limited accuracy is not tolerable due to the requirements, an interfering-signals suppression must be used. In the literature, a distinction is made between algorithmic and mechanical suppression where the detailed procedure depends on the application. Since the application example in this dissertation is the UFM, the SotA suppression approaches for UFM are listed, even though not many solutions to this problem have been published.

In the group of mechanical suppression falls the use of pipe materials that are poor at transmitting structure-borne sound, e.g., the plastic pipes used in the earliest UFM clamp-on measurements [54]. Apart from that, the use of damping mats or other attenuation elements is reported by Lynnworth and Liu [79].

In contrast, the algorithmic approaches proposed in the literature are all based on the fundamental principle that the ultrasonic waves, which are transmitted solely through the pipe wall, are stationary, whereas the target signals exhibit dynamic behavior because they are dependent on the flow [79]. There are several methods to exploit this principle. Roosnek's approach [99] works by collecting different measurement signals whose time delays uniformly cover a period duration to use destructive interference for estimation of the stationary interfering signals, i.e., the target signals propagating in the fluid cancel each other out when averaged, leaving only the interfering signals. However, this approach is highly dependent on the availability of the measurement signals, since the time delays are required to cover at least one-period duration. Another approach has been published by Jacobson et al. [49], where the suppression of the non-varying interfering signals is realized by using a high-pass with respect to a set of consecutive measurements. Mansfeld et al. [85] proposed a similar method, where the signals of consecutive measurements are subtracted to remove the stationary signal components—this is equivalent to a simple numerical differentiation according to the measurement index, and thus this method also has a highpass characteristic. Nevertheless, the methods proposed by Jacobson et al. [49] and Mansfeld et al. [85] require highly fluctuating target signals to get sufficient target signal levels, which in turn are necessary to be robust against measurement noise.

In summary, the interfering signals suppression is based on either attaching damping materials to the pipes, or using the assumption of stationary interfering signals. In the case of the latter, the time delays of the target signals must necessarily either follow a uniform distribution or change permanently at a fast rate.

# 3 Measurement System Identification

Since the transit-time UFM is used as an application example in this thesis, the fundamentals of the measurement principle, the most important process influences and the derived signal models are described in this chapter. The focus of this chapter is to present a signal model of UFM which describes the classification of the ultrasonic signals into target and interfering signals, their sources and properties, and how the influencing variables temperature and flow velocity are included.

Before the measurement system is investigated, an introduction to the fundamentals of ultrasonic systems is given in Section 3.1. Based on this, the signal model is derived in the following two sections (an overview of the sections and their contents is given in Fig. 3.1). In Section 3.2 the properties and sources of the target and interfering signals are presented and the structure of the signal model is given. Subsequently, the influence of the temperature and the VoF on the target and interfering signals is investigated in Section 3.3. This finally results in the complete signal model with all components, their dependency on the VoF and the temperature as well as a collection of the properties of the signal components, e.g., bandwidth or distribution over the time range. All insights and

Figure 3.1 Topics of Chapter 3.

assumptions are supported by experimental tests, whose designs and implementations are described in Section 3.2.1.

Finally, the knowledge gained on the properties, influences, and structure of the signal model is used in Section 3.4 to generate simulated data sets. These can be used to validate the intermediate steps of the developed filtering methods, which are presented in chapters 4 and 5.

#### 3.1 Ultrasonic fundamentals

#### 3.1.1 Ultrasonic flow metering

The principle of UFM is, as mentioned in the introduction, based on the superposition of the SoS <sup>M</sup> with the VoF . To this end, two axially distanced transducers UT 1 and UT 2 are used to transmit one ultrasonic pulse each in the upstream and downstream direction. For this, the transducers can be operated both as senders and receivers. Figure 3.2 depicts a sketch of this principle. The resulting propagation velocity is dependent on the path angle given by the angle between the flow vector and the propagation vector. Thereby, the waves in upstream and downstream direction propagate with the velocities

<sup>1</sup> = <sup>M</sup> − ⋅ cos(), (3.1a)

<sup>2</sup> = <sup>M</sup> + ⋅ cos(), (3.1b)

respectively. Given the path length , this results in the absolute time delays

$$\tau\_{\mathbf{a},1} = \frac{L}{c\_{\mathbf{M}} - v \cdot \cos(\alpha)} \,\mathrm{\,\,\,\mathrm{\,} \,\mathrm{\,\,} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{\,}} \,\mathrm{\,\,{}} \,\mathrm{\,\,{}} \,\mathrm{\,\,{}} \,\mathrm{\,\,{}} \,\mathrm{\,\,{}} \,\mathrm{\,\,{}} \,\mathrm{\,\,} \,\mathrm{\,\,{}} \,\mathrm{\,\,$$

$$\tau\_{\rm a,2} = \frac{L}{c\_{\rm M} + v \cdot \cos(\alpha)}.\tag{3.2b}$$

By resolving equations (3.2) to the VoF , we can get the relation

$$v = \frac{\tau\_{\mathbf{a},1} - \tau\_{\mathbf{a},2}}{\tau\_{\mathbf{a},1}\tau\_{\mathbf{a},2}} \cdot \frac{L}{2\cos(\alpha)}\,, \tag{3.3}$$

Figure 3.2 Setup for UFM with necessary system parameters (following [145]).

which simplifies to

$$v = \frac{\tau}{L} \cdot \frac{c\_{\text{M}}^{\text{-}2}}{2 \cos(\alpha)}, \quad \tau = \tau\_{\text{a},1} - \tau\_{\text{a},2}.\tag{3.4}$$

if the condition <sup>M</sup> ≫ holds. In liquid fluids such as water with a SoS around 1480 m s−1, this is mostly fulfilled. Otherwise the exact equation is to be used, which requires the measurement of the absolute time delays and not only the time-delay difference between upstream and downstream direction.

Either way, only the average velocity of the flow along a single ultrasonic transmission path can be measured this way. To determine the mass flow rate, the diameter, the density, and the flow profile have to be considered. While the first two are defined by the medium and the pipe, the latter depends on the Reynolds number and is usually included in the measurement equation via multiplication with a hydraulic correction factor. The following flow profile considerations are summarized from Iooss et al. [48]. For fully turbulent flows with Reynolds numbers larger than 4000 the flow profile can be described by Prandtl's law [48]

$$v(R) = v\_{\text{max}} \left( 1 - \frac{R}{D/2} \right)^p \tag{3.5}$$

as a function of the maximum velocity in the middle of the pipe max and the distance to the center of the pipe . Note that this flow profile is only valid for a long enough inlet length. The parameter is dependent on the Reynolds number and can be determined experimentally or by an approximated equation. The integration along the diametral acoustic path of the ultrasonic wave, whereby diametral means that it passes through the center of the pipe, results in an average velocity

$$v = \frac{v\_{\text{max}}}{1+p}.\tag{3.6}$$

Inserting (3.6) in the integration result over the pipe cross-section

$$\overline{v} = \frac{v\_{\text{max}}}{(1+p)(1+\frac{p}{2})} \tag{3.7}$$

yields the hydraulic correction factor

$$k\_{\mathbf{h}} = \frac{v}{\overline{v}} = 1 + \frac{p}{2}.\tag{3.8}$$

This correction factor can be used for all diametral acoustic paths. In case of multipath UFM, where multiple ultrasonic transducer pairs are attached to the pipe, a weighted average of the different single path velocities

$$\overline{v} = \frac{1}{k\_{\text{h}}} \sum\_{i} w\_{i} v\_{i} \tag{3.9}$$

is typically used [84]. Thereby, the weights have to be adapted to the path configuration and the expected flow profile [131]. Finally, the pipe cross-sectional average velocity can be used to calculate the mass flow rate

$$Q = \rho \cdot \pi \left(\frac{D}{2}\right)^2 \cdot \overline{v} \,. \tag{3.10}$$

#### 3.1.2 Transducers for ultrasonic waves excitation

Until now, only the application of ultrasonic waves for flow metering was featured. However, the pre-condition of using ultrasonic waves is their generation and sensing. To this end, ultrasonic transducers are employed, which are mostly based on the piezoelectric effect for high frequencies. The exact construction of a transducer depends on the waves to be excited, but the core of all transducers is a piezoceramic with electrical connection,

Figure 3.3 Ultrasonic transducer structures for directional longitudinal radiation (left) or guided waves excitation, as used for clamp-on UFM (right). The left figure is based on [102].

backing, and acoustic matching layers. Since a complete listing of all transducer types is beyond the scope of this thesis, only one prominent type for both longitudinal wave and guided wave excitation is presented in this section.

In Figure 3.3 on the left, a typical structure of an ultrasonic transducer to generate pressure waves is depicted. The backing consists of a material with a high attenuation that has a similar acoustic impedance as the piezoceramic disc to reduce wave reflections. It usually is based on an epoxy resin mixed with metal powder [104]. The high attenuation of the backing provided by the epoxy resin is necessary to suppress the ringing of the transducer which would reduce the bandwidth and affect the accuracy, especially for TDE problems. The next few layers are the piezoceramic disc and the two electrodes used for electrical connection. Finally, several matching layers (for better visualization only two are illustrated in the figure) with adapted thicknesses are used to transmit the ultrasonic waves to the medium with as few reflections as possible [14]. However, the above-mentioned relationships of inner reflections, ideal layer thicknesses, and attenuations are frequency-dependent functions. Therefore, the total behavior of the transducer is highly frequency dependent. In summary, the size of the layers and their material characteristics determine the operational media, the frequency range, and the bandwidth of the transducer. Furthermore, the directivity—consisting of the

Figure 3.4 Multipath propagation examples. Depending on the objective to measure, the path can be classified into the desired propagation path and the interfering multipath propagation. In this scenario, the desired paths are drawn in blue, while the interfering paths are drawn in red.

main and side lobes—is influenced by the diameter and the frequency. In this context, a higher frequency or a larger diameter leads to a narrower directivity angle [102].

The second transducer depicted in Figure 3.3(right) shows a wedge transducer that is typically used for clamp-on UFM. Here, the waves once again are generated by the mounted piezoceramic and propagate as longitudinal waves. After traveling through the wedge material to the interface between the transducer and the plate, they get converted to shear or Lamb waves propagating in the pipe wall [72, 102]. By designing the wedge angles to match the phase velocity of the wedge material only to a certain wave mode in the pipe wall, the propagation of other wave modes can be reduced. Possibly reflected waves get absorbed by the damping structure at the top of the transducer. Subsequently, the shear or Lamb waves in the pipe wall radiate into the fluid medium under the angle according to Snell's law of refraction [23]. In contrast to the transducers for longitudinal waves excitation, the wedge transducers are mainly used for the generation of guided waves in solid media.

#### 3.1.3 Multipath propagation

In many cases, ultrasonic measurement systems are subject to multipath propagation, which can be represented by a convolution of the excitation signal with a sum of acoustic room impulse responses. This is because the excited ultrasonic transducers form side lobes in the directional char-

Figure 3.5 System theoretic description of multipath propagation.

acteristics or the waves can propagate as guided waves in solids, which can furthermore lead to a continuous emission of waves.

Two example scenarios are depicted in Figure 3.4. On the left, a side lobe leads to an acoustic path that gets reflected to the receiver by a solid object, while on the right, the guided waves propagate in the bottom plate, radiate into the liquid medium, and are reflected back. The multiple acoustic paths obviously have different time delays and waveforms. If the objective is liquid level metering, the guided waves propagating solely in the plate have to be considered as interfering signals. In Figure 3.5, a system theoretic representation of the depicted scenarios is shown. Each path is represented by its individual room impulse response and the corresponding impulse responses for the sending and receiving transducer.

A more general model of ultrasonic time delay measurements with multipath propagations can thus be formulated as follows:

$$s\_{\mathbf{d}}(t) = s\_{\mathbf{u}}(t) \* \sum\_{i=0}^{I} \left( g\_{\mathbf{A},i}(t) \* g\_{\mathbf{T},i}(t) \right) + n\_{\mathbf{d}}(t) \,. \tag{3.11}$$

In (3.11) the acoustic room impulse response of the system and the summarized transducer impulse responses for each path are denoted by A ,() and T ,(), respectively, and ∗ denotes the convolution. Note that each path has its own transducer impulse response due to the directionspecific excitation of ultrasonic waves. If only the time delay of the direct path in a homogeneous non-dispersive medium is relevant, the direct path impulse response A ,0() becomes ( − ) and the total impulse response can be separated into

$$g(t) = g\_{\Gamma}(t - \tau) + \sum\_{i=1}^{I} g\_{\mathcal{A},i}(t) \* g\_{\mathcal{T},i}(t) \, . \tag{3.12}$$

27

where the multipath propagations are modeled by ∑ A ,() ∗ T ,(). Then, the signal components resulting from the convolution of <sup>u</sup> () with any multipath impulse response can be considered as an interfering signal, whose filtering is the main focus of this thesis.

#### 3.1.4 Lamb waves propagation

In contrast to fluids, where the predominant mode of propagation is a longitudinal wave, solid materials also feature the propagation of shear waves [100], where the particle movement is orthogonal to the propagation direction. Furthermore, the wave propagation is subjected to boundary conditions if the solid material is not infinitely extended. A particular waveform, that propagates in thin solid plates with thicknesses around the magnitude of the wavelength is the Lamb wave, named after Horace Lamb, who first presented an analytic description in 1917 [64]. Lamb waves can be classified into symmetric and antisymmetric modes where in symmetric modes the particles to the left and right of the plate center move in phase toward (or away) from the center, and in antisymmetric modes, there is a phase shift of 180 degrees. Since the propagation of these waves is constrained to solid media, they are a subtype of guided waves. Depending on the frequency and thickness of the plate, higherorder modes can also occur.

The analytic solution presented by Lamb can be summarized in the characteristic equations that describe the relationship between the angular frequency and the wavenumber w. The so-called Rayleigh-Lamb equations [100]

$$\frac{\tan(qd/2)}{\tan(pd/2)} = -\frac{4k\_{\text{w}}^2pq}{(q^2 - k\_{\text{w}}^2)^2} \quad \text{for symmetric modes and} \tag{3.13}$$

$$\frac{\tan(pd/2)}{\tan(qd/2)} = -\frac{4k\_{\text{w}}^2pq}{(q^2 - k\_{\text{w}}^2)^2} \quad \text{for antisymmetric modes} \tag{3.14}$$

show that Lamb waves are dispersive, i.e., the phase and group velocities are frequency dependent. Here describes the thickness of the plate, while and have to be substituted by

$$p^2 = \left(\frac{\omega}{c\_{\rm L}}\right)^2 - k\_{\rm w}^{\,\,<} \quad \text{and} \quad q^2 = \left(\frac{\omega}{c\_{\rm T}}\right)^2 - k\_{\rm w}^{\,\,<}.\tag{3.15}$$

Figure 3.6 Lamb waves group and phase velocities of the first two symmetric and antisymmetric modes in a steel plate with a thickness of 4 mm.

In equations (3.15) the transverse wave speed <sup>T</sup> and the longitudinal wave speed <sup>L</sup> are material constants that are influenced by Young's modulus, the density, and Poisson's ratio. Thus, the exact dispersion characteristic has to be either identified empirically or through precise knowledge of the material properties. Note that (3.13) and (3.14) do not have an analytical solution and, therefore, have to be solved numerically for each frequency . As the tangent is periodic, the Rayleigh-Lamb equations can have multiple solutions for higher frequencies, which explains the existence of higher-order modes. Given the solution of the relationship between wavenumber and frequency, the phase velocity

$$v\_{\rm P}(\omega) = \frac{\omega}{k\_{\rm w}(\omega)}\tag{3.16}$$

and the group velocity

$$v\_{\rm G}(\omega) = \left(\frac{\partial k\_{\rm w}(\omega)}{\partial \omega}\right)^{-1} \tag{3.17}$$

can easily be calculated. The resulting phase and group velocities of the first two symmetric and antisymmetric modes in a 4 mm stainless steel plate with standard material properties are depicted in Figure 3.6. It can be seen that the A1 and S1 mode can only propagate for > 0.5 MHz

(a) Test setup for clamp-on UFM systems. (b) Simplified test setup for structural waves. Figure 3.7 Setup of the measurement system.

and > 0.7 MHz, respectively. Furthermore, the dispersive character of the Lamb waves reduces for very high frequencies.

# 3.2 Signal model

The following section introduces the measurement system together with its measurement signals and the signal model consisting of the target signals and the interfering signals. Measurement signals are recorded in different scenarios and examined for their properties such as bandwidth, multipath propagations and dispersion.

## 3.2.1 Measurement system

Firstly, the experimental setups used for the identification of the measurement system are presented. Figure 3.7(a) shows a pair of two transducers mounted opposite each other on a steel pipe with a wall thickness of 4 mm and a diameter of 80 mm. Depending on the objective of the investigation, water or air is used as the medium. The other setup (see Fig. 3.7(b)), used solely for the investigation of structural waves, consists of a transducer pair on a 4 mm steel plate that is loaded half-sidedly with water. Since the plate is from another manufacturer, the material is not exactly the same as the pipe wall in the first scenario, but both times stainless steel was used. Using the curvature of the plate due to its own weight, it can be achieved that only the lower surface of the plate is

loaded with water and not the edges or the top surface. This mimics the situation in a water-filled pipe without disturbing multipath propagations. To this end, the basin is built deep enough and the quadratic plate is chosen large enough (≈ 1 m<sup>2</sup> ) so that all reflections from the bottom or the edge of the plate are directly separable in the time domain. The transducers are distanced 45 cm apart. In both setups, wedge transducers as illustrated in Figure 3.3 were used.

The measurements were carried out using a PXIe-1062 station with a PXIe-5171 ADC module and a PXI-5412 DAC module from National Instruments. Both the ADC and DAC modules operate at a sampling rate of 50 MHz, so all calculations could be performed in the digital domain. Unless otherwise specified, the Gaussian modulated cosine signal

$$s\_{\rm u}(t) = 10\,\text{V} \cdot \exp\left(-\frac{(t - 5\,\text{μs})^2}{2 \cdot (1.5\,\text{μs})^2}\right) \cdot \cos\left(2\pi f\_0 t\right) \tag{3.18}$$

was used as the electrical excitation, where the center frequency was set to 700 kHz.

Three different data sets were acquired for the system identification from the setups presented in Fig. 3.7. For the first data set <sup>D</sup>Id,1, the setup from Fig. 3.7(a) was installed in a water circulation system with a pump to control the flow and a reference flow meter to get the VoF, which is used to calculate the ground truth for the time-delay difference. One measurement each was recorded for the VoFs 0.5 m s−1 , 0.6 m s−1 and 0.7 m s−1, where each measurement consists of one recording for the downstream waves <sup>r</sup> () and one recording for the upstream waves <sup>d</sup> (). In order to use a uniform terminology for the signals in UFM that is also consistent with the SotA methods for TDE, the recordings are called delayed and reference signal. As the upstream signal is obviously slower than the downstream signal, it is denoted as the delayed signal, whereas the downstream signal is denoted as the reference signal. Since the used VoFs are quite high and a flow straightener with sufficient inlet length was used, a fully developed turbulent flow profile resulted. For the second data set <sup>D</sup>Id,2, the empty pipe was placed in a temperature chamber and while the temperature was increased, 72200 measurement signals were continuously recorded, packaged in blocks of 100, and averaged, resulting in 722 signals. Because there was no water or vacuum in the pipe

Figure 3.8 Typical measurement signals of the pipe setup with water (left) and air (right) as medium. Only the downstream direction is drawn since both signals are visually the same at the present scaling.

during the experiment, the air is present as the medium. In contrast to the first two data sets, the last data set <sup>D</sup>Id,3 was not recorded from the pipe setup, but from the plane plate setup shown in Fig. 3.7(b). Several measurements were carried out, in which the excitation frequency in (3.18) was increased step by step from 300 kHz to 1 MHz with an increment of 10 kHz.

#### 3.2.2 Target signals

All signal components that show the measuring effect—a time-delay difference between up- and downstream direction—are considered as target signals in this thesis. However, for a better representation of the physical interrelationships, the target signals have to be further subdivided based on their respective propagation path and transducer impulse response. To explain the different physical effects that characterize the UFM system, an example signal from the data set <sup>D</sup>Id,1 is depicted on the left in Fig. 3.8. In contrast to the scenario where the pipe was filled with air (see right diagram in Fig. 3.8), two major signal components starting at about 60 µs and 100 µs stand out. Furthermore, other small but widely distributed

signal components can also be seen, which already start at 30 µs. A closer look at the difference signal

$$
\Delta s(t) = s\_\mathbf{r}(t) - s\_\mathbf{d}(t) \tag{3.19}
$$

between downstream signal <sup>r</sup> () and upstream signal <sup>d</sup> () at a high VoF yields that these small widespread signal components do not contain any useful information, since Δ() is nearly zero in their time ranges. It is noticeable that the two major wave packets, both of which have a significant time delay for high VoF, differ significantly in their envelope and especially in their time of arrival. This leads to the conclusion that these major wave packets result from a multipath propagation through the fluid. As each path has its own path length and its own average VoF , a valid signal model for the target signals can be described by the sum

$$\sum\_{i} s\_{t,i} \left( t - \frac{L\_i}{c\_{\mathcal{M}}} \pm \frac{\tau\_i}{2} \right) \,. \tag{3.20}$$

The sign of the time delay depends on whether the signals result from the measurement in downstream or upstream direction. It is important to note that the exact shape of each signal component t ,() is not constrained through this model, although the individual signal components t ,() may very well be strongly correlated with each other. The number of propagation paths is dependent on the geometric relations of the respective pipes and is, therefore, not yet determined here.

In addition to their propagation behavior, the target signals are characterized by their bandwidth. That is why the PSD of the recorded signals are calculated using Welch's method [130]. As the window in Welch's method a Hamming window with length 60 µs for the target signals' PSD and length 10 µs for the noise signals' PSD was used. To improve the noise PSD estimation, the noise signals can be isolated in the time domain, since the ultrasonic waves need a minimum travel time, and thus the measurement signal contains only noise in the beginning. The results are plotted in Fig. 3.9. From this diagram, several conclusions can be drawn:

Figure 3.9 PSDs of the different measurement signals calculated by Welch's method. The PSD of the isolated noise signals (|<sup>r</sup> ()<sup>∗</sup> r ()|) was calculated using only the time ranges that do not contain target or interfering signals. Since the interfering signals are suppressed in the difference signal, the PSD of the difference signal (|Δ()Δ∗()|) shows the isolated PSD of the target signals. Furthermore, the PSD |<sup>r</sup> ()<sup>∗</sup> d ()| removes the noise influence, since <sup>r</sup> () and <sup>d</sup> () are uncorrelated.


To summarize, the target signals can be modeled by (3.20), where each signal component is individual concerning its waveform t ,(), its absolute time delay a, = /<sup>M</sup> and its time-delay difference . Furthermore, every waveform is limited in its bandwidth.

#### 3.2.3 Interfering signals

Apart from the target signals, the measurement signal contains further signal components. These remaining signals do not contain the measuring effect and are considered interfering signals. However, since both the target and the interfering signals are always superpositioned in the measurement signals, they are difficult to separate using conventional approaches. Therefore, a trick is needed to examine the interfering signals in isolation. To this end, the fact is used that the attenuation of highfrequency ultrasonic waves is significantly greater in air than in liquid media [68, 123]. Furthermore, the acoustic coupling reduces through the impedance mismatch, further enhancing the suppression of the waves propagating through the medium. Using these relationships, the propagation paths through the medium in the pipe can be almost completely suppressed if the pipe is emptied. An example measurement signal resulting from the data set <sup>D</sup>Id,2, in which this trick was applied, can be seen on the right side of Fig. 3.8. That it has worked can be seen from the fact that the two major wave packets seen in <sup>D</sup>Id,1 are no longer present. Obviously, the present signal is caused solely by waves propagating through the pipe wall, as this is the only path left when the paths through the inside of the pipe are suppressed. Note that the amplitude and shape of these structural waves cannot be directly compared to the situation with water as the medium because the structural waves are not attenuated by the water anymore and the automatic gain control amplifies the received waves to take full advantage of the resolution of the ADC.

From the data set <sup>D</sup>Id,2 three properties of the interfering signals can be deduced:

If the examination of the PSD using Welch's method is repeated for the isolated interfering signals, the same frequency range as for the target signals can be observed. Therefore, the presence of

interfering signals leads to a spectral interference with the target signals.

The interfering signal can be considered symmetric, i.e., the signal component contained in the upstream signal is identical to the one contained in the downstream signal. A quantitative evaluation of this property can be obtained by the ratio of the signal energies in decibel

$$10\log\_{10}\left(\frac{\|s\_{\mathrm{d}}(t) - s\_{\mathrm{r}}(t)\|\_{2}^{2}}{\left\|\frac{1}{2}(s\_{\mathrm{d}}(t) + s\_{\mathrm{r}}(t))\right\|\_{2}^{2}}\right) \text{ .}\tag{3.21}$$

which yields <sup>9</sup>40.5 dB for the present data set.

There is no time range in which the target signals are the only active signals, because the interfering signals are extended over the entire time domain.

The reasons for these properties and a concluding signal model of the entire measurement signal, including the target and interfering signals, are given in the following four paragraphs.

#### Interference with the target signals

The reason why the interfering signals are in the same frequency range is easy to find. Both target and interfering signals are excited by the same transducer with the same electrical excitation. This leads to a strong correlation between both signal components. Without any nonlinear element to provide a frequency shift of the structural waves or the target signals, the basic frequency range cannot change. Therefore, the structural waves interfere with the target signals.

#### Symmetry in up- and downstream direction

The symmetry of the structural waves follows from the reciprocity of the whole transmission system, which can be modeled by a passive, linear time-invariant four-terminal network [77]. It is only limited by the nonlinear wave propagation in water, the nonlinear behavior of the

transducers, and the time-variance of the parameters, since the up- and downstream signals cannot be recorded at the same instant. Even though these effects are very small, the symmetry is limited to <sup>9</sup>40.5 dB in the recorded data set.

#### Time extension of the interfering signals

The unusually long time extension of the interfering signals is difficult to explain, especially since the excitation signal is time-limited to 9 µs. Even considering a very long ringing of the transducer, this can only explain a time extension in the range of tens of microseconds, but not up to 100 µs. To further enhance the understanding of the structural waves, the data set <sup>D</sup>Id,3 was recorded. The dimensions, materials, and setup have been chosen to mimic the situation in a water-filled pipe without any multipath propagations. Because there is no moving medium and the reciprocity property of the acoustic transmission system holds, the direction of the wave propagation does not matter. Thus, only one propagation direction is recorded and the resulting measurement signals are denoted as <sup>r</sup> (). According to Lamb wave theory (see Section 3.1.4), the structural waves propagate as Lamb wave modes in the pipe wall, as the wall thickness is similar to the wavelength. Therefore, in the time-frequency domain, high signal energy is expected at the time-frequency coordinates that fit the equation

$$t = \frac{L}{v\_G(f)}, \quad \text{with} \quad L = 45 \,\text{cm}. \tag{3.22}$$

This was also exploited by Kim et al. [57] for the validation of Lamb wave modes.

The assumption is tested by transferring the measurement signals via the short-time Fourier transform (STFT) into the time-frequency domain. Although there are other methods to visualize the dispersion characteristic, such as a 2D Fourier transform, a dense or at least a sparse spatial sampling of the wave field would be needed [46, 138]. In order to cover the entire frequency range without amplifying nonlinear effects, multiple narrowband excitation signals <sup>u</sup> (; <sup>0</sup> ) with stepwise increasing center frequency <sup>0</sup> are used instead of only one broadband excitation. In order to create an artifact-free detailed time-frequency representation from

Figure 3.10 Preliminary study for the qualitative representation of Lamb waves using a stacked spectrogram of the plane plate data set <sup>D</sup>Id,3. For reference, the group delay at the propagation distance 45 cm is drawn for the first two symmetric and antisymmetric modes. Left: wide window with high frequency resolution. Right: short window with high time resolution.

the multiple measurement signals, a multi-stage processing is necessary. It can be divided into four steps: spectrogram calculation, frequency response compensation, normalization, spectrogram fusion. Firstly, the absolute square of the STFT, also known as spectrogram,

$$S\_{s\_\mathbf{r}}^w(t, f; f\_0) = \left| \int\_{-\infty}^{\infty} w^\*(\tau - t) s\_\mathbf{r}(\tau; f\_0) \, \mathbf{e}^{-j2\pi f \tau} \, \mathrm{d}\tau \right|^2 \tag{3.23}$$

is calculated, where <sup>r</sup> (; <sup>0</sup> ) denotes the recorded measurement signal after using an excitation with center frequency <sup>0</sup> . For the STFT a Gaussian window () is used to get the best possible time-frequency resolution. In the next step, the non-uniform frequency response of the transducers is compensated by a weighting function () and subsequently normalized. Fusion of all pre-processed spectrograms is performed by finding the maximum energy from all spectrograms for each discrete time-frequency point. In summary, all obtained spectrograms are frequency weighted, normalized, and then fused via the maximum operation:

$$S\_{s\_r}^{w}(t,f) = \max\_{f\_0} \frac{W(f) \cdot S\_{s\_r}^{w}(t,f;f\_0)}{\max\_{t,f} W(f) \cdot S\_{s\_r}^{w}(t,f;f\_0)} \,. \tag{3.24}$$

The resulting time-frequency energy map for two different window lengths is depicted in Fig. 3.10. Furthermore, the dispersion relations of the first two symmetric and antisymmetric modes translated to (, ) by equation (3.22) are plotted. Since the exact material parameters are not known and can, except for the density, not be measured with reasonable effort, they are, within the typical limits given in standard tables, chosen by an optimization. The energy maxima show good agreement with the dispersion relation of all four modes. Now it is also clear that due to the highly dispersive S1 and A1 mode, the waves can extend very strongly in time, even if there is no multipath propagation. Furthermore, if this dispersion property is transferred to the situation in a pipe, where a multipath propagation is present, the time extension can be even more pronounced.

#### Final signal model

In conclusion, only a very general model of the interfering signals <sup>i</sup> () can be assumed, where the symmetric interfering signals are interfering with the target signals in the same time and frequency range, but are independent from the SoS of the water and the VoF. Thus, the final signal model is

$$s\_{\mathbf{r}}(t) = \sum\_{i} s\_{\mathbf{t},i} \left( t - \tau\_{\mathbf{a},i} + \tau\_i/2 \right) + s\_{\mathbf{i}}(t) + n\_{\mathbf{r}}(t) \, \tag{3.25a}$$

$$s\_{\mathbf{d}}(t) = \sum\_{i} s\_{\mathbf{t},i} \left( t - \tau\_{\mathbf{a},i} - \tau\_i/2 \right) + s\_{\mathbf{i}}(t) + n\_{\mathbf{d}}(t) \,. \tag{3.25b}$$

Here, the absolute time delay and the time-delay difference

$$\tau\_{\mathbf{a},i} = \frac{L\_i}{c\_{\mathbf{M}}} \,'\,\tag{3.26a}$$

$$\tau\_i = \frac{2v\_i \Delta x}{c\_{\text{M}}^2} \tag{3.26b}$$

are determined by the individual path lengths and the average VoFs along the propagation paths.

# 3.3 Measurement influence of process parameters

In the previous section, the signal model was presented under the assumption of stationary process conditions. However, ultrasonic waves are strongly dependent on the physical properties of the materials, such as SoS or attenuation. These properties are in turn dependent on the temperature, pressure, and other process influencing variables. In particular, the VoF and the temperature are the two most important influencing variables in UFM. Therefore, to get an estimation about the size of the expected effects, the influence of these two parameters on the measurement signals is investigated in this section.

#### 3.3.1 Velocity of flow

The VoF as the measurand should have a strong impact on the measurement signals. To this end, the resulting time-delay difference should be very sensitive to variations of the VoF, which, as (3.26b) implies, is reached by a large axial distance of the transducers or by a small SoS. The axial distance can be influenced, but as the angle at which the waves can couple into the fluid from the pipe wall is constrained, the possibility of influencing is limited. Furthermore, the SoS depends on the medium, which is generally not desired to be affected by the measurement system. To sum it up, the sensitivity can only be improved to a limited extent, which leads to the point that the impact of the time-delay difference in the measurement signals rather than the VoF should be investigated.

One way to evaluate the time-delay difference without the interfering signals is the signal difference since the interfering signals can be considered symmetric. Figure 3.11 shows on the left the signal envelope of the difference signal, calculated by the Hilbert transform, for three different time-delay differences, and on the right the value of three local maxima depending on the time-delay difference. In this evaluation, the data set <sup>D</sup>Id,1 was used, where the VoF was stepwise increased with multiple recordings per step. Generally, a linear relationship of the envelopes' local maxima is shown, but there is also a considerable variation for each VoF step. Another point that stands out is the large differences in the

Figure 3.11 Difference between delay and reference signals depending on the time-delay difference.

slopes. This can be explained if the target signals are approximated by a narrowband sine signal leading to the difference signal

$$
\Delta s(t) = s\_{\mathbf{t},i}(t + \tau\_i/2 - \tau\_{\mathbf{a},i}) - s\_{\mathbf{t},i}(t - \tau\_i/2 - \tau\_{\mathbf{a},i}) \tag{3.27}
$$

$$\approx A\_{\rm t} \cdot \cos(\omega\_0 (t + \tau\_i/2 - \tau\_{\rm a,i}))$$

$$-A\_{\rm t} \cdot \cos(\omega\_0 (t - \tau\_i/2 - \tau\_{\rm a,i})) \tag{3.28}$$

$$=2A\_\text{t}\cdot\cos\left(\omega\_0(t-\tau\_{\text{a},i})\right)\cdot\sin\left(\omega\_0\frac{\tau\_i}{2}\right)\,,\tag{3.29}$$

which is valid, if the observed time ranges are small enough (around one period). Equation (3.29) proves that the relationship can only be considered linear if ≪ 1/<sup>0</sup> holds. It is also shown that the slope depends on both the signal amplitude and the frequency of the ultrasonic waves. This also explains, why a higher center frequency leads to an improved accuracy of UFM systems. In summary, it can be assumed that the information of the time-delay difference is contained in multiple propagation paths with some paths being more sensitive than others.

#### 3.3.2 Temperature

The temperature affects the signals in many ways, e.g., the transducer impulse response, the acoustic coupling or the attenuation in the medium, each influence being of different significance. Since the proposed algorithms in the later chapters do not require complicated temperature models to perform reliably, the temperature influence on the signals is reduced to a simple temperature-dependent absolute time delay. Similar to the target signals, a pulse-specific description of the temperature influence for the interfering signals is chosen, leading to the simplified signal model

$$s\_{\mathbf{r}}(t;v,T) = \sum\_{i} s\_{t,i} \left( t - \tau\_{\mathbf{a},i}(T) + \tau\_{i}(v)/2 \right) \tag{3.30a}$$

$$+ \sum\_{l} A\_{\mathbf{i},l}(T) s\_{\mathbf{i},l}(t - \tau\_{\mathbf{a},l}(T)) + n\_{\mathbf{r}}(t) \, \tag{3.30b}$$

$$s\_{\mathbf{d}}(t;v,T) = \sum\_{i} s\_{\mathbf{t},i} \left( t - \tau\_{\mathbf{a},i}(T) - \tau\_{i}(v)/2 \right) \tag{3.30b}$$

$$+ \sum\_{l} A\_{\mathbf{i},l}(T) s\_{\mathbf{i},l}(t - \tau\_{\mathbf{a},l}(T)) + n\_{\mathbf{d}}(t) \, \, \mathbf{d}$$

which incorporates also the VoF, the temperature, and the multipath propagations of both the target signals and the interfering signals. In the following, the verification of the simplified description of the temperature dependence (3.30) is presented. Furthermore, a quantitative evaluation of the different time delay variations is given.

#### Temperature dependence of target signals

For the absolute time delay of the target signals, an analytic quadratic expression for the SoS in water is reformed using the path length to get an estimation for the absolute time delay variation per Kelvin. Using one diameter as minimal and three times the diameter as maximal path length, the absolute time delay sensitivity

$$\left| \frac{\mathrm{d}\tau\_{\mathrm{a}}}{\mathrm{d}T} \right| = \frac{L}{c\_{\mathrm{M}}^{2}(T)} \cdot \frac{\mathrm{d}c\_{\mathrm{M}}(T)}{\mathrm{d}T} \tag{3.31}$$

can be enclosed in the interval [55 ns K−1, 347 ns K−1], if ∈ [19 °C, 40 °C]. The estimation with the help of an interval was chosen because the geometrical dimensions of the setups vary slightly and only the order of magnitude of the sensitivity is relevant. The choice of the temperature range is motivated by the process conditions present in the later experiments.

#### Temperature dependence of interfering signals

In contrast to the target signals, the interfering signals show a much smaller temperature dependence, which is more difficult to model. The literature reports several methods ranging from simple time-shifting and time-scaling [21] to physically modeling the temperature dependence of the whole system consisting of the piezoelectric transducer with all acoustic couplings and Lamb waves [65]. Although time-scaling has proven to be very popular, since it is easy to parameterize and compute using the scale transform [43], it cannot handle multipath propagations. To overcome this problem, the pulse-specific description of the interfering signals in (3.30) is introduced and implemented using the Matching Pursuit (MP) algorithm followed by a second constrained MP algorithm to track the small change induced by the temperature variation. The method was adapted from Lu and Michaels [74], who used it to differentiate between signal changes due to temperature variation and material damages. Since the detailed results of the MP-based modeling of the temperature effects have already been published in an own previous work [142], the following paragraph only briefly outlines the approach.

For the investigation of the temperature influence, the measurement signals from the data set <sup>D</sup>Id,2 are used, as they contain isolated interfering signals subjected to multipath propagation. The first step consists of the free decomposition of the measurement signal using the MP algorithm. The aim is to find a sparse representation of the signal () such that the new signal representation compresses the signal energy in as few coefficients as possible. Coefficients below a specified threshold are omitted, which equals to them being set to zero. Each signal is approximated by a weighted sum of basis functions:

$$s(t) \approx \hat{s}(t) = \sum\_{k} c[k] \,\psi\_{k}(t) \,. \tag{3.32}$$

The sparse representation problem is formulated by the optimization

$$\text{minimize} \quad \left\| s(t) - \sum\_{k=1}^{K} c[k] \,\psi\_k(t) \right\|\_{2}^{2} \quad \text{w.r.t.} \quad \psi\_k(t) \in \mathcal{G} \,\,\,\,\tag{3.33}$$

where, given a constant , the best functions () have to be chosen from an overcomplete function set <sup>G</sup> <sup>=</sup> Frame{ (), = −∞, … , ∞}. The optimization problem (3.33) is also known as sparse approximation, which is NP-Hard. Optionally, the problem can be solved by a convex relaxation leading to the basis pursuit algorithm or LASSO (least absolute shrinkage and selection operator). However, in this case, it is solved by a greedy algorithm, iteratively choosing the locally optimal function from the frame. The solution may not be globally optimal, but the convergence is ensured and the computational effort is reasonable. In summary, the MP decomposition of a signal () can be formulated by the iterative rules, which are modified from Xu et al. [132]:

$$\begin{cases} r\_0(t) = s(t), \\ \psi\_i(t) = \operatorname\*{arg\,max}\_{\psi\_k(t) \in \mathcal{G}} | \langle r\_i(t), \psi\_k(t) \rangle | \, \, \, \end{cases} \tag{3.34}$$

$$\begin{cases} c[i] = \langle r\_i(t), \psi\_i(t) \rangle \, \, \, \\ r\_{i+1}(t) = r\_i(t) - 2 \cdot \operatorname{Re} \{ c[i] \cdot \psi\_i(t) \} \, \, \, \end{cases} \tag{3.34}$$

The decomposition is stopped after iterations.

The frame G is built up from energy normalized Gabor wavelets

$$\psi\_k(t) = \frac{1}{\sqrt{\sqrt{\pi}\sigma}} \exp\left(-\frac{(t-t\_k)^2}{2\sigma\_k^2} + \text{j2}\pi f\_k t\right) \tag{3.35}$$

with variable time delays , modulation frequencies and time durations . Note that using the complex Gabor wavelets in combination with taking the real part of the projection during the update of the residual signal () leads to the optimal phase, which can be calculated by

$$\varphi = \arctan\left(\frac{\text{Im}\{\langle r\_i(t), \psi\_i(t)\rangle\}}{\text{Re}\{\langle r\_i(t), \psi\_i(t)\rangle\}}\right). \tag{3.36}$$

Figure 3.12 Example residuals of the MP decomposition of interfering signals after different iterations. For better visualization an offset was added to the different residuals.

The last problem to be solved is how to find the locally best function from the frame in each iteration step. This can take up a lot of computational effort if a fine discretization of the parameters = [ , , ] is required, since the discretization determines the cardinality of the frame, and thus the number of inner products to be computed. A workaround, which nevertheless yields a high resolution of the parameter space, is to use an optimization method to solve a constrained optimization problem instead of a brute force search through all frame functions. This also allows for constraints to be set up on the parameter space by adding regularization terms. The final optimization problem is defined as

$$\boldsymbol{\theta}\_{k} = \underset{\boldsymbol{\theta}\_{k}}{\text{arg}\max} \boldsymbol{J}(\boldsymbol{\theta}\_{k}) \; \prime \quad \boldsymbol{\theta}\_{k} = [t\_{k}, f\_{k}, \sigma\_{k}] \; \prime \tag{3.37}$$

with the quality function

$$\begin{split} J(\boldsymbol{\theta}\_{k}) &= |\langle r\_{i}(t), \psi(t, \boldsymbol{\theta}\_{k})\rangle| \\ &+ k\_{\text{f}} \cdot (f\_{k} - f\_{\text{set}})^{2} + k\_{\sigma} \cdot (\sigma\_{k} - \sigma\_{\text{set}})^{2} \\ &+ k\_{\text{t}} \cdot (t\_{k} - t\_{\text{set}})^{2} \\ &+ k\_{\text{t}, \text{max}} \cdot (t\_{k} - t\_{\text{max}})^{2} \cdot h(t\_{k} - t\_{\text{max}}) \end{split} \tag{3.38}$$

where ℎ() denotes the Heaviside step function, which is used to limit the time range in which the decomposition should take place. The strength of the regularization can be tuned by the multipliers <sup>f</sup> , , <sup>t</sup> , t,max

allowing different combinations of constraints. It has to be noted that all restrictions can lead to convergence problems. Therefore, the regularization was used cautiously. An example decomposition of the first signal from <sup>D</sup>Id,2 can be observed in Fig. 3.12. Here, the process of the MP decomposition is shown by plotting the residuals after 0, 7 and 60 iteration steps. The limitation of the parameter space to the time range max can be seen very well. In addition, the last residual energy shows, that very high compression is possible.

After the free MP decomposition of one measurement signal, a restricted MP decomposition of the remaining signals follows. The objective of the second constrained decomposition is to track the small temperature-induced changes without allowing large jumps in the coefficients. This is important because the MP decomposition does not result in orthogonal basis functions leading to large deviations of the sparse representation if only the order of the basis functions is changed. Therefore, the basis functions and their order resulting from the free MP algorithm are fixed and used to repeat the iteration rules (3.34), omitting the search for the maximum absolute inner product. Applying this approach to the analysis of the signals from the data set <sup>D</sup>Id,2 results in a set of coefficients for each signal, where each signal was recorded at a specific temperature. Although the second MP decomposition is restricted, the residual signal energy is still at −25 dB of the original energy, if the temperature range of interest ∈ [19 °C, 40 °C] is considered (see [142] for the model quality at all temperatures).

The absolute values and the phase values of the largest four coefficients are plotted against temperature in Fig. 3.13. Even though a single decomposed signal component cannot fully represent a physical propagation path, it is likely that some of the different signal components originate from different paths. This is shown by the different slopes of the absolute and phase values in Fig. 3.13. While the absolute values correlate with the coupling, the attenuation, and other effects inside the transducer, the phase values indicate the time delay of the component, and thus the SoS. Note that in contrast to the interfering signals, no temperature dependence of the absolute values could be observed for the propagation paths through the fluid. This leads to two possible conclusions: either it is a fictitious effect caused by the superposition of

Figure 3.13 Temperature dependency of interfering signals in the MP space for the first four basis functions.

different propagation modes with the subsequent MP decomposition, or the attenuation is exclusively due to effects in the Lamb wave propagation in the pipe wall. Either way, this effect has to be kept in mind for the later algorithmic approaches. From Fig. 3.13 the maximum quantitative effect of the temperature on the absolute value and the time shift is shown to be 0.61 % K−1 and 11.9 ns K−1, respectively. These values are calculated by the slopes of the red curve from the left plot and the purple curve from the right plot. Since the phase values need to be converted to time shifts via their frequency, the excitation frequency of 700 kHz was assumed. Comparing the time shift of the interfering signals 11.9 ns K−1 to the minimum effect on the target signals' time shift 55 ns K−1 shows that the assumption of stationarity of the interfering signals can lead to estimation errors if the temperature increases.

# 3.4 Simulated signals

In this section, the knowledge gained about the signal model and its dependence on process conditions is turned into a simulated data set to be used for the development of the filtering algorithms. There are advantages and disadvantages of these simulated signals. On one hand, the ground truth for the interfering signals and target signals is known. On the other hand, the signals are only a simplified version of real measurement signals and therefore, the performance of an algorithm on the

simulated data can only be transferred to real measurement data to a limited extent. Nevertheless, using simulated data is a good approach to prove the concept of a model-based approach if the assumptions on the signal model hold.

The simulated signals are composed of target signals with three propagation paths through the fluid, the interfering signals, and artificial AWGN. Firstly, a signal form for the target signals has to be designed. For this purpose, a Gaussian modulated cosine signal with a center frequency <sup>0</sup> = 700 kHz and a relative bandwidth of 0.25 is filtered by an artificial impulse response of a transducer, which is generated using a SPICE-simulated equivalent circuit of a piezoelectric transducer. The time delays are set according to Eq.(3.2), with a path angle of 50.12° and the three propagation distances <sup>1</sup> = 9.36 cm, <sup>2</sup> = 11.51 cm and <sup>3</sup> = 14.68 cm. Additionally, different amplitudes are set for the different paths to simulate a distance-dependent attenuation. Two types of data sets are generated each consisting of 1000 signals: one with a constant VoF and one with a rising and a falling ramp for the VoF as seen in Fig. 3.14(a). For both types, the temperature is set to start at 19 °C and to exponentially approach 40 °C. To calculate SoS in (3.2), Lubbers and Graaff [76] presented the simple quadratic approximation

$$c\_{\rm M}(T) = 1404.3 \,\text{m} \,\text{s}^{-1} + 4.7 \frac{\text{m} \,\text{s}^{-1}}{\text{°C}} \cdot T - 0.04 \frac{\text{m} \,\text{s}^{-1}}{\text{°C}^2} T^2. \tag{3.39}$$

Although they only claimed a high precision for ∈ [15 °C, 35 °C] a comparison to the data from the NIST webbook [67] shows a good agreement in the used temperature range ∈ [19 °C, 40 °C].

Subsequently to the simulation of the target signals, realistic interfering signals have to be generated. To this end, a propagating wave packet is defined once again by a Gaussian modulated cosine signal with <sup>0</sup> = 700 kHz. This wave packet at = 0 is propagated in the Fourier domain governed by the dispersion relation

$$s\_{\mathbf{i}}(t,x) = \mathcal{F}^{-1}\{S\_{\mathbf{i}}(f,x=0) \cdot \exp(-\mathbf{j}k\_{\mathbf{w}}(\omega)x)\},\tag{3.40}$$

with <sup>i</sup> (, = 0) denoting the Fourier transform of <sup>i</sup> () at = 0. As there are different Lamb wave modes, the signal energy of the wave packet is distributed to the first two symmetric and antisymmetric modes,

(b) Examples of target and interfering signals.

Figure 3.14 Simulated data set.



then individually propagated by (3.40) and finally superpositioned to generate the final realization of the interfering signals. An example signal from the generated data sets is drawn in Fig. 3.14(b). The three propagation paths through the fluid are easily recognizable. Furthermore, the interfering signals are extended over the whole time range, as it was also observable in real measurement signals.

In order to test the algorithms presented in this thesis once with and without model errors, the interfering signals are simulated once as stationary and once with a temperature-dependent amplification of 0.61 % K−1 and time shift of 11.9 ns K−1. These values are chosen to match the maximum observed amplification and time shifts in the measurement data. This, together with the decision of whether to use a constant VoF or a varying VoF, leads to a total of four data sets, which are summarized in Table 3.1.

# 4 Novel Separation Methods Based on Signal Dynamics

In this chapter, the first proposed class of algorithms to filter the interfering signals is presented. They are based on the assumption that the interfering signals are stationary, while the target signals show a variable time delay due to changes in the SoS induced by the temperature or the VoF (see (3.25)). In this thesis, these properties are called signal dynamics of the target signals. For this reason, the presented methods in this chapter will be summarized under the term signal-dynamics method (SDM). The first section, describing the requirements and the preliminaries, contains a new representation of the signal dynamics in form of point clouds. Subsequently, two methods to process the resulting point clouds are introduced in the following two sections leading to several methods for the estimation of the interfering signals. Both methods only evaluate the direct propagation path. Finally, each method is tested on the simulated data sets and concluded by a discussion of the advantages and disadvantages of the respective method.

# 4.1 Point cloud representation of consecutive measurements

#### 4.1.1 Requirements and preliminaries

The precondition to use signal dynamics for filtering stationary interfering signals is to have a package of multiple measurement signals. Furthermore, this package needs to be recorded while the varying process conditions lead to time-shifted target signals. In summary, let there

be measurements each for the reference <sup>r</sup> () and delayed signal <sup>d</sup> () combined to the vectors

$$\mathbf{s}\_{\mathbf{r}}(t) = [s\_{\mathbf{r}}(t, 1), \dots, s\_{\mathbf{r}}(t, m), \dots, s\_{\mathbf{r}}(t, M)]^{\rm T},\tag{4.1a}$$

$$\mathbf{s}\_{\mathrm{d}}(t) = [s\_{\mathrm{d}}(t, 1), \dots, s\_{\mathrm{d}}(t, m), \dots, s\_{\mathrm{d}}(t, M)]^{\mathrm{T}},\tag{4.1b}$$

where the measurement index denotes a single signal within the package. In the following, the process conditions , or the derived variables a , during a single measurement of the package are always denoted by the measurement index, e.g., , , a,. Since the processing takes place in the digital domain, the signals are only available in the timediscrete form

<sup>r</sup> = (<sup>r</sup> ( , )) ∈ ℝ× , (4.2a)

$$\mathbf{S}\_{\mathrm{d}} = \left( s\_{\mathrm{d}}(t\_n, m) \right)\_{mn} \in \mathbb{R}^{M \times N},\tag{4.2b}$$

with = <sup>s</sup> being the discrete time sampled at the sampling rate 1/<sup>s</sup> . To highlight the difference between quantities with physical dimensions such as and integer quantities such as , round brackets are used in the physical case and square brackets in the integer case. Furthermore, multiple equal independent variables that influence a signal or function are separated by comma in the argument, whereas hyperparameters or process conditions are separated by semicolon.

The signal model used for SDMs is simplified to contain only a single propagation path of the target signals, i.e., the sum in (3.25) is omitted. Cutting out only a small time range, in which the direct propagation path is included, is necessary to improve the validity of this model. To this end, a preprocessing searches local maxima in the envelope of the measurement signals and their difference signals Δ(). By introducing several regularizations in combination with choosing the local maximum with the smallest transit time leads to a robust detection of the correct time range. The regularizations used are peak prominence, peak height, necking in the envelope, and physically possible limits. The remainder of this chapter assumes that the time range, which the SDMs have to process, is known or determined by a preceding method.

(a) Cutout of a simulated measurement signal and its Hilbert transform. (b) 2D representation of complex-valued analytic signal.

Figure 4.1 Hilbert transform example and 2D representation of an analytic signal.

#### 4.1.2 Fundamentals on the Hilbert transform

Real-valued signals are characterized by the fact that the absolute values of their Fourier transform are symmetrical around = 0. In contrast, the Fourier transform of analytic signals is non-zero only for ≥ 0. For sinusoidal signals, this has multiple advantages, such as an easy determination of the phase, the instantaneous frequency, or the instantaneous amplitude. For these reasons, the measurement signals are firstly transformed into their analytic counterparts using the Hilbert transform [53]

$$s\_{\rm as}(t) = s(t) + \mathbf{j} \cdot \mathcal{H}\{s(t)\} \tag{4.3}$$

$$h = s(t) + \mathbf{j} \cdot s(t) \* h(t), \quad \text{with} \quad h(t) = \frac{1}{\pi t}. \tag{4.4}$$

However, since the signals are time-discrete, a discrete calculation of the Hilbert transform is necessary. This can be realized by calculating the discrete Fourier transform (DFT) of the signals, replacing the DFT coefficients that correspond to negative frequencies with zeros and finally transforming the resulting DFT values back to the discrete time domain by using the inverse DFT [86].

The resulting analytic signal is complex-valued as (4.3) implies. Therefore, in order to visualize the signal, it can be displayed either separately

by real and imaginary part or in a 2D area where the real and imaginary part are the axes. Figure 4.1 shows an example of the Hilbert transform applied to a signal from the simulated data set <sup>D</sup>sim,1. It is easy to see that the Hilbert transform produces a 90-degree shift of ( ), which leads to spiral trajectories in the complex area. Since the sampled values at each point in time are projected onto their real and imaginary values, the direct relation to the time vanishes in the complex area, which makes this representation invariant to time scalings or time shifts. The formulation

$$s\_{\rm as}(t) = A(t) \cdot \mathbf{e}^{\mathbf{j}\phi\_{\rm h}(t)} \tag{4.5}$$

shows that the phase position and the envelope can be calculated taking the argument and the absolute value of the analytic signal. Note that if the envelope is constant, the trajectory will form an ideal circle.

#### 4.1.3 Point clouds

Applying the time-discrete Hilbert transform, introduced in the previous section, to each measurement signal from the given signal package (4.2) leads to the complex-valued matrices r,as, d,as ∈ ℂ×. Subsequently, to interpret the signal package as point clouds, the real and imaginary parts are considered as abscissa (-coordinate) and ordinate (-coordinate), respectively, which yields one 2D point cloud per discrete time index for both the reference and delay signals

$$\mathbf{P}\_{\mathbf{r}}[n] = \begin{bmatrix} \text{Re}\{\mathbf{e}\_n^T \mathbf{S}\_{\mathbf{r}, \mathbf{as}}^T\} \\ \text{Im}\{\mathbf{e}\_n^T \mathbf{S}\_{\mathbf{r}, \mathbf{as}}^T\} \end{bmatrix} \in \mathbb{R}^{2 \times M} \tag{4.6a}$$

$$\mathbf{P}\_{\mathrm{d}}[n] = \begin{bmatrix} \mathrm{Re}\{\mathbf{e}\_{n}^{\mathrm{T}} \mathbf{S}\_{\mathrm{d},\mathrm{as}}^{\mathrm{T}}\} \\ \mathrm{Im}\{\mathbf{e}\_{n}^{\mathrm{T}} \mathbf{S}\_{\mathrm{d},\mathrm{as}}^{\mathrm{T}}\} \end{bmatrix} \in \mathbb{R}^{2 \times M},\tag{4.6b}$$

where denotes the -th unit vector and each point cloud consists of points.

An example of the point cloud representation is shown in Fig. 4.2. If the signals and their Hilbert transforms on the left side are sampled at the four discrete time steps indicated by the dashed lines, the four point clouds depicted on the right side are obtained. Each point cloud consists of as many points as there are measurement signals. The length,

Figure 4.2 Point cloud generation from measurement signals package.

curvature, and orientation of the point clouds is determined by the signal dynamics, the frequency, the amplitude, and the instantaneous phase at the corresponding time step. Although a high SNR was set in the simulated data sets, the influence of measurement noise is shown by the slight deviation of the point clouds from a perfect line.

Lastly, the origin of the effect that the point clouds of consecutive time steps are misaligned is discussed. The following considerations apply to both the reference and the delay signals. If the signal model (3.25) with stationary interfering signals and varying time delays ̃ is assumed, the signal package (4.1) can be reformulated to

$$\mathbf{s}\_{\mathbf{r}}(t\_n) = [s\_{\mathbf{t}}(t\_n - \tilde{\tau}\_1), \dots, s\_{\mathbf{t}}(t\_n - \tilde{\tau}\_M)]^\mathrm{T} + \mathbf{1}\_{M1} \cdot s\_{\mathbf{i}}(t\_n) + \mathbf{n}\_{\mathbf{r}}(t\_n) \text{. (4.7)}$$

In (4.7), the -by-1 column vector filled with ones 1 = {1} indicates that the interfering signals <sup>i</sup> ( ) are identical in each measurement of the package. Since the effect is the same, it does not matter if the time delay ̃ is caused by a varying absolute time delay a,, a varying timedelay difference or a combination of both. Repeating the necessary steps to get point clouds for this model results in

$$\mathbf{P}\_{\mathbf{r}}[n] = \mathbf{P}\_{\mathbf{t}}[n] + \mathbf{p}\_{\mathbf{i}}[n] \cdot \mathbf{1}\_{M1}^{\mathrm{T}} + \mathbf{P}\_{n\_{\mathbf{r}}}[n] \,. \tag{4.8}$$

This shows that each point cloud <sup>r</sup> [] is composed of a shape influencing component <sup>t</sup> [], an offset <sup>i</sup> [] determined by the interfering

signals, and a noise-induced stochastic component <sup>r</sup> []. For further investigation of the point cloud shape, suppose that the time delays ̃ in (4.7) cover the interval [lb, ub]. Thus, the component <sup>t</sup> [] can be considered as a discrete sampling of the analytic target signal in the interval = [ − ub , − lb]. For this reason, the shape of <sup>t</sup> [] is determined by the shape of the analytic target signal in the interval . Transferring this concept to other time steps, the next point cloud <sup>t</sup> [ + 1] shows the shape of the analytic target signal in the interval +1 = [+1 − ub , +1 − lb] and so on. A sufficiently small sampling time or a sufficiently large interval ∈ [lb, ub] leads to a non-empty intersection = ∩ +1. In other words, the shapes of consecutive point clouds overlap in cases where the offset <sup>i</sup> [] is zero. Otherwise, the misalignment

$$\mathbf{p}\_{\mathrm{i}}^{\Delta}[n] = \mathbf{p}\_{\mathrm{i}}[n+1] - \mathbf{p}\_{\mathrm{i}}[n] \tag{4.9}$$

is present between the point clouds <sup>r</sup> [] and <sup>r</sup> [ + 1]. This concludes all effects that are observable from the example point clouds.

## 4.2 PCA-based point cloud processing

This section describes how the point clouds presented in the previous section are processed by the PCA to filter the interfering signals in TDE. The point clouds can be processed in two ways: either they are used to first geometrically estimate the interfering signals followed by a simple subtraction to improve conventional TDE methods, or the TDE is estimated directly from the point clouds. Both methods assume a narrowband signal model with limited bandwidth, which was shown in Section 3.2.2 to be in good agreement with the measurement signal properties.

#### 4.2.1 Geometric estimation of interfering signals

The theoretical motivation to estimate the interfering signals geometrically is based on the narrowband signal model in analytic signals form

$$s\_{\rm r,as}(t;\tau) = A\_{\rm t} \,\mathrm{e}^{\rm i2\pi f\_0(t+\tau/2)} + s\_{\rm i,as}(t) + n\_{\rm r,as}(t) \,\prime \,\tag{4.10a}$$

$$s\_{\rm d,as}(t;\tau) = A\_{\rm t} \,\mathrm{e}^{\rm i2\pi f\_0(t-\tau/2)} + s\_{\rm i,as}(t) + n\_{\rm d,as}(t) \,, \tag{4.10b}$$

with the stationary interfering signals i,as(). In (4.10), the absolute time delay was omitted and a constant frequency was assumed, but nevertheless, the following argumentation for the interfering signals estimation is still valid, since the only precondition is the presence of the signals' derivative at two distinct time delays. In order to make the calculations more manageable, the analytic signal components are transferred to the 2D area equivalent to the point cloud representation. To this end the operation

$$vec : \mathbb{C} \mapsto \mathbb{R}^2, \quad x \mapsto \begin{bmatrix} \text{Re}\{x\} \\ \text{Im}\{x\} \end{bmatrix} \tag{4.11}$$

is introduced, which maps a complex number to a 2-by-1 vector. Applying the operation to all components from (4.10)

$$\begin{aligned} \mathbf{p}\_{\mathbf{r}}(t;\tau) &= \operatorname{vec}\left(s\_{\mathbf{r},\mathbf{as}}(t;\tau)\right), & \mathbf{p}\_{\mathbf{d}}(t;\tau) &= \operatorname{vec}\left(s\_{\mathbf{d},\mathbf{as}}(t;\tau)\right), \\ \mathbf{p}\_{\mathbf{t}}(t) &= \operatorname{vec}\left(A\_{\mathbf{t}}\operatorname{\mathbf{e}}^{\mathbf{i}2\pi f\_{0}t}\right), & \mathbf{p}\_{\mathbf{i}}(t) &= \operatorname{vec}\left(s\_{\mathbf{i},\mathbf{as}}(t)\right), \\ \mathbf{p}\_{n\_{\mathbf{r}}}(t) &= \operatorname{vec}\left(n\_{\mathbf{r},\mathbf{as}}(t)\right), & \mathbf{p}\_{n\_{\mathbf{d}}}(t) &= \operatorname{vec}\left(n\_{\mathbf{d},\mathbf{as}}(t)\right) \end{aligned} \tag{4.12}$$

results in the 2D signal model representation

$$\mathbf{p}\_{\mathbf{r}}(t;\tau) = \mathbf{p}\_{\mathbf{t}}(t+\tau/2) + \mathbf{p}\_{\mathbf{i}}(t) + \mathbf{p}\_{n\_{\mathbf{r}}}(t) \, . \tag{4.13a}$$

$$\mathbf{p}\_{\rm d}(t;\tau) = \mathbf{p}\_{\rm t}(t-\tau/2) + \mathbf{p}\_{\rm i}(t) + \mathbf{p}\_{n\_{\rm d}}(t) \,. \tag{4.13b}$$

Remembering the point cloud representation of the signal package, short sections of the target signals' trajectory are given by the point clouds, where the lengths of the sections are determined by the signal dynamics contained in the package. Even if the true trajectory of the target signals cannot be restored due to the sections being too short, it is possible to calculate the tangents to the observable trajectory pieces. These tangents r , <sup>d</sup> are proportional to the derivatives of the signals with respect to the time delay . By inserting the complex exponential function in the derivative and using Euler's formula, a relationship between the derivatives and the target signals can be established:

$$\frac{\mathrm{d}\mathbf{p}\_{\mathrm{r}}(t;\tau)}{\mathrm{d}\tau} = \pi f\_0 \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \cdot \mathbf{p}\_{\mathrm{t}}(t + \tau/2) \propto \mathbf{v}\_{\mathrm{r}\prime} \tag{4.14a}$$

$$\frac{\mathrm{d}\mathbf{p}\_{\mathrm{d}}(t;\tau)}{\mathrm{d}\tau} = \pi f\_0 \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \cdot \mathbf{p}\_{\mathrm{t}}(t - \tau/2) \propto \mathbf{v}\_{\mathrm{d}}\,. \tag{4.14b}$$

57

The rotation matrices in (4.14) are caused by the multiplication of the target signals with the imaginary units, resulting from the differentiation. Subsequently, by inverting the rotation matrices, the normals to the tangents

$$\mathbf{p}\_{\mathbf{t}}(t+\tau/2) = c\_{\mathbf{r}} \begin{vmatrix} 0 & 1 \\ -1 & 0 \end{vmatrix} \cdot \mathbf{v}\_{\mathbf{r}} =: c\_{\mathbf{r}} \mathbf{v}\_{\mathbf{r}}^{\perp} \tag{4.15a}$$

$$\mathbf{p}\_{\mathbf{t}}(t - \tau/2) = c\_{\mathbf{d}} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \cdot \mathbf{v}\_{\mathbf{d}} =: -c\_{\mathbf{d}} \mathbf{v}\_{\mathbf{d}}^{\perp} \tag{4.15b}$$

are given for both distinctive time delays /2 and −/2. Note that for a better overview, the time dependencies of the tangents <sup>r</sup> , <sup>d</sup> and the normals ⊥ r , <sup>⊥</sup> d are not displayed. Finally, an estimation of the interfering signals can be obtained by inserting the normals (4.15) into (4.13), neglecting the measurement noise, and reformulating the equation to i (). The resulting estimation is given by

$$\hat{\mathbf{p}}\_{\mathrm{i}}(t) = \mathbf{p}\_{\mathrm{r}}(t;\tau) - [1,0] \cdot [\mathbf{v}\_{\mathrm{r}}^{\perp}, \mathbf{v}\_{\mathrm{d}}^{\perp}]^{-1} \cdot (\mathbf{p}\_{\mathrm{r}}(t;\tau) - \mathbf{p}\_{\mathrm{d}}(t;\tau)) \cdot \mathbf{v}\_{\mathrm{r}}^{\perp} \text{ (4.16)}$$

which leads to the estimated interfering signals ̂<sup>i</sup> () = <sup>T</sup> 1 ̂i (). Taking into account the measurement noise, the estimation deviates from the true interfering signals values according to

$$\begin{split} \hat{\mathbf{p}}\_{\mathrm{i}}(t) &= \mathbf{p}\_{\mathrm{i}}(t) + \mathbf{p}\_{n\_{\mathrm{r}}}(t) \\ &\quad + [1, 0] \cdot [\mathbf{v}\_{\mathrm{r}}^{\perp}, \mathbf{v}\_{\mathrm{d}}^{\perp}]^{-1} \cdot \left(\mathbf{p}\_{n\_{\mathrm{d}}}(t) - \mathbf{p}\_{n\_{\mathrm{r}}}(t)\right) \cdot \mathbf{v}\_{\mathrm{r}}^{\perp} . \end{split} \tag{4.17}$$

From this equation, two conclusions can be drawn:


This concludes the time-continuous formulation of an estimator for the interfering signals based on the tangents to the trajectories.

The following is the practical implementation of the estimation based on the given discrete-time signals and their derived point clouds <sup>r</sup> [], <sup>d</sup> []. Instead of solving (4.16) for all times , the interfering signals need to be only estimated for the discrete time steps . To further reduce the effort, the time range can be limited to only the time range of the target signals, where the SNR is higher than a predefined threshold. After determining for which time steps (4.16) is to be solved, the next task is to calculate the tangents. For this purpose, the point clouds are processed using the PCA. The PCA adapted to the processing of 2D point clouds can be summarized in two steps. At first, the point clouds at the discrete time steps are processed via the rule

$$
\tilde{\mathbf{P}}[n] = \mathbf{P}[n] \cdot \left(\mathbf{I}\_M - \frac{1}{M} \mathbf{1}\_{MM}\right) \tag{4.18}
$$

with denoting the -by- identity matrix, for the reference and delay signals respectively, which leads to the offset-free point clouds ̃ r [] and ̃ d []. In the second step, the eigenvalue decompositions

$$\boldsymbol{\Sigma}\_{\rm r}[n] = \mathbf{V}\_{\rm r}[n] \cdot \mathbf{A}\_{\rm r}[n] \cdot \mathbf{V}\_{\rm r}^{1}[n] \,, \tag{4.19a}$$

$$\mathbf{\dot{\Sigma}\_{\rm d}}[n] = \mathbf{V}\_{\rm d}[n] \cdot \mathbf{A}\_{\rm d}[n] \cdot \mathbf{V}\_{\rm d}^{\rm T}[n] \tag{4.19b}$$

of the estimated covariance matrices

$$\boldsymbol{\Sigma}\_{\rm r}[n] = \frac{1}{M} \boldsymbol{\tilde{\mathbf{P}}}\_{\rm r}[n] \cdot \boldsymbol{\tilde{\mathbf{P}}}\_{\rm r}^{\rm T}[n] \in \mathbb{R}^{2 \times 2},\tag{4.20a}$$

$$\boldsymbol{\Sigma}\_{\rm d}[n] = \frac{1}{M} \boldsymbol{\tilde{\mathbf{P}}}\_{\rm d}[n] \cdot \boldsymbol{\tilde{\mathbf{P}}}\_{\rm d}^{\rm T}[n] \in \mathbb{R}^{2 \times 2} \tag{4.20b}$$

are calculated. For further information on the PCA and its multiple applications see Beyerer et al. [7]. In the present case, the eigenvector matrices <sup>r</sup> [], <sup>d</sup> [] are composed of two eigenvectors ∈ ℝ<sup>2</sup> each. The respective eigenvector with the largest corresponding eigenvalue, i.e., the largest corresponding diagonal entry of , is called the first principal component and signifies the direction in the 2D area with the largest variance of the point cloud. Therefore, the resulting matrices <sup>r</sup> [] and <sup>d</sup> [] can be used to get an estimate for the tangents by using the respective first principal components <sup>r</sup> [] and <sup>d</sup> []. To complete the necessary variables for the time-discrete variant of (4.16), only the vectors

(a) Geometric estimation of interfering signals at using a constant amplitude assumption.

(b) Interfering signals estimation using the PCA-based geometric estimation with ( ̂i ) and without ( ̂ corr i ) constant amplitude assumption for the data set <sup>D</sup>sim,1. Ground truth is drawn in dashed.

Figure 4.3 Example of the PCA-based interfering signals estimation.

r [; ] and <sup>d</sup> [; ] are missing. These can be approximated using the mean value of the respective point clouds

$$\mathbf{p}\_{\mathbf{r}}[n;\tau] = \frac{1}{M} \mathbf{P}\_{\mathbf{r}}[n] \cdot \mathbf{1}\_{M1'} \tag{4.21a}$$

$$\mathbf{p}\_{\rm d}[n;\tau] = \frac{1}{M} \mathbf{P}\_{\rm d}[n] \cdot \mathbf{1}\_{M1} \,. \tag{4.21b}$$

An example of the approach for a single time step is shown in Fig. 4.3(a). There are four components shown: the interfering signals <sup>i</sup> [] (green arrow), the target signals of the reference and delay signals <sup>r</sup> , <sup>d</sup> (blue and red arrows), their respective point cloud (blue and red points), and the resulting first principal components (dashed lines). This figure also explains why the solution of (4.16) can be interpreted as a geometric intersection between two normal vectors. Repeating the estimation for all time steps ∈ [70 µs, 75 µs] leads to the estimated interfering signals ̂i ( ) in Fig. 4.3(b). However, it is easily observable that the deviation from the ground truth is not to be neglected. There are several sources for the estimation errors: the measurement noise, non-uniform sampled

point clouds, the tangent estimation, model errors such as the violation of the constant amplitude assumption.

Since the assumption of constant amplitude is the most restrictive condition, it is relaxed by adjusting the normals estimation. The proposed normals correction is presented in the following paragraph, where due to the redundancy of the equation only the equations for the reference signals are displayed. The delay signals are processed equivalently. Again, the derivation of the adjusted method is based on the signal model. Adding a non-constant amplitude to the signal model (4.10) leads to the extended model

$$s\_{\rm r,as}(t;\tau) = A(t+\tau/2) \,\mathrm{e}^{\mathrm{j}2\pi f\_0(t+\tau/2)} + s\_{\rm i,as}(t) + n\_{\rm r,as}(t) \,. \tag{4.22a}$$

It follows that the derivative of the signals

$$\frac{\mathrm{d}s\_{\mathrm{r,as}}(t;\tau)}{\mathrm{d}\tau} = \left(\frac{\mathrm{d}A(t+\tau/2)}{\mathrm{d}\tau} + \mathrm{j}A(t+\tau/2)\pi f\_0\right) \mathrm{e}^{\mathrm{j}2\pi f\_0(t+\tau/2)} \tag{4.23}$$

is extended by the derivative of the amplitude, whereby the rotation by 90° in (4.14) becomes a rotation by the yet unknown angle leading to

$$\frac{\mathbf{d}\mathbf{p}\_{\mathbf{r}}(t;\tau)}{\mathbf{d}\tau} \propto \begin{bmatrix} \cos(\varphi) & -\sin(\varphi) \\ \sin(\varphi) & \cos(\varphi) \end{bmatrix} \cdot \mathbf{p}\_{\mathbf{t}}(t+\tau/2) \propto \mathbf{v}\_{\mathbf{r}} \,. \tag{4.24}$$

Therefore, the definition of the normal vectors is changed to become the multiplication of the tangent <sup>r</sup> with the inverse rotation matrix, i.e.

$$\mathbf{p}\_{\mathbf{t}}(t+\tau/2) = c\_{\mathbf{r}} \begin{bmatrix} \cos(\varphi) & \sin(\varphi) \\ -\sin(\varphi) & \cos(\varphi) \end{bmatrix} \cdot \mathbf{v}\_{\mathbf{r}} =: c\_{\mathbf{r}} \mathbf{v}\_{\mathbf{r}}^{\perp} \,. \tag{4.25}$$

From (4.23) the angle

$$\varphi = \arctan\left(\frac{2\pi f\_0 A(t + \tau/2)}{A'(t + \tau/2)}\right) \tag{4.26}$$

can be deduced. However, since only the whole measurement signal with all additive components can be observed, the amplitude is not known before the interfering signals are subtracted. This can be solved by using the signal difference between the analytic reference and delay signals. The absolute value of this so calculated difference signal

$$|\Delta s\_{\rm as}(t;\tau)| = |s\_{\rm r,as}(t;\tau) - s\_{\rm d,as}(t;\tau)| \approx 2|\sin(\pi f\_0 \tau)| \cdot |A(t)| \tag{4.27}$$

is an approximation of the time-dependent amplitude curve, if the time delay is very small compared to the time duration of the envelope (). Even though the absolute value of the amplitude curve remains unknown, for the estimation of the angle only the ratio is needed. Therefore, the angle is estimated by

$$\hat{\varphi} = \arctan\left(\frac{2\pi f\_0 |\Delta s\_{\rm as}(t;\tau)|}{\mathrm{d} |\Delta s\_{\rm as}(t;\tau)| / \ \mathrm{d}t}\right) \,\mathrm{},\tag{4.28}$$

where the derivation is performed in the discrete time by a first-order divided difference.

In summary, for a non-constant amplitude, the approach is nearly the same as using the constant amplitude assumption. The only difference lies in the calculation of the normals, where an adapted angle is introduced. The result of this relaxation to non-constant amplitudes is displayed in Fig. 4.3(b) as ̂ corr i ( ). The expected improvement of the adjusted estimation is proven, as the deviation from the ground truth is significantly smaller than when assuming a constant amplitude.

#### 4.2.2 Interference-invariant time delay estimation

Apart from the method of first estimating the interfering signals and then simply subtracting them, the time delay between reference and delay signals can also be calculated directly from the point clouds. To this end, the stationary interfering signals are separated from the target signal by transforming the measurement signals into a space that is invariant to stationary signal components. Since this approach is already published in [143], a brief description of this transformation and the subsequent post-processing to determine the time-delay difference follows.

Stationary components in the point clouds have only an influence on the offset as proven by (4.8) in Section 4.1.3 and therefore have no influence on the shape or orientation of the point clouds. It follows that the offset-free point clouds retain only the shape influencing components and the orientation. Thus, by using the orientation of the offset-free point clouds, expressed by an angle, the interfering signals can be suppressed. However, since the measurement noise is also still included, a reliable estimate can only be obtained if the time delay variation of either the

(a) Time-dependent Hilbert and PCA phase values.

(b) Linear fit to the phase values.

Figure 4.4 PCA-based point cloud processing to get interference-invariant phase values of an estimation of the interfering signal.

absolute time delay <sup>a</sup> or the time-delay difference in the signal package is significantly larger than the variance of the measurement noise.

In order to calculate the orientation of the point clouds for both the reference and delay signals, the PCA is employed. From the respective first principal components, given by the eigenvectors <sup>r</sup> [], <sup>d</sup> [] ∈ ℝ<sup>2</sup> with the respective largest associated eigenvalues, the orientations are determined by

$$\phi\_{\rm PCA,r}[n] = \arctan\left(\frac{\mathbf{v}\_{\rm r}^{\rm T}[n] \cdot \mathbf{e}\_2}{\mathbf{v}\_{\rm r}^{\rm T}[n] \cdot \mathbf{e}\_1}\right) \tag{4.29a}$$

$$\phi\_{\rm PCA,d}[n] = \arctan\left(\frac{\mathbf{v}\_{\rm d}^{\rm T}[n] \cdot \mathbf{e}\_2}{\mathbf{v}\_{\rm d}^{\rm T}[n] \cdot \mathbf{e}\_1}\right). \tag{4.29b}$$

After solving the PCA equations (4.18)–(4.20) and calculating the angle (4.29) for each time step , the measurement signal is completely transformed into a space invariant with respect to stationary signals.

However, the direction of each eigenvector is ambiguous, i.e., it is not possible to influence the PCA whether the eigenvector is rotated by 180 degrees or not. Due to this problem, there are still random phase jumps present in the phases calculated according to (4.29), which need

to be removed before the phase values can be further processed. For this reason, a case distinction is included in the phase calculation, adding if <sup>T</sup>[] ⋅ <sup>2</sup> < 0. Subsequently, the phases thus obtained are mapped linearly to the range [−, ]. In summary, for both the reference and delay signals, the phase calculations in (4.29) are adapted to

$$\phi\_{\text{PCA}}[n] = 2 \cdot \arctan\left(\frac{v\_2[n]}{v\_1[n]}\right) + \begin{cases} -\pi & v\_1[n] \ge 0, \ v\_2[n] \ge 0 \\ \pi & v\_1[n] \ge 0, \ v\_2[n] < 0 \\ 0 & v\_1[n] < 0 \end{cases} \tag{4.30}$$

where the components of the eigenvectors <sup>T</sup>[] ⋅ <sup>1</sup> and <sup>T</sup>[] ⋅ <sup>2</sup> are abbreviated to <sup>1</sup> [] and <sup>2</sup> [], respectively.

The course of the orientations is shown in Fig. 4.4(a). It should be noted that the orientations contain only a reliable information in time ranges with a significant level of target signals. Outside these time ranges, these orientations are determined only by the measurement noise and carry no information about the time-delay difference. The time range shown in Fig. 4.4(a) was selected on the basis of the transit time of the direct propagation path. Furthermore, the phase values obtained by directly using the Hilbert transform are included in the plot of the phase values. It is observable that the solution for the ambiguity of the eigenvector direction leads to double slope compared to the Hilbert phase <sup>h</sup> [].

To extract the time-delay difference from the PCA,r[], PCA,d[] orientations, they are separated by the procedure proposed by Kupnik et al. [63] at the 2 jump points and approximated by linear regressions ̃ PCA,r( <sup>s</sup> ), ̃ PCA,d( <sup>s</sup> ), as shown in Fig. 4.4(b). The best linear regression is selected based on the eigenvalue ratio of the associated orientations. This ensures that the linear regression with the best SNR is used. The intersection points with the time axis can then be determined from the linear regressions, and thus, by simply taking the difference, the time-delay difference is calculated.

However, a closer look at the method shows that only one time-delay difference per point cloud can be estimated, i.e., only the average timedelay difference of a whole signal package results from this method. Further information and a quantitative evaluation of the performance can be found in [143].

#### 4.2.3 Discussion

Both PCA-based processing methods are tested on the simulated data sets. Before they can be applied, compliance with the preconditions of the SDMs has to be ensured, i.e., the measurement signals need to be separated into signal packages. The included signal dynamics per package is a free to choose hyperparameter that controls the robustness against noise, but also limits the precision of the tangent estimation since the PCA is designed for linearly shaped point clouds. Furthermore, the dynamic behavior of a measurement system based on this concept is limited, since the interfering signals can only be reliably filtered after the required number of measurements has been recorded. For the proof of concept, the signal dynamics per package is set to 50 ns, which corresponds to a phase shift of 12.6° at a center frequency of about 700 kHz. To obtain estimates of the time delays for all measurements, the signal packages are selected with overlap such that the entire range of measurements was uniformly covered. Because the geometric approach estimates and subtracts the interfering signals, before the zero-crossing-based TDE approach from Kupnik et al. [63] is applied, this SDM enables a TDE for each signal within the package. On the other hand, the interference-invariant SDM yields only one TDE per package. Due to the overlapping signal packages, the geometric SDM results in multiple TDEs for the same measurement, which are therefore averaged to get comparable TDE curves. For easier notation later in Chapter 6, the set of hyperparameters for the PCA-based SDM

$$\boldsymbol{\theta}^{\text{PCA}} = \{\Delta \tau, \text{ConstAmp}\} \tag{4.31}$$

is defined as the collection of a signal dynamics per package parameter Δ and, in case of the geometric approach, a Boolean value , which indicates whether or not the constant-amplitude assumption is used.

The results of the proposed PCA-based SDMs are pictured in Fig. 4.5. As already expected, the interference-invariant approach performs badly for data sets with fastly varying VoFs, e.g., <sup>D</sup>sim,2, <sup>D</sup>sim,4 since it can only calculate one time delay per package, which limits the allowed dynamic behavior. However, if the stationary interfering signals assumption holds and the rate of change is slow enough, this approach is most precise and

Figure 4.5 Results of TDEs using the PCA-based SDM for the simulated data sets <sup>D</sup>sim,1, … , <sup>D</sup>sim,4.

most robust against noise. The geometric approach can alleviate the limited dynamics, which is proven by the fact that the results also perform well in data sets with varying VoF. However, the expected improvement using the non-constant amplitude extension did not occur. One possible explanation is that the incorrectly estimated interfering signals when assuming a constant amplitude are phased in such a way that they have only a small negative effect on the TDE. It is actually also the case that the extended method often shows worse behavior both against noise and against the systematic error. While the behavior against noise is to be

expected, since the estimated correction angle depends on the noisy difference signal, the systematic error can only be explained by the phasing of the residual interfering signals after subtraction. A non-oscillating constant positive TDE error indicates that the residual interfering signals are after correction always in 180 degrees phase shift to the target signals. This leads to the explanation that the amplitude estimation used for the angle determination is the source of error, since the difference signal is an input quantity moving with the process influence quantities to the same extent as the target signals. The last insight that can be gained from the figure is the quantitative influence of non-stationary interfering signals. If the interfering-signals variations—with the quantitative descriptions determined in Chapter 3—are assumed, the TDE error is within ≈ ±2 % (see lower left of Fig. 4.5).

# 4.3 B-spline-based point cloud processing

The performance of PCA-based point cloud processing depends on the extent to which the point clouds are linearly shaped. Therefore, the maximal signal dynamics per package have to be limited, which can be achieved by forming packages in which the phase of the target signals changes only by a maximum of 25°. However, using higher signal dynamics would allow better robustness against noise. Since the shape of the point clouds becomes more similar to sections of spirals, if the signal dynamics are increased, a curved approximation becomes necessary. In this thesis, this problem is solved by applying B-splines to the approximation problem. They are fast to calculate and can approximate every multidimensional line as a parametric curve where the continuity of the curve and its derivatives can be easily determined by manipulation of the knot vector. Furthermore, the application of B-splines removes the requirement of having uniformly distributed points within the point clouds, which in fact only depends on the process conditions during the available measurement signals. In this section, the B-spline-based SDM that was briefly published in [141] is described in detail. The presentation of the approach is structured as follows. Firstly, the fundamentals of B-splines are introduced, followed by two 2D approximation approaches to get robustly estimated B-splines. Subsequently, one algorithm to estimate the interfering signals and one algorithm to estimate the time delays directly as in Section 4.2.2 are presented. Finally, the concept is proven and discussed by application to the data sets <sup>D</sup>sim,1, <sup>D</sup>sim,2 where the signal model holds, and also tested against signal model errors by application to <sup>D</sup>sim,3 and <sup>D</sup>sim,4.

#### 4.3.1 B-spline fundamentals

The following fundamentals on B-splines are a summary of selected contents from [25]. B-splines, also known as basis splines, are piecewise defined polynomial functions with limited support, whose derivatives have adjustable continuity properties. Here, basis functions mean that any other spline function can be expressed as a linear combination of B-splines for the same polynomial order. A unique set of B-splines is given by the B-spline order and a knot vector <sup>k</sup> = [<sup>0</sup> , … , −1] <sup>T</sup> with +1 ≥ defining the limits of the polynomial pieces. Using the given order and knot vector, the B-splines can be defined by the Cox-de Boor recursion formula [20, 25]

$$b\_{i,1}(r) = \begin{cases} 1 & r \in [r\_i, r\_{i+1}) \\ 0 & \text{else} \end{cases},\tag{4.32a}$$

$$b\_{i,l+1}(r) = \frac{r - r\_i}{r\_{i+l} - r\_i} b\_{i,l}(r) + \frac{r\_{i+l+1} - r}{r\_{i+l+1} - r\_{i+1}} b\_{i+1,l}(r) \,. \tag{4.32b}$$

The resulting set of functions ,() are called B-splines and fulfill the sum-to-one property

$$\sum\_{i} b\_{i,l}(r) = 1 \;/\quad \forall r \in [r\_0, r\_{I-1}) \tag{4.33}$$

within the interval [<sup>0</sup> , −1) spanned by the knot vector <sup>k</sup> . Furthermore, from the recursion formula (4.32) it can be seen that the number of intervals [ , +1) in which a B-spline is non-zero depends on the order. At order = 1 this is only one interval, and then the number of intervals increases with the B-spline order at the same rate. This means that, for example, B-splines consisting of quadratic polynomials ( = 3) are spread over three intervals. From the same relation, it follows directly

that for a B-spline of order at least + 1 knots have to be present, which enclose intervals. To ensure that this is always the case, the first and the last element are usually appended − 1 times to the given knot vector resulting in

$$\mathbf{r}\_{\mathbf{k}} = \overbrace{\underbrace{r\_0, \dots, r\_0}\_{l+1}, \overbrace{r\_0, r\_1}^{I}, \dots, \underbrace{r\_{I-2}, r\_{I-1}, r\_{I-1}, \dots, r\_{I-1}}\_{l+1}}^{I-1} \mathbf{r}$$

It should be noted that successive knots may be identical ( = +1), i.e., the intervals can be empty. In summary, the properties of the B-splines and its derivatives can be derived as follows based on the given knot vector:


An example of quadratic B-splines ( = 3) is shown in Fig. 4.6(a). It is observable that in each interval three B-splines are non-zero. Due to the given number of knots and the B-spline order, there are six different B-splines present in this example ( = + − 2). Except for the end points, the first derivatives of the B-splines are continuous at the knots, i.e., at ∈ {1, 2, 3}. Additionally, this example illustrates the necessity to add knots at the beginning and end of the knot vector, because without these additional knots, the splines in the intervals [0, 1] and [3, 4] do not have enough degrees of freedom to meet possible boundary conditions such as slope or endpoints.

(a) Quadratic B-splines with knot vector <sup>k</sup> = [0, 0, 0, 1, 2, 3, 4, 4, 4].

Figure 4.6 B-splines and application to an approximation of 2D points.

Finally, the B-splines can be used to generate any 2D spline curve through a linear combination

$$
\begin{bmatrix} x(r) \\ y(r) \end{bmatrix} = \sum\_{i=0}^{F-1} \begin{bmatrix} c\_{\mathbf{x},i} \\ c\_{\mathbf{y},i} \end{bmatrix} \cdot b\_{i,l}(r) \tag{4.35}
$$

with the control points [x,, y,] T. Of course, curves can also be generated in only one or more dimensions, but since in this work B-splines are used to approximate 2D point clouds, only the representation of 2D spline curves will be discussed here. The results of applying B-splines to approximate 2D points are depicted in Fig.4.6(b). Note that this curve is only possible due to the representation as parametric curve, because a conventional function () can only map a single -value to a given -value.

Another advantage of B-splines lies in the possibility to calculate the derivatives analytically. Since the B-splines are defined recursively by (4.32), it follows that the derivatives of B-splines can be calculated using the B-splines of order − 1:

$$\frac{\mathrm{d}b\_{i,l}(r)}{\mathrm{d}r} = l \cdot \left(\frac{b\_{i,l-1}(r)}{r\_{i+l} - r\_i} - \frac{b\_{i+1,l-1}(r)}{r\_{i+l+1} - r\_{i+1}}\right) \,. \tag{4.36}$$

#### 4.3.2 2D B-spline approximation

Using the B-spline theory, the shapes of the point clouds calculated from the measurement package according to (4.6) can be approximated with more degrees of freedom compared to the PCA. Therefore, for each given point cloud

$$\mathbf{P}[n] = \left[ \mathbf{p}\_1[n], \dots, \mathbf{p}\_m[n], \dots, \mathbf{p}\_M[n] \right] \in \mathbb{R}^{2 \times M},\tag{4.37}$$

the task is to find a 2D curve ̂() by linear combination of B-splines, which fits the given point cloud best in a least-squares (LS) sense. For simpler notation, the dependency on the discrete time step will be omitted for the remainder of this subsection. It is only important to keep in mind that the presented approximation must be applied for each point cloud of the reference signals <sup>r</sup> [] and delay signals <sup>d</sup> [].

The fitting problem of the spline curve to the given 2D points can be reformulated in the matrix form

$$\underbrace{\begin{bmatrix} p\_{11} & p\_{21} \\ \vdots & \vdots \\ p\_{1M} & p\_{2M} \end{bmatrix}}\_{\mathbf{P}^{T} \in \mathbb{R}^{M \times 2}} = \underbrace{\begin{bmatrix} b\_{0,l}(r\_{1}) & \dots & b\_{F-1,l}(r\_{1}) \\ \vdots & \ddots & \vdots \\ b\_{0,l}(r\_{M}) & \dots & b\_{F-1,l}(r\_{M}) \end{bmatrix}}\_{\mathbf{B}(\mathbf{r}) \in \mathbb{R}^{M \times F}} \underbrace{\begin{bmatrix} c\_{\mathbf{x},0} & c\_{\mathbf{y},0} \\ \vdots & \vdots \\ c\_{\mathbf{x},F-1} & c\_{\mathbf{y},F-1} \end{bmatrix}}\_{\mathbf{C} \in \mathbb{R}^{F \times 2}} + \mathbf{N} \tag{4.38}$$

with the approximation error ∈ ℝ×2 and the corresponding support points

$$\mathbf{r} = \begin{bmatrix} r\_1, \dots, r\_M \end{bmatrix}^T, \quad r\_m \in \{ \mathbb{R} \, | \, 0 \le r\_m \le 1 \}. \tag{4.39}$$

The problem (4.38) can usually be solved using an LS estimator. However, to build the Moore-Penrose inverse of (), besides a set of B-splines, a corresponding support point for each point within the point cloud is necessary. In simplified terms, this means that the support points need to be estimated first. Unfortunately, the determination of these support points influences the precision of the approximation and the complexity of the curves (), () a lot. In order to get smooth curves with the least possible curvature, the distance between two support points should be proportional to the distance of the corresponding points along the

spline curve, which in turn is only available after the support points are determined.

Consequently, the problem of finding the support points and the spline curve is solved jointly by the iterative method of Wang et al. [127]:

$$\begin{cases} \mathbf{r}^{(0)}: & \text{initial support points} \\ \mathbf{C}^{(i+1)} = \left(\mathbf{B}^{\mathrm{T}}(\mathbf{r}^{(i)})\mathbf{B}(\mathbf{r}^{(i)})\right)^{-1}\mathbf{B}^{\mathrm{T}}(\mathbf{r}^{(i)}) \cdot \mathbf{P}^{\mathrm{T}} \\ \mathbf{r}^{(i+1)} = \underset{\mathbf{r}}{\arg\min} \left\|\mathbf{B}(\mathbf{r}) \cdot \mathbf{C}^{(i+1)} - \mathbf{P}^{\mathrm{T}}\right\|\_{\mathrm{F}}^{2} . \end{cases} \tag{4.40}$$

Finding the initial support points (0) is realized by pre-ordering the points using a projection onto an ordering curve. Furferi et al. [37] presented an algorithm for ordering the points by piecewise progressing along the line. The objective is to find ordered contour points along the point cloud. In this approach, starting from a random point of the cloud, the direction in which to search the next contour point is determined using a PCA of the points in proximity. The radius of the circle to differentiate between points lying in proximity and further distanced points is gradually increased until the eigenvalues of the PCA reach a predefined ratio. After the radius is fixed, the next contour point is calculated to be the intersection of the first principal component with the circle. While the idea of determining contour points to get an ordering curve was taken up in this work, the PCA-based algorithm to find them could not keep the robustness requirements.

Therefore, a simpler but more robust approach, which selects contour points from the cloud, is developed. In the following context, contour points ̃ are distinguished from original points by a tilde. Firstly, the number of contour points has to be set. Examination of different point clouds from the simulated data sets has proven that it is favorable to select ten contour points, thus, this hyperparameter is fixed to ten. Since the trajectories do not have loops and are only slightly curved even for higher signal dynamics (see Fig. 4.2), the two points with the maximal distance between each other represent the start and end points

$$\{\tilde{\mathbf{p}}\_1, \tilde{\mathbf{p}}\_{10}\} = \operatorname\*{arg\,max}\_{\mathbf{p}\_i, \mathbf{p}\_j} \left\| \mathbf{p}\_i - \mathbf{p}\_j \right\|\_2 \quad 1 \le i, j \le M. \tag{4.41}$$

(b) Normalized squared distance of 2D points to B-spline curve depending on the iteration.

Figure 4.7 Visual example of iterative 2D B-spline approximation.

The remaining eight support points ̃ are selected in a way that they are distributed as uniformly as possible over the maximum point distance:

$$\tilde{\mathbf{p}}\_m = \operatorname\*{arg\,min}\_{\mathbf{p}\_j} \left\| \left\| \mathbf{p}\_j - \tilde{\mathbf{p}}\_1 \right\|\_2 - \frac{m-1}{9} d\_{\text{max}} \right\|\_{\text{\textquotesingle}}, \ 2 \le m \le 9 \,, \tag{4.42}$$

with

$$d\_{\text{max}} = \left\| \tilde{\mathbf{p}}\_{10} - \tilde{\mathbf{p}}\_{1} \right\|\_{2} \,. \tag{4.43}$$

After all contour points are found, their support points ̃ are estimated using the normalized cumulative Euclidean distance

$$\tilde{r}\_m = \frac{1}{\sum\_{j=2}^{10} \left\| \tilde{\mathbf{p}}\_j - \tilde{\mathbf{p}}\_{j-1} \right\|\_2} \cdot \begin{cases} 0 & m = 1 \\ \sum\_{j=2}^{m} \left\| \tilde{\mathbf{p}}\_j - \tilde{\mathbf{p}}\_{j-1} \right\|\_2 & m > 1 \end{cases} \tag{4.44}$$

The scaling of the support points ̃ can be arbitrarily chosen because the points to approximate are only given as 2D vectors, but the scaling needs to be adapted to the interval spanned by the knot vector. For simplicity, the scaling of the support points and correspondingly the knot vector are set to the interval [0, 1]. Subsequently, a B-spline approximation of

the contour points with the support points ̃ yields the ordering curve ̂ (0)(), which in turn is employed to find the initial support points (0) by orthogonal projection of the 2D points onto the ordering curve. Due to the analytic derivatives and formulations of the B-splines, this projection can be solved analytically by intersection of the normal vectors with the points.

Figure 4.7 depicts the results of finding the contour points ̃, the ordering curve, and the final spline curve. On the right side the convergence behavior of the summed squared orthogonal distances can be observed. Due to well-chosen initial support points, the spline curve converges after only a few iteration steps. The residual mean squared error (MSE) should not be significantly less than the standard deviation of the measurement noise, as this would indicate overfitting. In this case, the degrees of freedom have to be reduced by using a knot vector with fewer elements. For both the ordering curve and the iteratively approximated spline curve, the B-spline order and the knot vector are set to = 4 and <sup>k</sup> = [0, 0.5, 1]<sup>T</sup>, respectively.

#### 4.3.3 Interfering signals estimation

From the discussion about the misalignments of the point clouds in Section 4.1.3 it is clear that determining these misalignments yields the differential interfering signals in its vector form Δ i []. However, there are several problems to be solved. Firstly, the overlap between two point clouds [], [+1] is influenced by the interval , which in turn depends on the interval of the time delays. Hence, the length of the overlap is not known beforehand. Secondly, the misalignment cannot be calculated from the point clouds directly, because the distribution of the sampled time delays in the interval may be non-uniform or even contain gaps leading to points [] with no corresponding point ̃[ + 1] in the subsequent point cloud. This disqualifies the application of point set registration approaches such as *iterative closest point* since they require existing direct point correspondences. To circumvent this issue and to be more robust against AWGN, the shapes of the points clouds are first approximated by using B-splines as explained in the previous subsection, instead of calculating the misalignment between the point clouds. For

each time step , this results in two trajectories represented by the spline curves ̂ () and ̂+1(). As the shapes only overlap partially, finding the misalignment results in the problem called *partial shape matching* (see [141] for further reference).

To solve this problem, the trajectories ̂ (), ̂+1() are densely sampled and the misalignment is determined by minimization of a mean squared distance-based quality measure. Because the curves are only partially overlapped, the crux in the quality measure is to classify which points have a corresponding point in the other trajectory. To this end, the advantage that sampling the B-splines creates ordered points is exploited. Considering the direction of the trajectory, the end point of ̂ () which is guaranteed to have a correspondence on ̂+1() is easy to determine. Following from this approach, the misalignments used in the optimization are reduced to the differences

$$
\hat{\mathbf{p}}\_{i,n}^{\Delta}(r\_i) = \hat{\mathbf{p}}\_{n+1}(r\_i) - \hat{\mathbf{p}}\_n(0) \tag{4.45}
$$

at the sampled support points . Without loss of generality, Equation (4.45) assumes that the end point with guaranteed correspondence is ̂ (0), otherwise the control points of the spline curve can simply be reversed. Thus, the 2D optimization is simplified to a 1D search.

Given densely sampled support points ∈ [0, 1] and the misalignments from (4.45), the final estimation for the differential interfering signals can be calculated via

$$\hat{\mathbf{p}}\_{\mathbf{i}}^{\Delta}[n] = \underset{\hat{\mathbf{p}}\_{\mathbf{i},n}^{\Delta}(r\_{i})}{\arg\min} \sum\_{i=i}^{N} \min\_{j} \left\| \hat{\mathbf{p}}\_{n+1}(r\_{i}) - \hat{\mathbf{p}}\_{n}(r\_{j}) - \hat{\mathbf{p}}\_{\mathbf{i},n}^{\Delta}(r\_{i}) \right\|\_{2}. \tag{4.46}$$

Equation (4.46) minimizes a quality measure formed by accumulating the Euclidean distances of the respective nearest neighboring points, where the starting index is responsible for classifying which points have a correspondence.

The last step for estimating the interfering signals is the numerical integration of ̂ Δ i [] followed by offset correction. To this end, the rectangle rule is applied and the resulting cumulative sum is searched for local maxima. Since the interfering signals are bandpass signals, the average calculated over one period should be zero. This is exploited by

Figure 4.8 Example estimation of the interfering signals. Upper left: all point clouds. Upper right: B-Spline approximation of a single point cloud. Lower left: partial shape matching between two trajectories. Lower right: estimated differential interfering signals Δ i ( ) and their offset corrected integration. Figure adapted from previous publication [141].

calculating and subtracting the average value of the range between the found local maxima.

An example of the entire algorithm consisting of the point cloud generation, the B-spline approximation, the partial shape matching, and the numerical integration with offset correction is shown in Figure 4.8. For this illustration, a measurement package was extracted from the data set <sup>D</sup>sim,1. After clipping the time domain containing the target signals, the point clouds of the reference signals are depicted for each time step in the clipped time domain (upper left). Although not shown, each

point cloud is subsequently approximated by a spline curve (upper right), which in turn are used to calculate the misalignment for each time step (lower left). On the lower right, the estimated differential interfering signals are plotted in blue and their numerical integration in red. The shaded area marks the range, where the average for the offset correction was calculated. Due to the solution in the 2D area, the interfering signals are estimated in form of analytic signals, but these can be easily converted back to the real-valued interfering signals by simply taking the real part.

The last point to mention is which signals are to be used for the point cloud generation since the B-spline-based approach works with both the reference or delay signals. There are two options to deal with it: either the interfering signals are estimated once using the reference signals and once using the delay signals followed by averaging, or the point clouds are generated by concatenating the reference signals and the delay signals leading to [] = [<sup>r</sup> [], <sup>d</sup> []]. For easier notation, the hyperparameter set

 BS = {Δ, } , ∈ {individual, concatenated} (4.47)

is introduced, with the categorical variable which decides whether the point clouds for the interfering signals estimation should be concatenated or the point clouds of the reference and delay signals are generated and processed individually followed by an averaging.

#### 4.3.4 Joint shape fit and interfering signals estimation

The B-spline-based SDM proposed in the previous two subsections has two drawbacks. Firstly, using an individual B-spline approximation for every time step yields slight deviations in the shapes of the trajectories due to the noise leading to further errors when solving the partial shape matching problem. Secondly, the solution of the partial shape matching requires a lot of computational effort, since the approach is based on sampling the splines, where the high precision requirement of the misalignment requires a high sampling density. Therefore, a novel approximation method, named Joint B-Spline approximation, is presented that considers the knowledge that the shapes of the trajectories at different time steps are constant except for the misalignment. By using a

joint shape fit and misalignment estimation, a global optimum can be reached. To accomplish this, the fitting problem (4.38) is extended by the misalignment, leading to the formulation

$$\underbrace{\begin{bmatrix} \mathbf{P}^{\mathrm{T}}[n] \\ \mathbf{P}^{\mathrm{T}}[n+1] \end{bmatrix}}\_{\mathbf{\tilde{P}}^{\mathrm{T}}[n] \in \mathbb{R}^{2M \times 2}} = \underbrace{\begin{bmatrix} \mathbf{B}(\mathbf{r}[n]) & \mathbf{0}\_{M1} \\ \mathbf{B}(\mathbf{r}[n+1]) & \mathbf{1}\_{M1} \end{bmatrix}}\_{\mathbf{\tilde{B}}(\mathbf{\tilde{r}}) \in \mathbb{R}^{2M \times (F+1)}} \underbrace{\begin{bmatrix} \mathbf{C} \\ \mathbf{p}^{\Delta,T}\_{\mathrm{i}}[n] \end{bmatrix}}\_{\mathbf{\tilde{C}} \in \mathbb{R}^{(F+1) \times 2}} + \begin{bmatrix} \mathbf{N}[n] \\ \mathbf{N}[n+1] \end{bmatrix} \tag{4.48}$$

with the ones vector 1 ∈ {1}, the zeros vector 1 ∈ {0} and the transposed misalignment Δ i []. Instead of fitting the B-splines to only the point cloud [], the same spline curve needs to approximate both consecutive point clouds [], [ + 1], with the second point cloud being shifted by the estimated misalignment. Inserting the modified matrices [] ̃ , ( ̃) ̃ and ̃into the iterative approach (4.40) results in both the final spline curve and the differential interfering signals at the time step . Finally, the numerical integration with offset correction as already explained in the previous subsection is repeated to obtain an estimate for the interfering signals.

However, the quality of the initial support point estimation deteriorates with increasing offset and the convergence is not guaranteed anymore as Figure 4.9 illustrates. Since the convergence rate is rather low and the LS estimation in (4.40) can be interpreted as the direction of a steepestdescent approach, the learning rate could be increased. To this end, a golden-section search was implemented, but the results showed that manipulating the learning rate during the iterative optimization makes the convergence unstable. For this reason, the line search was omitted when applying the joint B-spline. Nevertheless, good results could be achieved by increasing the iterations by factor 10 compared to the B-spline approximation of each individual point cloud. Overall, the computational effort was still below that of the individual approach, because the effort for the partial shape matching could be saved.

#### 4.3.5 B-spline-based direct time delay estimation

Equivalent to the direct TDE using the angle of the first principal component vector, the spline curves can also provide a tangent line. Furthermore, due to the estimation as a curved line, it is possible to determine distinct

(a) Convergence of estimated misalignment to the true misalignment without (A) and with golden-section (B) algorithm for line search. All values are normalized by the true misalignment.

(b) Normalized squared distance of 2D points to B-spline curve depending on the iteration, without line search (A) and with golden-section line search (B).

Figure 4.9 Visual example of iterative joint B-spline and misalignment approximation.

angles for each measurement within the package, instead of only one orientation per point cloud. Because the B-splines can be derivated analytically, the angle calculation is fairly simple. The following steps must each be applied once to the point clouds resulting from the reference signals and the delay signals. At first, for each point [] the support point with the least Euclidean distance

$$\tilde{r}[m,n] = \operatorname\*{arg\,min}\_{r \in [0,1]} \left\| \hat{\mathbf{p}}\_n(r) - \mathbf{p}\_m[n] \right\|\_2 \tag{4.49}$$

has to be computed, with ̂ () being the spline curve approximated to the respective point cloud []. Secondly, the analytical derivative

$$\mathbf{v}\_{\rm BS}[m,n] = \frac{\mathbf{d}\hat{\mathbf{p}}\_n(r)}{\mathbf{d}r}\Big|\_{r=\tilde{r}[m,n]}\tag{4.50}$$

is calculated at the support points obtained in the previous step. As the spline curve is a weighted sum of the B-splines, the analytical derivative becomes

$$\frac{\mathrm{d}\hat{\mathbf{p}}\_n(r)}{\mathrm{d}r} = \sum\_{i}^{F-1} \begin{bmatrix} c\_{x,i} \\ c\_{y,i} \end{bmatrix} \cdot \frac{\mathrm{d}b\_{i,l}(r)}{\mathrm{d}r} \, \mathrm{} \tag{4.51}$$

where the derivation of the B-splines needs to be substituted by the expression given in (4.36). Finally, the angles

$$\phi\_{\rm BS}[m,n] = \operatorname{atan2}\left(\mathbf{e}\_2^T \mathbf{v}\_{\rm BS}[m,n], \mathbf{e}\_1^T \mathbf{v}\_{\rm BS}[m,n]\right) \tag{4.52}$$

of both the reference and delay signals are sorted per measurement index , resulting in BS r,[] and BS d,[]. From these angles, again using the same procedure already applied for the PCA-based direct TDE in Subsection 4.2.2, the time-delay difference ̂ is determined for each measurement index .

Since this direct TDE is only based on the spline curves, it can be used for both the individual B-spline and the joint B-spline approximation. Furthermore, similar to the B-spline-based interfering signals estimation, the hyperparameters specify the signal dynamics and whether the point clouds for reference and delay signals should be concatenated or processed individually (cf. the set defined by (4.47)).

#### 4.3.6 Discussion

The B-spline-based point cloud processing has several advantages but also disadvantages to the PCA-based SDMs. The first point to mention is that the B-spline-based SDM can also be applied if only either the reference or the delay signals are available. Furthermore, due to the estimation via the differential signal, which needs to be integrated before the interfering signals are acquired, the B-spline-based SDMs have a better robustness against noise. The next advantage lies in the dynamic behavior if the time delays are estimated directly from the tangent lines of the spline curves. Here, the B-spline-based point cloud approximation allows us to obtain an estimation for each signal within the measurement package rather than just an averaged value per package. The last point to mention is that the requirements on the point clouds are less restrictive.

Figure 4.10 Results of TDEs using the B-spline-based SDM for the simulated data sets <sup>D</sup>sim,1, … , <sup>D</sup>sim,4.

The variations of the time delay within a package may be larger on the one hand and on the other hand the distribution of the points within the point clouds are not required to be uniform.

However, the computational effort is considerably higher than for the PCA-based processing and due to the more complex algorithm, the method is more susceptible to outlier signals, e.g., zero signals due to loose contacts, fluctuating time shifts due to wrong trigger signals, etc. Especially, the performance of the joint-B-spline approach depends on the quality of the signals, since the convergence is not guaranteed. Therefore, outliers would have to be filtered out beforehand to get reliable estimates. The last expected disadvantage is the robustness against model errors. Because the algorithm is based on the invariance of the shape against time shifts of the target signals, any shape manipulating effect, such as varying interfering signals, should have a strong impact on the misalignment estimation. Of course, a changing amplitude of the target signals could also have a negative effect, but during the measurement system identification this effect was not observed and is therefore considered as unlikely.

In order to investigate the discussed properties, the joint-B-spline- and B-spline-based SDMs are applied to the four simulated data sets leading to the results plotted in Figure 4.10. Same as for the PCA-based approach, the signals are separated into packages with Δ = 50 ns, but due to the higher computational effort, the overlap was reduced to allow the calculations in reasonable time. From this follows the staircase-like curve that can be observed in the upper and lower left of Fig. 4.10. Furthermore, the estimate of the curvature of the splines at the endpoints of the trajectories deteriorates, resulting in poor B-spline- and joint-B-spline-based direct TDE at these points, which explains the discontinuities in the TDE for signals at the edges of the signal packages. Regardless, the figure shows that the spline-based approaches work in general but, as already suspected, show stronger deviations than the PCA-based approaches in case of model errors. The joint-B-spline-based processing can cope better with model errors, but still not as good as the PCA-based processing. In return, however, it is also allowed that strongly varying time delays are present in the measurement package for the direct TDE (see the upper and lower right in Fig. 4.10). Since the simulated data sets are generated with a high SNR, the robustness against noise cannot be investigated in this evaluation. Therefore, this point will be discussed later in the experimental results.

## 4.4 Global interfering signals compensation

In practice, it would not be usual to recalculate the interfering signals for each measurement package without taking into account the information given by already existing estimates of the interfering signals. While,

due to the interfering signals being slightly temperature dependent, it is beneficial to have an individual estimation per package, the robustness can be further improved using a globally valid estimation for all measurement packages. To this end, all estimations obtained from evaluation of the different packages are averaged after outliers were filtered out. Subsequently, all measurement signals can be compensated globally using the thus obtained interfering signals estimation, followed by the conventional zero-crossing-based TDE. This approach is referred to as global compensation SDM later in the results chapter, whereas the compensation of each measurement package with its individual estimation will be called local compensation SDM.

As already mentioned in the discussion of the B-spline-based processing, outlier points in the point clouds can greatly impact the quality of the estimation. To remove bad estimations, a new histogram-based outlier detection algorithm is introduced in the following. Given ̃different estimated interfering signals ̂<sup>i</sup> [, ̃], in the first step, a histogram

$$h\_i^n = \sum\_{\tilde{m}=1}^M \begin{cases} 1 & \hat{s}\_{\text{i}}[n, \tilde{m}] \in [b\_i^n, b\_{i+1}^n) \\ 0 & \hat{s}\_{\text{i}}[n, \tilde{m}] \notin [b\_i^n, b\_{i+1}^n) \end{cases} \tag{4.53}$$

is calculated for each discrete time index , where ℎ denotes the number of samples that fall within the interval [ , +1). These intervals, also called bins, are determined for each time index separately. For this, the next lower power of ten to Scott's normal reference rule [106] is chosen as bin width. Then, the intervals are uniformly distributed over the maximum interval [ min ̃ ̂<sup>i</sup> [, ̃] , max ̃ ̂<sup>i</sup> [, ̃]], determining the time-index-specific interval limits and thereby the number of bins. In other words, the interfering signals sampled at the time index are treated as a collection of values, which is then used as input for an automatic binning algorithm. Subsequently, the bins are applied to get a histogram of the collection of values. As this procedure is repeated for all time steps, a set of histograms is eventually obtained. With the help of these histograms, a function

$$\tilde{h}^n: \mathbb{R} \mapsto \mathbb{N} \,, \quad s \mapsto \begin{cases} h\_i^n & s \in [b\_i^n, b\_{i+1}^n] \\ \vdots & \\ h\_j^n & s \in [b\_j^n, b\_{j+1}^n] \end{cases} \tag{4.54}$$

can be defined for each time step, which assigns to a continuous value the number of observations that lie in the interval in which the value also lies.

In the second step of the outlier filtering, the histogram mapping function ℎ̃ is applied to create a quality measure that indicates which signals are less likely to be outliers and which are more likely to be outliers. Since a signal sample being in the bin with the highest number of observations indicates that it belongs to the majority, a higher value means a signal that is less likely to be an outlier. Obviously, the signals do not consist of only one sample. Therefore, the number of observations for each signal ̂<sup>i</sup> [, ̃] are accumulated over the samples in the quality measure

$$J[\tilde{m}] = \sum\_{n=1}^{N} \tilde{h}^n(\hat{s}\_{\text{i}}[n, \tilde{m}])\,. \tag{4.55}$$

Finally, a threshold determines for which [ ̃] the signals are considered outliers. An automatic threshold estimation can be found by sorting [ ̃]. If the first and the last point of the found quality measure are interpreted as two 2D points connected by a straight line, the point on the quality curve with the largest Euclidean distance to this straight line is taken as the threshold, i.e., the strongest bend in the curve is used as separation point. The last task is to remove the outliers and average the remaining signals to get a globally determined estimate of the interfering signals ̂<sup>i</sup> ( ). The results of using this globally determined estimate will be shown in the experimental results chapter.

# 5 Feature-Based Regression Method

A completely different approach, which is not based on compensating the interfering signals but on estimating the time delays robustly against interfering signals, is presented in the following. In principle, features are extracted from a time-frequency representation of the signals, from which the time-delay difference is subsequently estimated with the aid of a regression. To present this method, the fundamentals of the used timefrequency representation are given in the first section, followed by an overview of the algorithm in the second section, the feature design in the third section, and a feature subset selection in the fourth section. Finally, the last section of this chapter examines different regression models and the corresponding training approaches for both the supervised and the unsupervised case.

# 5.1 Fundamentals on analytic wavelet packets

The analytic wavelet packet transform (AWPT) belongs to the class of multirate filter banks, just like, e.g., the discrete wavelet transform (DWT) [83]. The applications of these multirate filter banks include, but are not limited to, speech processing, image processing, data compression, and denoising as they are fast to calculate and invertible. Most of these applications take advantage of the fact that the multirate filter banks create a time-frequency representation of the input signal that provides information about the signal energy in a certain time and frequency window. Since the time-frequency resolution of the DWT is fixed to have a high frequency resolution at low frequencies and a high time resolution at high frequencies, the more flexible wavelet packet transform [19] was introduced. However, the drawback of the real-valued multirate filter banks is the missing shift-invariance due to the downsampling operation in the algorithm. Although many variants retaining the shift-invariance

were investigated in the early 2000s, such as the undecimated wavelet transform [116] or the double density DWT [108], this thesis uses the analytic wavelet packet transform, which was published independently by Selesnick [110] and Weickert et al. [128] at about the same time. There are two main reasons for this decision. Unlike the undecimated wavelet transform, the amount of data does not increase at each level, and moreover, the use of analytic basis functions allows the determination of the time shift based on the phase of the complex-valued coefficients just as in the Fourier transform.

The following subsections explain the structure of the AWPT and the principles to design the necessary tree-like multirate filters and filter impulse responses. The presented fundamentals are summarized from a collection of publications on the subject [24, 58, 107, 109–111, 128, 144].

### 5.1.1 Analytic wavelet packets in the context of time-frequency representations

Basically, the AWPT can be classified as a basis transform for discrete signals. Representing signals in other basis systems to extract information is a common technique. The most prominent basis transform is probably the Fourier transform which allows the analysis of the frequency composition of signals. In the discrete version of the transform, harmonic oscillations with different frequencies form an orthogonal basis, i.e., the inner product, which measures the similarity of two functions, of two distinct basis functions is zero. Other known basis transforms are the wavelet transform or the STFT, whose main difference to the Fourier transform is the compact support of the basis functions in the time range allowing a time-dependent statement about the frequency composition. Whereas the STFT can be interpreted as the Fourier transform of a timewindowed signal, the wavelet transform is defined as the inner product between a signal and a time-stretched and time-shifted mother wavelet. In both cases, the form of the basic functions is determined by the window function and the wavelet, respectively. Even though the time-frequency resolutions can be adjusted by the window length for the STFT or the bandwidth or center frequency of the mother wavelet, the general traits regarding the time-frequency resolution of these transforms—the STFT

has a constant time-frequency resolution and the wavelet transform has a better frequency resolution at lower frequencies—are not changeable. Furthermore, the product of the frequency resolution <sup>f</sup> and the time resolution <sup>t</sup> is larger or equal to the Gabor limit 1/(4), where equality only holds for the Gaussian window function.

In this context, the DWT is the implementation of the real-valued wavelet transform for discrete signals by using filter banks with different sampling levels to replicate the time stretching of the mother wavelet. Therefore, it shares the same traits—better frequency resolution at lower frequencies and oscillating coefficients in some cases even for signals with a constant frequency. An extension of the DWT to use complex-valued basis functions equivalent to the DFT is called the dual-tree complex wavelet transform. This approach is already very close to the AWPT. The only extension missing to get the AWPT is the adaptive time-frequency resolution, i.e., different time-frequency resolutions can be set arbitrarily for different frequency levels within the Gabor limit. Basically, however, all presented time-frequency representations are still only basis transforms where the basis functions have certain properties, e.g., bandwidth, center frequency, time compactness.

Mathematically, a basis transform of a discrete signal [], e.g., the DFT, can be formulated as finding the coefficients [] needed to decompose it into the sum

$$s[n] = \sum\_{k=0}^{K-1} c[k] \,\psi\_k[n] \,. \tag{5.1}$$

Although the signal [] may in principle be real-valued or complexvalued, this thesis only considers real-valued signals due to the origin of the signals from ultrasonic measurements. To get the coefficients, the Gram matrix of the new basis system

$$\Phi = \text{Basis}\left\{\psi\_k[n] \; , \; k = 0 \dots K-1\right\} \tag{5.2}$$

is inverted and multiplied by the vector resulting from building the inner product of the signal with each basis function []. Note that the decomposition in (5.1) is only unique if the number of basis functions is equal to the number of discrete samples . For > the function system, called frame or dictionary in this case, is overcomplete and the

coefficients are determined by imposing additional constraints. Otherwise ( < ), the system only spans a subspace of all possible signals leading to the situation that only an approximation of the signal can be found. In all cases, the inverse transform is simply building the sum (5.1).

A special case is given when the basis is orthonormal. Since the inner product of distinct basis functions equals zero and the signal energy of the basis functions equals one, the Gram matrix becomes an identity matrix. Therefore, the coefficients can be determined directly through projection of the signal onto the respective basis function

$$c[k] = \left\langle s[n], \psi\_k[n] \right\rangle\_n = \sum\_{n=0}^{N-1} s[n] \,\psi\_k[n] \,. \tag{5.3}$$

Even if the basis is only orthogonal, normalizing the signal energy of each basis function creates an orthonormal basis. Furthermore, for orthonormal bases applies: given the basis function [] the coefficient [] not only provides information about how much signal energy is in the time window around the mean time , but also at the same time, due to Parseval's theorem, how much signal energy is around the mean frequency . Here, the mean time , the mean frequency , the time duration t,, and the bandwidth f, are defined by the relationships

$$\overline{t}\_{k} = \sum\_{n=0}^{N-1} nt\_{\text{s}} \cdot \frac{\left\|\psi\_{k}[n]\right\|^{2}}{\left\|\psi\_{k}[n]\right\|\_{2}^{2}}, \qquad \sigma\_{\text{t},k} = \sum\_{n=0}^{N-1} (nt\_{\text{s}} - \overline{t}\_{k})^{2} \frac{\left\|\psi\_{k}[n]\right\|^{2}}{\left\|\psi\_{k}[n]\right\|\_{2}^{2}}, \tag{5.4}$$

$$\overline{f}\_{k} = \sum\_{n\_{k}=0}^{N-1} \frac{n\_{k}}{Nt\_{\text{s}}} \frac{\left\|\Psi\_{k}[n\_{k}]\right\|^{2}}{\left\|\Psi\_{k}[n\_{k}]\right\|\_{2}^{2}}, \quad \sigma\_{\text{f},k} = \sum\_{n\_{k}=0}^{N-1} \left(\frac{n\_{k}}{Nt\_{\text{s}}} - \overline{f}\_{k}\right)^{2} \frac{\left\|\Psi\_{k}[n\_{k}]\right\|^{2}}{\left\|\Psi\_{k}[n\_{k}]\right\|\_{2}^{2}}. \tag{5.5}$$

where Ψ [ ] denotes the DFT of the basis function [].

Summarizing the above-mentioned properties of basis transforms, where the objective is to get information about the time-frequency energy distribution of signals, the AWPT is designed to have the following properties:

The transform should use complex-valued analytic basis functions

$$\psi\_k^{\mathbb{C}}[n] = \psi\_k^{\text{Re}}[n] - \text{j }\psi\_k^{\text{Im}}[n] \text{ , with } \psi\_k^{\text{Im}}[n] = -\mathcal{H}\{\psi\_k^{\text{Re}}[n]\} \text{ , (5.6)}$$

where the real and the negative imaginary part build a Hilbert pair, i.e., the negative imaginary part is the Hilbert transform of the real part. Due to this property, the shift-invariance of the transform is preserved, although the downsampling operations are still included.

In order to calculate the coefficients efficiently as inner products later it will be shown that the AWPT applies discrete convolutions to perform this task—the basis functions ℂ [] should be constraint orthonormal, i.e.

$$\begin{aligned} \left\langle \psi\_i^{\text{Re}}[n], \psi\_j^{\text{Re}}[n] \right\rangle\_n &= 0.5 \delta\_{ij}, \\ \left\langle \psi\_i^{\text{Im}}[n], \psi\_j^{\text{Im}}[n] \right\rangle\_n &= 0.5 \delta\_{ij}, \quad 0 \le i, j \le N - 1, \\ \left\langle \psi\_k^{\text{Re}}[n], \psi\_k^{\text{Im}}[n] \right\rangle &\approx 0, \quad 0 \le k \le N - 1. \end{aligned} \tag{5.7}$$

Note that if condition (5.6) does perfectly hold, the inner product between the real and imaginary part is zero. However, in practice, as it will be shown later, the filter design using finite impulse response filters can only approximately design the real and imaginary parts as a Hilbert pair, since the Hilbert transform requires an infinite impulse response. As the set of real-part and imaginarypart basis functions each form a complete basis system, the inner product between real and imaginary parts from different frequency bands is not required to be zero.


#### 5.1.2 Dual-tree approach for multirate filter banks

A multirate filter bank such as the DWT or wavelet packet transform is built as a full binary tree, i.e., each node can have exactly two or zero children, where a node without children is called a leaf. In this context, a node consists of a lowpass or bandpass filter followed by a downsampling by two. Due to this, successive filtering steps lead to a dyadic reduction of the sampling rate, which is why these structures are also called multirate filter banks. Furthermore, the filtering influences the frequency windows that the coefficients represent, i.e., the frequency characteristic of the basis function corresponding to a node is determined by the sequence of filters that leads to the respective node. In principle, two main properties of this frequency characteristic are set. Firstly, as each node reduces the sampling rate, the time resolution is halved while the frequency resolution is doubled for each additional filter pair that is passed through. Secondly, the order of the passed filters determines the center frequency, e.g., a bandpass followed by a lowpass yields the center frequency = 5/8N, with <sup>N</sup> denoting the Nyquist frequency.

In summary, given the input coefficients [] of a node, the coefficients of the subsequent lowpass and bandpass children are calculated by

$$c\_{\rm LP}[n] = \left(c[\tilde{n}] \* g\_{\rm LP}[\tilde{n}]\right)\Big|\_{\tilde{n}=2n} \,\prime \tag{5.8a}$$

$$c\_{\rm BP}[n] = \left(c[\tilde{n}] \* g\_{\rm BP}[\tilde{n}]\right)\Big|\_{\tilde{n} = 2n'} \tag{5.8b}$$

with the impulse responses of the lowpass filter LP[] and the bandpass filter BP[]. Since the filtering results in two sequences of coefficients, each of which is subsequently downsampled by two, the number of coefficients at each level remains constant. Note that the information lost due to discarding half of the samples after each filtering can only be recovered if the impulse responses are designed to hold certain properties, which will be discussed in the next subsection.

An example of a full multirate filter bank representing the AWPT is depicted in Fig. 5.1. Additionally to the already mentioned relationships, a dual tree responsible for calculating the imaginary part of the coefficients is included. This approach can be explained as follows. Recursive application of (5.8) leads to the coefficient sequences of the nodes, which

Figure 5.1 Dual-tree representation of the analytic wavelet packets. The filter pairs are rearranged to obtain analytic basis functions and to order the center frequencies ascending with the subband (see Section 5.1.3).

represent the inner products of the corresponding basis functions with the input signal. However, since the calculations are performed all in the real-valued domain, using a single tree yields only the real part of the coefficients Re[], or in other words, only the basis functions Re [] are implemented by a single tree. The dual-tree approach adds a second tree and slightly modifies the impulse responses to guarantee that the implicit basis functions of the original tree Re [] and the dual tree Im [] build a Hilbert pair—the order of the filter impulse responses is modified to get the correct behavior with respect to the center frequency and analytic basis functions, as explained in the following subsection. With this, the final complex-valued coefficients can be obtained by

$$c\left[k\right] = \left\langle s\left[n\right], \psi\_{k}^{\mathrm{Re}}\left[n\right] - \mathbf{j}\,\psi\_{k}^{\mathrm{Im}}\left[n\right]\right\rangle\_{n}$$

$$= \left\langle s\left[n\right], \psi\_{k}^{\mathrm{Re}}\left[n\right]\right\rangle\_{n} + \mathbf{j}\left\langle s\left[n\right], \,\psi\_{k}^{\mathrm{Im}}\left[n\right]\right\rangle\_{n}$$

$$= c^{\mathrm{Re}}\left[k\right] + \mathbf{j}\,c^{\mathrm{Im}}\left[k\right].\tag{5.9}$$

Figure 5.2 Time-frequency resolutions of two different tree structures. The standard time-frequency resolution of the DWT is shown on the right side.

Another important aspect that has been neglected so far is the nomenclature of the coefficient indices. Until now the coefficients were only indexed by the discrete frequency step . However, a detailed examination of the multirate filter bank shows that each node is uniquely defined by its level and subband. Furthermore, each node contains multiple coefficients due to the time index .

Firstly, the node nomenclature used in this thesis—and in the previous publication [144]—is explained. Each node is described by the tuple

$$(s\_{\mathcal{W}}, k\_{\mathcal{W}}) \in \mathbb{N}^2 \quad \text{with} \quad s\_{\mathcal{W}} \ge 0 \text{ } \boldsymbol{k}\_{\mathcal{W}} \in [0, 2^{s\_{\mathcal{W}}} - 1] \text{ } \tag{5.10}$$

where <sup>W</sup> denotes the level of the node and <sup>W</sup> denotes the subband within this level. Starting from the root with the tuple (0, 0), the following nodes are determined recursively by a parent-child relationship, in which the children are defined as

$$\text{lowpass child} \quad (s\_{\text{w}} + 1, 2k\_{\text{w}}) \,, \tag{5.11a}$$

$$\text{bandwidth} \quad (s\_{\text{w}} + 1, 2k\_{\text{w}} + 1) \,\text{.}\tag{5.11b}$$

if the parent node is given by (<sup>W</sup> , <sup>W</sup> ).

Secondly, the time index of the node coefficients is set in relation to the time-frequency window that corresponds to each individual coefficient. While the mean frequency is completely determined by the node identifier (<sup>W</sup> , <sup>W</sup> ), the mean time is determined by the time index of the

coefficient sequence. Unfortunately, the coefficients cannot be arranged in the form of a matrix, because the coefficient sequences W,<sup>W</sup> [] from different nodes are not always of the same length. Therefore, the coefficients are vectorized into a linear arrangement by simple concatenation of the different coefficient sequences and indexed by , which in this context specifies both the mean frequency and mean time. Due to this, a map needs to be stored containing the level, the subband, and the time index for each index . Two examples for different tree structures, their coefficient notations W,<sup>W</sup> [], and their time-frequency resolutions indicated by rectangular windows are depicted in Fig. 5.2. Due to the limited space the time duration is shortened to include only eight samples, which can be concluded from the number of coefficients in the first level (<sup>W</sup> = 1) being four.

#### 5.1.3 Filter design and arrangement

The desired properties of the AWPT stated in Subsection 5.1.1 and the existence of its inverse transform are strongly dependent on the design of the filters and their arrangement. Since the focus of this thesis does not lie on the filter design, only the fundamental approaches are presented. For further reference see the works of Daubechies [24], Selesnick [107, 109] and Kingsbury [58].

In the case of the conventional DWT without a dual tree and with non-analytic basis functions, Smith and Barnwell [115] have proven that a bandpass-lowpass filter pair with the relationship

$$g\_{\rm BP}[n] = (-1)^n g\_{\rm LP}[N - n] \tag{5.12}$$

allows a perfect reconstruction, which is in this context the inverse transform, if the perfect reconstruction condition

$$\begin{aligned} G\_{\rm LP}(z)G\_{\rm LP}(z^{-1}) + G\_{\rm LP}(-z^{-1})G\_{\rm LP}(-z) &= 2\\ \bullet \\ \sum\_{i} g\_{\rm LP}[i] \, g\_{\rm LP}[i+2n] &= \delta[n] = \begin{cases} 1 & n=0\\ 0 & n \neq 0 \end{cases} \end{aligned} \tag{5.13}$$

is fulfilled. Such a filter pair is called conjugate-quadrature-filter and leads to orthogonal filters. The reconstruction or synthesis filters ℎ[], which are used in the mirrored synthesis tree not shown in this thesis, can then be derived from the lowpass filter according to:

$$h\_{\rm LP}[n] = g\_{\rm LP}[N-n]\_{\prime} \tag{5.14a}$$

$$h\_{\rm BP}[n] = (-1)^{n+1} g\_{\rm LP}[n] \,. \tag{5.14b}$$

In summary, a set of four filters is necessary to perform the DWT and its inverse transform. Note that the perfect reconstruction condition (5.13) still contains degrees of freedom. One option to design such a lowpass filter is using Daubechies' approach

$$G\_{\rm LP}(z) = (1+z)^L Q(z) \, , \tag{5.15}$$

leading to the so-called Daubechies-N wavelet, where N denotes the resulting length of the filters [24]. In this context, the number of vanishing moments has to be set and a polynomial () has to be designed, which can be done by a spectral factorization. Furthermore, there are several more design methods, such as Symlets with improved symmetry or Coiflets where the resulting scaling function also has vanishing moments [82]. Whereas the standard Daubechies-N wavelets lead to a minimumphase system, the Symlets and Coiflets are nearly linear phase systems.

Until now, only the filter design for the real-valued DWT has been addressed. However, the extension to the AWPT requires further filters for the second tree and its reconstruction tree. Additionally to the perfect reconstruction conditions, the resulting wavelets need to build a Hilbert pair. As proven by Selesnick, this can be realized if the filters of both trees satisfy the half-sample delay condition

$$g\_{\rm LP}^{\rm Im}[n] = g\_{\rm LP}^{\rm Re}[n - 0.5] \,. \tag{5.16}$$

Since the filter impulse responses are discrete, this condition is not as trivial as it may seem and can only be solved approximately. Three prominent design approaches are the q-shift wavelets [58] and the linear-phase biorthogonal wavelets [59, 60] introduced by Kingsbury as well as the common factor solution presented by Selesnick [109]. For the latter, the vanishing moments of the resulting wavelets and the approximation accuracy of the half-sample delay condition can be set independently by two parameters.

Figure 5.3 Reason for exchanging the bandpass and lowpass filters after bandpass node in multirate filter bank (adapted from [56]).

However, Selesnick et al. [111] realized that even if the half-sample delay condition (5.16) is met perfectly, the resulting wavelets are only analytic if enough filters are passed through. This problem can be solved by using different filters in the first level, which satisfy

$$g\_{0, \rm LP}^{\rm lm}[n] = g\_{0, \rm LP}^{\rm Re}[n-1]. \tag{5.17}$$

Here, the different filters are highlighted by a subscripted 0. Because this condition is a lot easier to satisfy, ordinary Daubechies wavelets can be used in the first stage of the real-part tree and their one-sample delayed version in the imaginary tree. The corresponding filters for the synthesis trees can be calculated again by the relationships (5.12) and (5.14). In summary, a total of eight analysis filters and eight synthesis filters is necessary to implement the AWPT and its inverse.

Finally, the filter arrangement has to meet a few criteria to get analytic basis functions and to order the center frequencies ascending with the subband <sup>W</sup> . As Figure 5.3 illustrates, the downsampling shifts the bandpass parts BP[] into the lower frequency band, but with swapped lowpass (L) and bandpass (B) components. To account for this effect, the bandpass and lowpass filters must be swapped in the next stage (see Fig. 5.1). After another bandpass, the filters are swapped again. Due to the same effect, the filters of the imaginary- and real-part tree have to be swapped after the first bandpass (see Fig. 5.1). Otherwise, the basis functions would have only energy at negative frequencies. The last point to mention is that after reaching a node in the dual tree, where the basis functions are already analytic signals, the subsequent nodes in both the real and the imaginary trees need to have the same filters [110, 128]. For simplicity, the filters of the real-part tree are used in this case.

## 5.2 Algorithm design

The proposed approach for the robust time-delay difference estimation, in the following called the feature-based regression method (FRM), is inspired by classical machine learning methods. To this end, the data set needs to be split into a training and test set. Subsequently, feature vectors are extracted and, after a reduction of the dimensionality, used for a regression to estimate the time-delay differences.

There are two general classes of this approach: supervised and unsupervised. In both cases, the objective is to learn from a given training set how to estimate the correct time-delay difference given a measurement signal each for the reference and delay signal. However, in the supervised case, the corresponding ground truth, also known as labels, for the time-delay difference of the signals in the training set must be known. In contrast, unsupervised machine learning methods only need the training set without any corresponding labels. In practice, this means that a calibration in a scenario with known ground truth is necessary for the supervised approaches, whereas the unsupervised approaches can collect the data during operation and learn the regression from it. As a result, unsupervised approaches are closer to the SDMs or to conventional TDE methods and have higher acceptance in practical applications. For this reason, even though both supervised and unsupervised versions of the FRM are proposed, the focus lies on the unsupervised version. A supervised version of this class of approaches is already published in an own previous publication [145]. This chapter takes up the approach contained in [145] and extends it by adding further selection methods, regressions, and a solution approach for the unsupervised training.

The principle of the supervised FRM is outlined in Fig. 5.4. By omitting the label connections to the training of the feature selection and the regression model, the approach is modified to get the outline of the unsupervised FRM. Since the proposed feature extraction algorithm is based on the AWPT, the resulting number of features is much too large to get a stable regression. Therefore, a subsequent feature selection is added. The remaining features represent the input variables for the regression model. After the training, the selection mask for the features and the regression model parameters are fixed. Given a measurement from the test set, repeating the feature extraction, feature selection and regression

Figure 5.4 The principle of the FRM.

steps with the fixed mask and the trained regression model parameters leads to the final estimate of the time-delay difference.

A detailed description of the feature design, the feature selection approaches, and the regression model follows in the next three sections.

# 5.3 Feature design

As the system identification in Chapter 3 has shown, the measurement signals are band-limited and the contained information is time-dependent. Therefore, the signals can be sparsely represented in a different basis, e.g., the basis obtained by employing the AWPT. Since the AWPT coefficients represent a corresponding time-frequency window, they can be used to incorporate only relevant frequency parts and time ranges. In this section, three types of features are presented, two of which are used for the subsequent regression and one of which is an auxiliary variable used for the feature selection. Since the features are grouped in the form of matrices, the value range of the discrete frequency index is changed from [0, − 1] to [1, ] to be consistent with the matrix indices.

To begin with, the first step of the feature design is the AWPT of the reference and delay signals to get the corresponding coefficients

$$c\_{\mathbf{r}}[k] = \left\langle s\_{\mathbf{r}}[n] \, \right| \, \left. \psi\_{k}^{\mathbb{C}}[n] \right\rangle\_{n} \, \prime \tag{5.18a}$$

$$\left\langle c\_{\rm d}[k] = \left\langle s\_{\rm d}[n] \right\rangle , \left\langle \psi\_{k}^{\rm C}[n] \right\rangle\_{n} \right. \tag{5.18b}$$

Even though (5.18) implies that the coefficients are calculated via the inner product, the practical implementation is based on the dual-tree presented in Section 5.1.2, as the calculation using multirate filter banks reduces the computational effort significantly. To fully define the transform, both the filter types and the tree nodes must be specified. A preliminary study [151] has shown that, given the measurement signals with the excitation frequency 700 kHz and the sampling rate 50 MHz, the best results can be obtained by using nodes with a level <sup>W</sup> ∈ [5, 6, 7]. Furthermore, the study has shown good results if the filters of the first level (0,LP[]) and the subsequent levels (LP[]) are set to the Daubechies-20 wavelets and the common factor solution of Selesnick with one vanishing moment and an approximation order of five, respectively.

Using the calculated AWPT coefficients of the measurement signals, the features

$$\left|\xi\_{\delta}[k] = \left|c\_{\mathbf{r}}[k] - c\_{\mathbf{d}}[k]\right|\,,\tag{5.19a}$$

$$\xi\_{\phi}[k] = \arg\left\{ \frac{c\_{\rm r}[k] + \epsilon}{c\_{\rm d}[k] + \epsilon} \right\}, 0 < \epsilon \ll 1,\tag{5.19b}$$

$$\left| \xi\_{\rm a}[k] = \frac{1}{2} \left( |c\_{\rm r}[k]| + |c\_{\rm d}[k]| \right) \tag{5.19c} \right|$$

are calculated for each coefficient. Here, the coefficient index represents the time-frequency window—also known as the subspace—of the signal. In other words, after the signals have been transformed into the time-frequency representation, the three calculation rules (5.19) are applied to each time-frequency window denoted by . Note that taking the argument of the quotient of two complex values is equivalent to taking the difference of the arguments. The small constant is added to the quotient to increase numerical stability on one hand and to improve the robustness against noise on the other hand, but it should be chosen well below the expected signal energy level of the target signals. Additionally, note that the third feature (5.19c) is only used during the feature selection and not for the regression, which is why in the following solely the features (5.19a), (5.19b) are examined in more detail.

The motivation to use the mean absolute coefficient (5.19c) is that this feature represents the square root of the signal energy fraction within the respective time-frequency window, which can be used to decide whether the SNR is high enough to allow reliable information extraction. The other two types of features are chosen since they are approximately linearly correlated with the time-delay difference , which can be proven by the insertion of the narrowband assumption into the inner product. As (3.29) shows, the difference signal between the reference and delay signal is proportional to the time-delay difference if is sufficiently small. The projection of this difference signal onto the basis function,

$$\begin{aligned} & \left< s\_{\mathbf{d}}[n] - s\_{\mathbf{r}}[n] \right>, \left. \psi\_{k}^{\mathbb{C}}[n] \right>\_{n} \\ & \approx \left< A\_{\mathbf{t}} \omega\_{0} \tau \cdot \cos \left( \omega\_{0} (t\_{\mathbf{s}} n - \tau\_{\mathbf{a}}) \right) \right>, \left. \psi\_{k}^{\mathbb{C}}[n] \right>\_{n} \\ & \approx A\_{\mathbf{t}} \omega\_{0} \tau \cdot \left< \cos \left( \omega\_{0} (t\_{\mathbf{s}} n - \tau\_{\mathbf{a}}) \right) \right. \\ & \propto \left. \tau \right> \end{aligned} \tag{5.20}$$

is equivalent to the difference between the coefficients, since the transform is linear. Moreover, calculating the difference removes the symmetric interfering signals. The motivation to use the phase-difference feature is based upon the fact that a time delay of the target signal corresponds to a phase shift of the respective complex-valued coefficients. For this relation to hold, the basis function has to be analytic. A proof can be given by applying Parseval's theorem to the inner product (5.18). If the target signal is a pure sinusoidal signal with circular frequency <sup>0</sup> , taking the time-delay difference leads to the phase-shifted coefficients

$$\left\langle s\_t[n \pm f\_s \tau/2], \psi\_k^{\mathbb{C}}[n] \right\rangle\_n \approx \mathrm{e}^{\pm j\omega\_0 \tau/2} \langle s\_t[n], \psi\_k^{\mathbb{C}}[n] \rangle\_n. \tag{5.21}$$

It is easy to see that the two coefficients are out of phase with each other by exactly <sup>0</sup> . The difference of the phases is consequently linear with respect to the time delay . However, it must be noted that, firstly, the target signals are only approximately pure sinusoidal signals and, secondly, the interfering signals have not been taken into account here,

Figure 5.5 Dependency of the two proposed feature types with respect to the time delays of the different measurements. The curves are created by plotting the features (5.19a) and (5.19b) for the subspace with the highest energy over the time delays for data set <sup>D</sup>sim,2. Each point represents one measurement contained in the data set.

which leads to a distortion of both the slope of the linear relationship and the linearity itself when working with real measurement data.

A visual example of the relationship between the features and the time delay is shown in Figure 5.5. Here, the absolute- and phase-difference features of the subspace with the largest energy are shown for the simulated signals from the data set <sup>D</sup>sim,2. Due to the varying temperature, the absolute time delay <sup>a</sup> also varies, which influences the slope of the straight lines through the varying amplitude of the signals within the time window spanned by the corresponding basis function. Nevertheless, the graphs are in good agreement with the desired linear relationship between the time delay and the features. Furthermore, the linearity can be improved by restricting the temperature variation within the measurement package, which is used for training. It must be taken into account that the regression may then only provide good estimates in the process conditions range in which it was trained. A quantitative study on the generalizability will be given in Section 6.6.5.

Finally, the absolute and phase-difference features for every subspace are concatenated into a single feature vector

$$\boldsymbol{\xi} = \begin{bmatrix} \boldsymbol{\xi}\_{\boldsymbol{\delta}} \\ \boldsymbol{\xi}\_{\boldsymbol{\phi}} \end{bmatrix} \in \mathbb{R}^{2K} \tag{5.22}$$

to enable further analysis in a vector-matrix notation.

# 5.4 Feature selection

According to Section 5.3, the number of features is two times the number of samples in the input signal since the basis transforms of the reference and delay signals yield = coefficients <sup>r</sup> [], <sup>d</sup> [], respectively. Using all features would not result in a stable regression that generalizes well. Furthermore, due to the sparsity of the signals in the AWPT basis, many coefficients are nearly zero, which in turn means that the corresponding features contain little information. Therefore, a feature selection has to be performed before the remaining features are further processed by the regression. This section describes in the first subsection several SotA methods for feature selection, which are implemented and will be tested in the algorithm for the TDE problem. In addition, a new iterative selection method is proposed—this method is characterized by the fact that multiple quality criteria for the features can be combined in an easy manner.

Firstly, a few preparatory definitions are introduced:

Since one feature vector results from one of measurements, the feature vectors from multiple measurements, indexed by , can be grouped into the feature matrix

$$\Xi = [\xi\_1, \dots, \xi\_m, \dots, \xi\_M] = \left(\xi\_{km}\right) \in \mathbb{R}^{2K \times M}.\tag{5.23}$$


$$\begin{array}{ccccc}\Xi\_{\sf s} & = & \mathbf{M} & \cdot & \Xi & . \\ (K\_{\sf s}, M) & (K\_{\sf s}, 2K) & (2K, M) \end{array} \tag{5.24}$$

To this end, the selection mask is constructed from M by concatenating <sup>s</sup> = |M<sup>|</sup> rows, where each row consists of a unit vector T ∈ {0, 1}1× with exactly one 1 at the -th position. This also

ensures that there is no more than one 1 per column to avoid a multiple selection of the same features. The resulting feature matrix <sup>s</sup> contains the remaining features after the selection.


#### 5.4.1 State-of-the-art feature selection methods

Feature selection methods can be divided into three categories: filter methods, wrapper methods, and embedded methods [39]. In this thesis, representatives of all three types are presented and implemented to test their performance in feature-based regression of time delays. An overview of the evaluated methods with the corresponding references is given in Table 5.1. A more detailed description of the individual methods follows in the remainder of this subsection.

Without modifications, the presented SotA feature selection methods perform the IFS. To adapt these methods to the SS, the quality measures of both features per subspace are averaged and a new threshold or top-N strategy is used to decide which subspaces to select. However, in addition to the constraints imposed on the selection, the averaging approach can lead to worse selection performances, due to the nonlinearity of some quality measures. Moreover, some methods can only be used with the top-N strategy.

#### Filter methods

The filter methods are characterized by the fact that each feature is evaluated individually by a quality measure. A selection can then be obtained by setting a threshold or by taking the features with the <sup>s</sup> highest quality measures. In this context, the threshold or the number of features to select is a hyperparameter of the selection method. Since the subsequently used regression model is not taken into account in the evaluation, the computational effort is generally lower than for wrapper and embedded methods. In this thesis, four filter methods are investigated for their applicability to feature-based regression of time-delay differences: the Pearson correlation coefficient (PCC), the F-test, the Laplacian score, and the RReliefF algorithm. Except for the Laplacian score, all these methods are supervised, i.e., the labels of the training set are used in the selection algorithm.

The PCC—more precisely, the sample PCC—is the covariance between two variables normalized by their multiplied standard deviations. In other words, it is a measure of correlation between two quantities. Therefore, the PCC [98]

$$r\_{\xi\tau}[k] = \sum\_{m=1}^{M} \frac{(\xi\_{km} - \overline{\xi}\_k)(\tau\_m - \overline{\tau})}{\tilde{\sigma}\_m \{\xi\_{km}\} \cdot \tilde{\sigma}\_m \{\tau\_m\}} \tag{5.25}$$


Table 5.1 Overview of the used feature selection methods.

is well suited to be a quality measure that ranks the importance of the features. In (5.25), the operator ̃{⋅} denotes the sample standard deviation with regard to and , denote the average feature and time delay over all measurements, respectively.

Another similar measure is given by the -test, where a test statistic, which follows an -distribution, is set up for each individual feature. In this selection method, a large -value of the test statistic represents an important feature. According to Pirbazari et al. [94] the -value for each feature,

$$J\_{\rm F}[k] = \frac{r\_{\xi\tau}^2[k]}{1 - r\_{\xi\tau}^2[k]} \cdot (M - 2) \, , \tag{5.26}$$

can be calculated based on its respective PCC [] and the number of samples . If desired, the -value can be derived from the cumulative distribution function of the -distribution at <sup>F</sup> [].

The only unsupervised method in the selection methods under investigation, the Laplacian score, is based on a k-nearest neighbors (k-NN) similarity graph. Basically, for each feature vector, the k-NN feature vectors in terms of the Euclidean distance are determined. Subsequently, the pairwise Euclidean distances ∥ − ∥ 2 2 are inserted into a Gaussian kernel function to create the similarity matrix = () ∈ ℝ×, with = exp(− ∥ − ∥ 2 2 ). The entries in the similarity matrix that correspond to unconnected nodes are set to zero. Finally, given the similarity matrix, the Laplacian scores are defined as

$$J\_{\mathbf{L}}[k] = 1 - \frac{\sum\_{i=1}^{M} \sum\_{j=1}^{M} s\_{ij} \tilde{\xi}\_{ki} \tilde{\xi}\_{kj}}{\sum\_{m=1}^{M} \mathbf{e}\_{m}^{\mathrm{T}} \mathbf{S} \mathbf{1}\_{M1} \cdot \tilde{\xi}\_{km}^{2}},\tag{5.27}$$

with the centered features ̃ . It has to be noted that a weighted arithmetic mean is used to center the features, where the weights are the row sums of the similarity matrix. A detailed description of Laplacian scores can be found in the paper by He et al. [44].

The last filter method, the RReliefF (regressional ReliefF) algorithm proposed by Robnik-Šikonja and Kononenko [97], is an extension of the original Relief algorithm to regression problems. It is based on rewarding features whose values differ when the labels are different and punishing features whose values differ when the labels are similar. For each feature, the following steps are performed to get the quality measure. Firstly, a random observation is selected, and the k-NN in terms of the Euclidean distances of the feature vectors are determined. Secondly, the rewards and penalties of these k-NN are calculated and weighted by their respective Euclidean distance to update the quality measure. Lastly, these steps are repeated for further observations resulting in the final quality measure for each feature. More details about the algorithm can be found in the publication of Robnik-Šikonja and Kononenko [97].

By specifying the constraints of the selection—IFS or SS—and a threshold value or the number of features to select <sup>s</sup> , the selection mask M and its matrix counterpart are obtained. In the case of SS, the selection matrix has a block diagonal matrix shape, i.e., the second half of the lower rows with the row indices s/2 + 1 … <sup>s</sup> is identical to the first half of the upper rows with the indices 1 … <sup>s</sup> . This is due to the concatenation of the features to a total feature vector.

#### Wrapper methods

Contrary to the filter methods, where the features are evaluated individually without considering the regression model, wrapper methods iteratively select subsets of the features and train the regression model. Consequently, the subset of features resulting in the least regression error is returned as optimal feature selection. However, since the amount of all subsets is the same as the cardinality of the power set, evaluating all subsets would require too much computational effort. Therefore, a greedy approach is usually used, where the features either are added or removed incrementally or a combination of both. In the literature, these approaches are called *forward selection*, *backward elimination* or *step-wise selection* [62]. Due to the greedy nature, the selection in each step is always only locally optimal and the final selection does not represent the globally best solution. Furthermore, the chances of overfitting are high.

As an example, the *forward selection* applied in this thesis is briefly described here. The algorithm starts with an empty set of selected features. Subsequently, for each feature, a simple regression model with only one independent variable is trained. The feature with the best regression

performance in terms of MSE or other performance criteria such as 2 is added to the selected feature subset. These steps are repeated, where the trained regression model has an incrementally increasing number of independent variables, until a preset number of features is selected. Even if the number of regression models to be trained is greatly reduced in this way compared to the evaluation of all possible subsets, the computational effort is usually still significantly greater than with filter methods, especially when a large number of features is to be selected. Due to the linear relationship of the features with the response variable (the time-delay difference), the linear regression model is chosen for the implementation of the sequential forward selection.

Since the selection procedure is iterative and differs from the qualitymeasure-based filter methods, the SS cannot be obtained by averaging the quality measures. Therefore, the wrapper method is performed separately for both feature types and then, via a union and intersection of both selection sets, a subspace ranking is generated from which the subspaces are then selected via the top-N strategy. Threshold-based selection is not possible by means of wrapper methods.

#### Embedded methods

The last group of selection methods, the embedded methods, perform the regression training with a built-in feature selection method. From a computational point of view, they rank between the filter and wrapper methods. Again, the regression model is taken into account jointly with the feature selection, but it is also possible to specify a model for the feature selection and then discard it and use a different model for the actual regression. The three different examined embedded methods are the LASSO, the neighborhood component analysis (NCA), which was adapted to regressions, and the Gaussian process regression (GPR).

Generally, the regression model parameters are penalized if they get too large. This restricts the regression model so that it is less prone to overfitting. A simple approach is the combination of a linear LS regression with a regularization leading to the objective of LASSO [119]

$$\{\hat{\mathbf{a}}, \hat{o}\} = \operatorname\*{arg\,min}\_{o, \mathbf{a}} \left( \frac{1}{M} \sum\_{m=1}^{M} \left( \tau\_m - o - \boldsymbol{\xi}\_m^T \mathbf{a} \right)^2 + \lambda\_\mathcal{L} \left\| \mathbf{a} \right\|\_1 \right) \tag{5.28}$$

where the Lagrange multiplier <sup>L</sup> ensures that ‖‖<sup>1</sup> ≤ for some threshold . Higher Lagrange multipliers reduce the threshold resulting from the optimization, and can, therefore, be used to adjust the method to select fewer features, since more parameters are close to or directly at zero. An optimal Lagrange multiplier can be found by cross-validation, but to allow the number of selected features <sup>s</sup> to be given as a hyperparameter, the highest possible Lagrange multiplier is chosen such that the number of non-zero parameters is still greater than the specified number of features to select <sup>s</sup> . Because the magnitude of the model parameters is dependent on the data set and the Lagrange multiplier, the LASSO-based feature selection can only be made via a top-N strategy. Furthermore, the SS-based feature selection is obtained by averaging the two normalized parameter sets that result when the LASSO is performed once each for the absolute and the phase features.

The second embedded method is the NCA. Originally, it was proposed by Goldberger et al. [40] and applied as a feature selection method for classification by Yang et al. [135]. To make the NCA applicable to regression, the extension presented by Amankwaa-Kyeremeh et al. [2] is used in this thesis. The NCA is a non-parametric algorithm that aims at minimizing the average leave-one-out performance by learning the feature weights , i.e., the average quality, if the label of a feature vector is predicted by consensus of its k-NN, is maximized. Due to the length of the derivation and since the approach is not the focus of this work, only the objective function is stated here. The NCA finds the feature weights by the optimization [135]

$$\hat{\mathbf{a}} = \operatorname\*{arg\,min}\_{\mathbf{a} \in \mathbb{R}^{2K}} \frac{1}{M} \sum\_{m=1}^{M} \sum\_{j=1}^{M} \left( p\_{mj}(\mathbf{a}) \cdot (\tau\_m - \tau\_j)^2 \right) + \lambda\_\mathcal{L} \left\| \mathbf{a} \right\|\_1 \,\prime \tag{5.29}$$

where () is the probability that the feature vector is chosen as the reference point for the label prediction of the feature vector . For ≠ , it is calculated by

$$p\_{mj}(\mathbf{a}) = \frac{\exp\left(-\sigma^{-1}\sum\_{k=1}^{2K} a\_k^2 |\xi\_{km} - \xi\_{kj}|\right)}{\sum\_{j=1, j\neq m}^{M} \exp\left(-\sigma^{-1}\sum\_{k=1}^{2K} a\_k^2 |\xi\_{km} - \xi\_{kj}|\right)}.\tag{5.30}$$

The weights ̂resulting from the optimization (5.29)represent the quality measures and can again be used to select the features by thresholding or by the top-N strategy. The SS-based selection is implemented in the same way as for the LASSO method.

Finally, the last embedded method used is the GPR. In order to fit the Gaussian process to the training feature vectors and the corresponding labels, three practical assumptions are made: stationarity of the variance, the zero expectation value for all features, and a squared exponential covariance function with feature-specific variances as kernel [96]. The learned length scales of the automatic relevance detection kernel are used as quality measures of the respective features.

#### 5.4.2 Iterative subspace selection

In this section, a novel iterative method for SS-based feature selection is presented, which is motivated by the fact that the information content in a subspace affects both feature types. The new method—in the following called the iterative subspace selection (ISS)—combines several problemadapted quality measures in an iterative manner. The presented method uses only unsupervised quality measures and, thus, allows for a feature selection without ground-truth labels.

Firstly, the quality measures are presented. Due to the design of linear features and the transform into a basis where the signals are sparse, three objectives stand out: a good SNR, as linear as possible features without distortions, and high sensitivity. Therefore, the three quality measures

$$J\_i[k] = \begin{cases} \sum\_{m=1}^{M} \frac{(\xi\_{\phi}[k,m] - \widetilde{\xi}\_{\phi}[k])(\xi\_{\delta}[k,m] - \widetilde{\xi}\_{\delta}[k])}{\widetilde{\sigma}\_m \{\xi\_{\phi}[k,m]\} \cdot \widetilde{\sigma}\_m \{\xi\_{\delta}[k,m] \}} & i = 1 \\\\ \frac{1}{M} \sum\_{m=1}^{M} \xi\_{\mathbf{a}}[k,m] & i = 2 \\\\ \widetilde{\sigma}\_m^2 \left\{ \xi\_{\phi}[k,m] \right\} & i = 3 \end{cases} \tag{5.31}$$

are introduced, with the features averaged over the measurements [], [] and the sample variance ̃ 2 {⋅}. Note that even though the mean absolute coefficients <sup>a</sup> [, ] (see (5.19c)) are not used as features for regression, they can be well used here to select subspaces with high SNR. In order to keep the indices of the features short, the feature iteration

indices are denoted in brackets and not in matrix notation. The first quality measure <sup>1</sup> [] is the PCC between the absolute and phase-difference features and , where the measurements denoted by the index are considered the observations. Due to the missing labels in unsupervised approaches, the linearity between the features and the labels cannot be assessed directly by calculating the PCC between the features and the labels. However, the distortions that can occur in the features, if interfering signals are present, differ in nature for the absolute and phase-difference feature types, since the absolute value follows a root function and the phase follows an arctangent. Due to the suppression of interfering signals in the absolute-difference features and their linear correlation with the labels, it is assumed that a strong linearity between and also indicates a good corresponding subspace. The evaluations of the ISS later in Section 6.6.2 will show the usability of this approach. The second quality measure <sup>2</sup> [] represents the average absolute coefficient values of a subspace over all measurement signals. Due to the orthogonality of the AWPT, the absolute value of a coefficient can be interpreted as the root of the signal energy in the corresponding subspace. Therefore, this quality measure is ideal to assess the SNR, since the noise is considered to be uniformly distributed over all subspaces due to the assumption as AWGN. Lastly, the third quality measure <sup>3</sup> [] is the variance of the phase-difference signal. This measure has been chosen because the phase of the coefficients is independent of the signal energy and a high value indicates a high sensitivity of the subspace to the measurement effect. However, the variance of the phase-difference features is very high in subspaces with low SNR. This and the strong sensitivity to signal distortions make this quality measure the least robust.

Figure 5.6 shows an example of the three quality measures resulting from the measurement data set used in Section 3.3.1 for the identification of the measurement system. In order to visually improve the quality maps, they are displayed in logarithmic scale. Furthermore, the subspaces are ordered according to their time-frequency representation. From this example, several conclusions can be drawn, which emphasize the presumed properties.

Firstly, the inter-feature PCC (Fig. 5.6 on the left) has a hard edge, beginning exactly at the absolute time delay <sup>a</sup> at which the target signals

Figure 5.6 Normalized quality maps for time-frequency coefficients (in logarithmic scale). Left: PCC between phase-shift feature and absolute-difference feature. Middle: averaged absolute coefficient values. Right: variance of the phase-shift feature.

of the direct propagation path are expected. In the time-frequency representation, another hard edge can be observed at about 12 MHz, which is caused by the constant in the phase-difference calculation (5.19b). This can be explained by the fact, that above 12 MHz the coefficient values in the subspaces are lower than the constant, which means that no correlation can be measured anymore and the PCC tends to zero. However, apart from this, a distinction between the quality of the subspaces contained within this boundary is hardly possible. The remaining speckle noise is a display effect resulting from the use of the logarithmic scale. The absolute values (Fig. 5.6 in the middle) indicate multiple higher-order harmonics, resulting from the frequency responses of the basis functions. Additionally, at about 30 µs—the time delay of the interfering signals due to the Lamb waves—an increase of the coefficient energy can be observed. Lastly, the variance of the phase-difference feature (see Fig. 5.6 on the right) shows good agreement with the variance becoming high exactly when a target signal is present, e.g., between 60 µs and 80 µs, or when only noise is present, e.g., in the time range 0 − 30 µs.

In order to combine the advantages of the different quality measures and solve the robustness problem of the variance measure ̃ 2 { [, ]}, the ISS is proposed. Compared to a weighted mean, the iterative approach allows more flexibility such as combining the top-N strategy with a thresholding. Moreover, the approach is generally easier to parameterize. As the name implies, the quality measures are used iteratively to select a subset M() of the subspaces in each iteration step, which then gets

further reduced in the next iteration. In total, there are three iteration steps specified by the rule

$$\mathcal{M}^{(1)} = \{k \in \mathbb{N} \, | \, 1 \le k \le K\}\,\,\,. \tag{5.32a}$$

$$\mathcal{M}^{(2)} = \left\{ k \in \mathcal{M}^{(1)} \, | \, J\_1[k] > \gamma\_1 \right\} \,\, \,\tag{5.32b}$$

$$\mathcal{M}^{(3)} = \left\{ k \in \mathcal{M}^{(2)} \, | \, J\_2[k] > \gamma\_2 \right\} \, , \tag{5.32c}$$

$$\mathcal{M}^{(4)} = \left\{ k \in \mathcal{M}^{(3)} \,|\, J\_3[k] \in \mathrm{Top}\_{K\_\ast} \left( \{ J\_3[k] \} \_{k \in \mathcal{M}^{(3)}} \right) \right\}.\tag{5.32d}$$

The first selection is based on the most robust but least discriminating quality measure, the inter-feature PCC. The necessary threshold parameter <sup>1</sup> for this can be calculated by sorting the quality measures and taking the point of the strongest curvature—the same approach as for the automatic threshold estimation in Section 4.4. After this step, all signal subspaces containing only interfering signals are removed. The next selection step aims to separate the subspaces with low SNR. To this end, the VISU shrink threshold for denoising proposed by Donoho and Johnstone [27] is applied to identify all subspaces that are considered as noise. The AWGN standard deviation is estimated by the median absolute deviation of a measurement signal with only noise contained. In practice, this can be realized either by recording a signal without excitation or by observing subspaces known to contain only noise, such as the beginning of the measurement signals. Finally, the last step sorts the remaining subspaces and takes the top <sup>s</sup> subspaces. Since subspaces with only noise or interfering signals are already removed, the robustness problem of this quality measure is mitigated. It is assumed that any remaining interfering signal components in the subspaces are reduced due to the feature design or can cancel out by using multiple subspaces. Nevertheless, this is largely dependent on the number of features <sup>s</sup> to select, which is at the same time the hyperparameter to choose most carefully.

A visual example of the iterative selection process is given in Fig. 5.7. A reduction from the initial 7500 subspaces—determined by the 7500 discrete time samples—to the final 50 subspaces is shown from left to right. The negative PCC values of the first quality measure are a numeric effect and can be considered physically illogical. They are therefore removed by the threshold value <sup>1</sup> , which has been found by the automatic algorithm here to be approximately 0.95. The final time-frequency mask shows

Figure 5.7 Iterative selection of coefficients. From left to right: linearity between features, absolute value of coefficients, variance of phase-shift feature. Top row: quality maps. Middle row: sorted and squeezed quality map. Bottom row: resulting mask after selection step.

are good agreement with the measurement signal composition, since the commonly selected frequency is near to the excitation frequency (700 kHz) and three accumulation points can be observed in the time axis, which can be well explained by the three propagation paths of the ultrasonic signals.

Since only subspaces are selected by the ISS, a post-processing is necessary to get the selection mask . The final selection mask M(4) resulting from the ISS may be denoted in matrix notation as follows:

$$\tilde{\mathbf{M}} = [\mathbf{e}\_1, \dots, \mathbf{e}\_{K\_\varkappa/2}]^T \in \{0, 1\}^{\frac{K\_\varkappa}{2} \times K}. \tag{5.33}$$

Because the ISS selects from subspaces and the total number of selected features <sup>s</sup> should be consistent with the other selection approaches, the dimensionality is (<sup>s</sup> /2, ). As already mentioned, the entire selection mask is obtained by taking both feature types for the selected subspaces resulting in

$$\mathbf{M} = \begin{bmatrix} \tilde{\mathbf{M}} & \mathbf{0} \\ \mathbf{0} & \tilde{\mathbf{M}} \end{bmatrix} \in \{0, 1\}^{K\_s \times 2K}. \tag{5.34}$$

## 5.5 Robust time-delay difference estimation

The last component of the FRM is the regression model. As the features are designed to be linear with respect to the time-delay differences, linear regression models are chosen. Here, the features selected with the procedures described in the previous section represent the input variables. In addition, just as with the selection procedures, another alternative model, the GPR, is introduced. Subsequently, different training strategies to determine the parameters of the regression models are derived. Finally, an optimization-based approach is introduced to obtain the corresponding labels in order to use the FRM in an unsupervised fashion.

#### 5.5.1 Regression model

In regression problems, a high-dimensional input vector, which is also known as independent variables or predictors, has to be mapped to the labels such that the estimation error is minimal. In the present problem, the input vector corresponds to the feature vector and the labels are the time-delay differences. Let the feature vector s, ∈ ℝ<sup>s</sup> be the vector that results from the feature selection of the -th measurement. Then the three regression models

$$
\hat{\tau}\_m = \mathbf{f}\_{\mathbf{s},m}^T \mathbf{a} + o\_{\prime} \tag{5.35a}
$$

$$
\hat{\boldsymbol{\tau}}\_m = \boldsymbol{\xi}\_{s,m}^T \mathbf{a}\_{\prime} \tag{5.35b}
$$

$$
\hat{\tau}\_m = \mathcal{G}\mathcal{P}\mathcal{R}\{\xi\_{s,m}; \Xi\_s, \tau\}\tag{5.35c}
$$

allow an estimation of the time-delay difference of the -th measurement. While the former two models are both linear—the only difference

being the offset—the latter is a more complex model that can be best described as using the expectation value of a multivariate Gaussian distribution for the estimation. The Gaussian distribution is thereby determined by the given feature matrix <sup>s</sup> and the labels of the training data set.

Physically, the model (5.35b) makes the most sense, since the features are designed to be zero for = 0. However, this is only valid in the case without noise. Especially, the absolute-difference feature shows that even if the target signals and the interfering signals are suppressed due to the difference operation, the feature still contains the signal energies of the noise signals <sup>r</sup> (), <sup>d</sup> () projected onto the respective subspace. Even though only subspaces with a high SNR should be selected, they still contain noise because the noise signals are distributed uniformly across all subspaces. Therefore, adding an offset may improve the overall performance across the entire range of possible time-delay differences. Nevertheless, the linear models are based on the assumption of linearly correlated features, which in turn depend on the assumption that ≪ 1/<sup>0</sup> and on the neglect of noise. In summary, the linear models may be too restrictive and are expected to perform better in the center of the time delay range covered by the training data than near the boundaries, but at the same time due to the restrictive model the risk of overfitting of the regression is reduced to a minimum.

Since the linear models may be too restrictive, the GPR is also investigated and compared to the linear models. It has more degrees of freedom and also allows for an embedded feature importance ranking—called the automatic relevance detection—if the different features are weighted differently in the kernel function. However, training a GPR requires much more computational effort, which is why the model is combined with a preceding feature selection.

#### 5.5.2 Training strategies

The parameters of the used regression models have to be estimated during training. For the linear models, three parameterization methods are proposed, all based on a variant of

$$\begin{array}{rcl}\boldsymbol{\tau} &=& \boldsymbol{\Xi}\_{\sf s}^{\sf T} \cdot \mathbf{a} & +o \cdot \mathbf{1}\_{M1} + \underset{(M,1)}{\mathbf{n}} \ . \\\ (M,1) & (M,K\_{\sf s}) & (K\_{\sf s},1) & (M,1) \end{array} \tag{5.36}$$

The first approach is to simply solve (5.36) globally via the LS estimator

$$
\begin{bmatrix} \mathbf{a} \\ o \end{bmatrix} = \begin{bmatrix} \boldsymbol{\Xi}\_{\text{s}} \boldsymbol{\Xi}\_{\text{s}}^{\text{T}} & \boldsymbol{\Xi}\_{\text{s}} \mathbf{1}\_{M1} \\ \boldsymbol{\mathbf{1}}\_{M1}^{\text{T}} \boldsymbol{\Xi}\_{\text{s}}^{\text{T}} & M \end{bmatrix}^{-1} \begin{bmatrix} \boldsymbol{\Xi}\_{\text{s}} \\ \boldsymbol{\mathbf{1}}\_{M1}^{\text{T}} \end{bmatrix} \cdot \boldsymbol{\tau}. \tag{5.37}
$$

It follows, of course, that this method of parameter estimation has the lowest mean squared deviations on the training data set. In the second approach, the auxiliary variables ̃ and ̃ are introduced, which represent the slopes and the offsets if each feature is considered individually. Subsequently, the individual features s, are mapped onto the time-delay differences domain by

$$\tilde{\mathbf{T}} = (\tilde{\tau}\_{km})\_{k=1\dots K\_s, m=1\dots M} \text{ with } \tilde{\tau}\_{km} = \tilde{a}\_k \cdot \xi\_{s,km} + \tilde{o}\_k \,. \tag{5.38}$$

Finally, the individual slopes ̃ and offsets ̃ are weighted to get the final model parameters

$$\mathbf{a} = \mathbf{w} \odot \tilde{\mathbf{a}}\tag{5.39a}$$

$$
\rho = \mathbf{w}^T \tilde{\mathbf{o}}\_{\prime} \tag{5.39b}
$$

using the Hadamard product ⊙ and the weight vector = 1/ ‖<sup>1</sup> ‖ 1 , with <sup>1</sup> ∈ ℝ<sup>s</sup> being the first eigenvector of the eigenvalue decomposition of the covariance matrix cov() = ̃ <sup>T</sup>. This is motivated by the fact that the PCA finds the direction in which the most variance of the data can be observed. Therefore, less important features are automatically given a lower weighting factor. The last open question is how to find the individual slopes and offsets. One approach is once again the LS estimation, when only the observations of a single feature, denoted by the feature row vector T s, <sup>∈</sup> <sup>ℝ</sup>1×, are considered. This leads to <sup>s</sup> separate LS estimations

$$
\begin{bmatrix} \tilde{a}\_k \\ \tilde{o}\_k \end{bmatrix} = \begin{bmatrix} \mathbf{\tilde{\xi}}\_{s,k}^T \mathbf{\xi}\_{s,k} & \mathbf{\xi}\_{s,k}^T \mathbf{1}\_{M1} \\ \mathbf{1}\_{M1}^T \mathbf{\xi}\_{s,k} & M \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{\tilde{\xi}}\_{s,k}^T \\ \mathbf{1}\_{M1}^T \end{bmatrix} \cdot \boldsymbol{\tau}. \tag{5.40}
$$

Alternatively, the slopes and the offsets can be estimated using the median-based approach

$$\tilde{a}\_k = \text{median}\_m \left\{ \frac{\tau\_m - \overline{\tau}}{\xi\_{s,km} - \overline{\xi}\_s[k]} \right\} \; \prime \tag{5.41a}$$

$$
\tilde{o}\_k = \overline{\tau} - \tilde{a}\_k \cdot \overline{\xi}\_{s,k} \,\prime \tag{5.41b}
$$

which is more robust to outliers than the LS-based estimation. In (5.41), the average of the time-delay differences and the features over the measurements are denoted by and s,, respectively. Both the median- and the LS-based estimation of the slopes ̃ can also be adapted to the more restricted model without offset (5.35b). To do this, simply remove from equation (5.40) the columns and rows containing the all-one matrix 1 and from (5.41) the average values , s,. In the following, these weighted regression training strategies are abbreviated as the weighted least squares (WLS) regression and the weighted median (WM) regression.

Compared to the linear models, the GPR model (5.35c) is harder to train. The parameters that have to be estimated are the kernel function and the noise variance. As fitting a Gaussian process to training data is not the focus of this thesis, only a short description of the assumptions and the optimization are given here. For a more detailed explanation, see the work of Rasmussen and Williams [96]. Firstly, no prior is given, i.e., the expectation values of the weights in the Bayesian linear regression model are set to zero and the variance is assumed to be stationary—this is also known as simple kriging. Secondly, the squared exponential kernel is used as the covariance kernel function. Based on these assumptions, the log marginal likelihood is minimized using a quasi-Newton method with the Symmetric Rank 1 method to approximate the Hessian matrix. Cross-validation is not used as the training takes several hours for only one run.

#### 5.5.3 Unsupervised time delay estimation

In practical applications, the requirement of having a training data set with corresponding labels is a hard constraint that severely limits the

applicability of the previously discussed methods. However, there is no known method to train a regression model without access to the labels of the training data set. Therefore, another approach for the TDE, which is based on the optimization of an extended AMDF, is presented. The aim is to estimate the time-delay differences solely from the selected feature matrix <sup>s</sup> , which is why, in principle, this approach could be used without the FRM. Nevertheless, the approach presented in the following is only used to generate the labels to the training data set, which are subsequently used for the training of the regression models. There are two main reasons for this. First, solving the optimization problem package-wise limits the dynamics of the measurement system in an online—also called real-time—application and requires much more computational effort than the simpler linear regression. Second, while, through the regression-based method, the TDE can be performed on a single measurement consisting of a delay and reference signal, the optimization approach only allows the estimation of multiple time-delay differences for an entire measurement package.

In the following, the objective function to minimize is derived. The idea is to get an estimate of the target signals from both the reference and the delay signals

$$
\tilde{s}\_{\mathbf{t}}^{\mathbf{d}}(nt\_{\mathbf{s}}) = s\_{\mathbf{d}}(nt\_{\mathbf{s}} + \tilde{\tau}\_{m}/2; m) - \tilde{s}\_{\mathbf{i}}(nt\_{\mathbf{s}} + \tilde{\tau}\_{m}/2) \, , \tag{5.42a}
$$

$$
\tilde{s}\_{\mathbf{t}}^{\mathbf{r}}(nt\_{\mathbf{s}}) = s\_{\mathbf{r}}(nt\_{\mathbf{s}} - \tilde{\tau}\_{m}/2; m) - \tilde{s}\_{\mathbf{i}}(nt\_{\mathbf{s}} - \tilde{\tau}\_{m}/2) \, , \tag{5.42b}
$$

which are obtained by removing the estimated interfering signals ̃<sup>i</sup> (<sup>s</sup> ) followed by a time shift with the estimated time-delay difference ̃ . As these two estimated target signals should be identical, if the interfering signals and the time-delay differences are estimated correctly, the objective function to minimize can be set to

$$J\left(\tilde{\tau}\_m, \tilde{s}\_\mathbf{i}(nt\_\mathbf{s})\right) = \sum\_{n=0}^{N-1} \left| \tilde{s}\_\mathbf{i}^\mathbf{d}(nt\_\mathbf{s}) - \tilde{s}\_\mathbf{i}^\mathbf{r}(nt\_\mathbf{s}) \right| , \tag{5.43}$$

which can be interpreted as the time-discrete AMDF with the input signals freed from interfering signals beforehand (cf. (2.9)). However, using the objective (5.43) has several drawbacks. As the signal model (3.25) has already shown, the time-delay difference differs for different time

ranges, i.e., the model is not globally valid. Furthermore, the frequency characteristics of the target signals are not taken into account and the optimization is too high-dimensional to converge due to each interfering signals sample ̃<sup>i</sup> [] being a parameter to optimize. These drawbacks are alleviated by the adaption that the objective is solved in the subspace spanned by the best basis function ℂ []. Thus the objective function is reformulated as

$$J(\tilde{\tau}\_m, \tilde{s}\_\mathbf{i}[n]) = \left| \sum\_{n=0}^{N-1} \tilde{s}\_\mathbf{i}^\mathbf{d}[n] \psi\_k^\mathbf{C}[n] - \tilde{s}\_\mathbf{i}^\mathbf{r}[n] \psi\_k^\mathbf{C}[n] \right|$$

$$= \left| \left< \tilde{s}\_\mathbf{i}^\mathbf{d}[n] \right>, \psi\_k^\mathbf{C}[n] \right>\_n - \left< \tilde{s}\_\mathbf{i}^\mathbf{r}[n] \right>\_n \left| \right>\_n \Big| \,. \tag{5.44}$$

Similar to before, the inner products of the basis function with the input signals represent the AWPT coefficients <sup>d</sup> [], <sup>r</sup> [] of the respective subspace. Since the basis function has compact support in the timefrequency range, only the frequency and time range of the target signals are considered by the objective function (5.44). Moreover, due to the orthogonality of the AWPT, only interfering signals of the form

$$
\tilde{s}\_{\mathbf{i}}[n] = \tilde{c}\_{\mathbf{i}}^{\text{Re}} \psi\_{\mathbf{k}}^{\text{Re}}[n] + \tilde{c}\_{\mathbf{i}}^{\text{Im}} \psi\_{\mathbf{k}}^{\text{Im}}[n] \tag{5.45}
$$

result in non-zero coefficients if the problem is projected onto the subspace. This reduces the degrees of freedom of the interfering signals to two. In order to estimate the time shift with subsample precision, the inner products are evaluated via Parseval's theorem in the Fourier domain. Inserting (5.42) and (5.45) into the objective function (5.44) and transforming the result into the Fourier domain leads to

$$J(\tilde{\tau}\_m, \tilde{c}\_{\mathbf{i}}^{\text{Re}}, \tilde{c}\_{\mathbf{i}}^{\text{Im}}) = \left| \left< S\_{\text{d}}[n\_k; m] \right>, \Psi\_{\mathbf{k}}^{\text{C}}[n\_k] \, \text{e}^{-j\pi \frac{n\_k}{N t\_x} \tilde{\tau}\_m} \right>\_{n\_k}$$

$$- \left< S\_{\text{r}}[n\_k; m] \, \left. \Psi\_{\mathbf{k}}^{\text{C}}[n\_k] \, \text{e}^{j\pi \frac{n\_k}{N t\_x} \tilde{\tau}\_m} \right>\_{n\_k} \right. \tag{5.46}$$

$$- \tilde{c}\_{\mathbf{i}}^{\text{Re}} k^{\text{Re}}(\tilde{\tau}\_m) - \tilde{c}\_{\mathbf{i}}^{\text{Im}} k^{\text{Im}}(\tilde{\tau}\_m) \right| \, \.$$

with the time-shift-dependent factors

$$k^{\rm Re}(\tilde{\tau}\_m) = \left\langle \Psi\_k^{\rm Re}[n\_k] \right\rangle, \ -2\mathrm{j}\,\Psi\_k^{\rm C}[n\_k] \sin(\pi\frac{n\_k}{N t\_s}\tilde{\tau}\_m) \Big\rangle\_{n\_k},\tag{5.47a}$$

$$k^{\rm Im}(\tilde{\tau}\_m) = \left\langle \Psi\_k^{\rm Im}[n\_k] \right\rangle, -2\mathrm{j}\,\Psi\_k^{\rm C}[n\_k] \,\sin(\pi\frac{n\_k}{N t\_s}\tilde{\tau}\_m) \Big\rangle\_{n\_k}.\tag{5.47b}$$

In these equations, capital letters represent the DFT of the respective signals. A closer look at the objective function shows that three unknown variables have to be solved while only one measurement is available, rendering the problem underdetermined.

Due to this, the objective function for multiple measurements are combined. The stationary-interfering-signals-assumption means that the factors ̃<sup>i</sup> Re , ̃i Im are constant for different measurements—assumed that the process conditions do not vary to a large extent. However, while the interfering signals are constant, the time-delay difference differs from measurement to measurement. A solution to this problem can be obtained from the PCA of the part of the feature matrix <sup>s</sup> that contains only the absolute-difference features , since these features are unaffected by interfering signals. Even though their sensitivity is dependent on the signal amplitude in the time range of the corresponding basis function, the first eigenvector of the PCA <sup>1</sup> = ()=1… is a good approximation of the relative relationships between the time-delay differences of different measurements.

Finally, the derivation of the objective function leads to the threedimensional minimization problem

$$(\hat{\tau}\_0, \hat{c}\_{\mathbf{i}}^{\text{Re}}, \hat{c}\_{\mathbf{i}}^{\text{Im}}) = \operatorname\*{arg\,min}\_{\tilde{\tau}\_0, \tilde{c}\_{\mathbf{i}}^{\text{Re}}, \tilde{c}\_{\mathbf{i}}^{\text{Im}}} \sum\_{m=1}^{M} J\left(\frac{\tilde{\tau}\_0 v\_m}{\|\mathbf{v}\_1\|\_2}, \tilde{c}\_{\mathbf{i}}^{\text{Re}}, \tilde{c}\_{\mathbf{i}}^{\text{Im}}\right),\tag{5.48}$$

which can be solved by the interior point algorithm. Consequently, the labels of the entire training data set can be obtained by ̃ = ̃<sup>0</sup> / ‖<sup>1</sup> ‖ 2 . These labels are then used to train the regression models with the previously presented methods. If the chosen selection method works in an unsupervised fashion and the labels are estimated through the optimization approach, the entire FRM is also unsupervised, rendering this class of TDE methods directly comparable to the SDMs and the SotA TDE methods.

# 6 Experimental Results

The methods presented in the chapters 4 and 5 are based exclusively on the signal model and can be applied to a wide range of ultrasonic systems, where the objective is to get a high-precision estimate of very small time delays and the interfering signals are the main influence that limit the precision. The only constraint is that the assumptions on the stationarity of the interfering signals must hold and the target signals must show a varying time delay, when an entire measurement package is observed.

In order to evaluate the proposed methods on experimental data, the following presents how the methods perform in an UFM setup. To this end, the measurement setup with six different data sets and the evaluation metrics are described in the first three sections. Subsequently, the performances of the SotA methods are evaluated and the methods that perform best on the respective evaluation metrics will be used as baseline for the new methods—SDM and FRM. After the novel methods are evaluated in terms of systematic and stochastic errors for different hyper parameter constellations, the chapter is concluded with an overall discussion and comparison of the different methods.

#### 6.1 Measurement setup

For the UFM system, eight ultrasonic clamp-on transducers (UT1,…,UT8) are connected to a pipe, which was made of 4 mm stainless steel with 8 cm diameter. The transducers are arranged in two circles, i.e., four transducers per circle, with the circles being 6 cm apart in axial direction. Four transducer pairs (TP1,…,TP4), which show a measuring effect, i.e., Δ ≠ 0, can be built with this configuration. In order to obtain a data set with a physical damping of the interfering signals, another identical pipe with the same transducer configuration but an additional damping mat

Figure 6.1 Setup of the UFM system. For better illustration, in axial direction (left) the cross-sectional view is given and in radial direction (middle) only the transducers that build a TP are drawn, i.e., the ultrasonic transducers 3, 4, 7, 8 are not visible in the axial cross section and the ultrasonic transducers 5, 6, 7, 8 are not visible in the radial view.

is constructed. As damping material aluminum butyl—a self-adhesive butyl rubber with an aluminum carrier foil—is used. The transducer and pipe setup with an overview of the dimensions is depicted in Fig. 6.1. For better illustration, only four of the eight transducers are shown in both the axial and radial view. Additionally, possible ultrasonic propagation paths are included in the setup, e.g., the direct path (violet), an axial reflection (dashed green), and a radial reflection (green). Note that the interfering signals propagate through the pipe wall (see the dashed red path in the radial view) and that the direct path's radiation angle into the medium is subjected to Snell's law.

The pipe prepared in this way, also called measurement section, was inserted into a circular flow with a pump to control the flow, a heating element to control the temperature and a reference flow meter to get the ground truth for the VoF. By installing a flow straightener in combination with a sufficient inlet length right before the measurement section, a fully developed turbulent flow profile was achieved. This system is filled with water or ethanol, depending on the data set to be recorded, and any remaining air is removed as best as possible via a bubble separator. Sometimes, especially at the beginning of the experiments, a few remaining stuck bubbles may come loose during the experiment, which distorts the measurement signals due to the scattering of the ultrasonic waves. The measurements recorded during these incidents are removed before the evaluation, which is why sometimes a few data samples are

Figure 6.2 Typical measurement signal with highlighted signal composition.

missing. In terms of temperature, the experiments are set in such a way that the measurements go up to about 30 °C, since the overall system can then be considered safe, even if, for example, ethanol is used as medium. As starting temperature, the lowest possible value is chosen, which is dependent on the room temperature since no cooling element was installed. The exact temperature curve during the experiments depends on the specific heat capacity of the medium, since the heating element was not installed in closed-loop control.

The data acquisition was performed at a sampling rate of <sup>s</sup> = 50 MHz using a preamplifier and a PXIe-1062 station with a PXIe-5171 ADC module and a PXI-5412 DAC module, which also controlled the pump and the heating element. Combining all controls, the excitation, and the recording of the ultrasonic waves into a single system allowed having synchronized measurement readings and reduced jitter. In order to get the ultrasonic recordings of the downstream (reference signal) and upstream (delay signal) direction as simultaneously as possible, and in addition, to perform measurements with multiple TPs under the same process conditions, the excitation and recording order is as follows: UT1/UT2; UT2/UT1; UT3/UT4; UT4/UT3; UT5/UT6; UT6/UT5; UT7/UT8; UT8/UT7. In this notation, the first transducer is operated as sender and the second transducer as receiver, respectively. Since the reverberation of the ultrasound in the pipe should not affect the next

measurement, the time between two consecutive measurements is set to 10 ms even though the absolute recorded time of a single measurement is only 150 µs. An example measurement is depicted in Fig. 6.2, where the estimated composition of the measured signal is highlighted. Note that this highlighted composition is only manually estimated based on the signal envelope, the geometric relationships, and the present time-delay differences. As for all ultrasonic measurements in this thesis, the sending transducer was electrically excited using (3.18), with the center frequency being 700 kHz and the bandwidth being 245 kHz.

# 6.2 Data sets

Six different data sets were recorded to cover a wide range of scenarios. The data sets can be classified according to whether the pump was set to create a constant flow or a varying flow. Furthermore, data sets with two different media—water and ethanol—are recorded, once in an undamped pipe and once in a damped pipe. An overview of the data sets can be found in Table 6.1. The separation of the data sets in constant and varying VoF is important since the FRM requires a variable time-delay difference in the training data set, which is why only data sets with varying VoF are later used to train the FRM. The media influence on one hand the absolute time delay <sup>a</sup> due to their different SoSs and on the other hand the acoustic load of the ultrasonic system, causing the level of the interfering signals to change. The last distinguishing point of the data sets, the damping mat, reduces the interfering signals by a certain factor but also


Table 6.1 Overview of the used data sets and their properties.

(a) Time delay during measurements with varying VoF. (b) Temperature during measurements with varying VoF.

(c) Time delay during measurements with constant VoF. (d) Temperature during measurements with constant VoF.

Figure 6.3 Time delay and temperature process conditions of the recorded data sets.

causes the remaining interfering signals to be no longer stationary since the attenuation of the selected material is temperature-dependent.

Figure 6.3 shows the time-delay differences and the temperature values during the measurements. For the constant flow data sets, the VoF was set to 1 m s−1 and for the varying flow data sets, the three flow velocity levels 0.3 m s−1 , 0.6 m s−1 , 1.2 m s−1 were repeated cyclically. Since the experiments were terminated at about 30 °C and the different media and pump control strategies resulted in temperature rises of different degrees, the number of measurements in the data sets is not consistent. Furthermore, the time-delay difference varies even in the case of constant flow, because varying temperature leads to varying SoS and this, in turn, affects the time-delay difference, cf. Eq. (3.26b). Especially for the medium

ethanol (D3c, <sup>D</sup>3v), the temperature increases rapidly due to the lower specific heat capacity, which is why the data sets for ethanol contain only less than 1500 measurements, while the data sets for water contain more than 2000 measurements. Also the time-delay differences of the ethanol data set are generally larger due to the smaller SoS.

Contrary to the ethanol data sets <sup>D</sup>3c and <sup>D</sup>3v, the water data sets with the damping mat attached to the pipe <sup>D</sup>2c and <sup>D</sup>2v do not differ in terms of process conditions from the water data sets without damping mat. The only difference is the level of interfering signals and the fact that the assumption on their stationarity is not met.

The last point to mention is that for each measurement in the data sets there are ultrasonic measurements from four independent TPs, each of which can be considered as a realization of the stochastic process that determines the shape of the ultrasonic signals. Here, the exact shape of the ultrasonic signals is considered random because they are sensitive to minute variations in material parameters, acoustic coupling, and the geometric arrangement of the transducers. Therefore, the probability to obtain exactly the same ultrasonic signals for two different TPs is nearly zero, even if the same transducers are removed from the pipe once and reattached to the pipe. However, the basic properties of the ultrasonic signals, such as the time of arrival of the target signals or the time-delay difference, are preserved.

Finally, the notation of the data sets is explained: each used data set is completely defined by the notation <sup>D</sup>1c|TP1, where the symbol before the vertical bar denotes which data set and the number after the vertical bar denotes which TP to use. In this example, the first TP of the constant flow water data set in the undamped pipe is meant.

## 6.3 Preliminaries and evaluation metrics

The performances of the proposed TDE methods are evaluated in terms of accuracy, robustness against noise, and possible requirements for training data sets or necessary signal dynamics. For this purpose, several metrics are introduced to describe the performance. Furthermore, the performances are investigated while varying the available signal dynamics, training data sets, and SNR. Before the methods can be compared, a few preliminary tasks have to be solved. In addition to determining the ground truth, this also includes estimating the time ranges to be evaluated for the time-based methods—which essentially include all SotA methods and the SDMs—or characteristic values such as the SNR or the CRLB.

To begin with, the ground truth is to be determined. To this end, a reference measurement system from Endress+Hauser yields the average VoF , which has to be converted into a time-delay difference via reformulation of (3.4) to the time delay . Additionally, the hydraulic correction factor (3.8) has to be multiplied by the average VoF to get the average velocity of the flow along the ultrasonic transmission path. In summary, the ground truth is calculated by

$$\tau = \frac{2k\_{\rm h}(\overline{v}, T) \cdot \overline{v} \cdot L \cdot \cos(\alpha)}{c\_{\rm M}^2(T)} \, \tag{6.1}$$

with the hydraulic correction factor [48]

$$k\_{\rm h}(\overline{v}, T) = 1.125 - 0.0115 \cdot \log\_{10} \left( \text{Re}(\overline{v}, T) \right) \tag{6.2}$$

which in turn depends on the Reynolds number

$$\operatorname{Re}(\overline{v}, T) = \frac{\overline{v} \cdot D \cdot \rho(T)}{\eta(T)}.\tag{6.3}$$

The geometric sizes, such as path length and beam angle in (6.1) are calculated by Snell's law, the transducer placement, and the pipe geometry. The other fluid properties —the SoS, the density () and the dynamic viscosity ()—are determined according to the NIST webbook [67]. Note that since the Reynolds number is Re > 10<sup>4</sup> for all temperatures and all VoFs during the experiments, the flow is always fully turbulent.

Next, the expected effects of the interfering signals on the systematic error of the TDE and a good metric that reflects the interfering signal level, in this thesis called ScNR, are discussed. Once again, the narrowband assumption is used to derive the expected estimation errors. As already shown in the introduction, the additive interfering signals lead to a phasedependent error of the TDE, if the narrowband assumption holds (cf. Fig. 1.4). However, these results are only valid for large variations of the

Figure 6.4 Relative estimation error depending on the absolute time-delay difference for different ScNRs and time-delay differences.

time delay, such as for the absolute time delay in the UFM application. When small time delays are present, such as the time-delay differences in UFM, the effects on the systematic error of the time-delay difference estimate can be described by calculating (1.4) once for the reference and the delay signal and then taking the difference between the results. This is equivalent to calculating the gradient of the curves shown in Fig. 1.4 by the secant approximation. Mathematically, the estimation error can be expressed via the difference of the arctangent functions and simplified by a Taylor series expansion, which is terminated after the linear member, leading to

$$\begin{split} \hat{\tau} - \tau &= \hat{\tau}\_{\text{a}} \left( \frac{\tau}{2}, \tau\_{\text{a}} \right) - \hat{\tau}\_{\text{a}} \left( -\frac{\tau}{2}, \tau\_{\text{a}} \right) - \tau \\ &\approx C \left( \tau\_{\text{a}}, \varphi, \frac{A\_{\text{t}}}{A\_{\text{i}}} \right) \cdot \tau\_{\text{\textdegree}} \end{split} \tag{6.4}$$

with a phase and ScNR-dependent constant and the respective estimated absolute time delays

$$\hat{\tau}\_{\mathbf{a}}\left(\tau,\tau\_{\mathbf{a}}\right) = \frac{1}{\omega\_0} \arctan\left(\frac{A\_{\mathbf{t}}\sin(\omega\_0(\tau\_{\mathbf{a}}+\tau)) + A\_{\mathbf{i}}\sin(\varphi)}{A\_{\mathbf{t}}\cos(\omega\_0(\tau\_{\mathbf{a}}+\tau)) + A\_{\mathbf{i}}\cos(\varphi)}\right) . \tag{6.5}$$

Figure 6.4 shows a numerical evaluation of the estimation error derived in (6.4) for two different ScNRs, each with three different time-delay differences. There are three insights to take from this figure. Firstly, the

different time-delay differences lead to the same relative error, which justifies the use of the percentage error as the evaluation measure. Secondly, the percentage error is dependent on the absolute time delay, which is why the evaluation of a TDE method in UFM may only be done by looking at the percentage errors over all absolute time delays covering one period 1/<sup>0</sup> . Thirdly, even for very small interfering signal levels (ScNR = 40 dB), the estimation error amounts up to 1 %. In summary, when considering the estimation error plot as a function of temperature, an oscillating error is expected, whose amplitude depends on the ScNR, since a varying temperature leads to a varying absolute time delay.

Due to the estimation error being approximately proportional to the ground truth of the time-delay difference, the performance of the TDE methods is evaluated in terms of percentage error for each measurement

$$E[m] = 100\,\% \cdot \frac{\tau\_m - \tau\_m}{\tau\_m} \,. \tag{6.6}$$

Following the argumentation above, to get the performance of a method independently from the current phase between target and interfering signals, the mean absolute percentage error

$$\text{MAPE} = \frac{1}{\tilde{M}} \sum\_{\tau\_{n,m} \in \mathfrak{t}\_{\text{g}}} |E[m]| \tag{6.7}$$

is calculated, with the absolute time delays of the test measurements a, being drawn uniformly from the interval

$$\mathfrak{k}\_{\rm g} = [\tau\_{\rm a,0}, \tau\_{\rm a,0} + 1/f\_0] \, \prime \, \tag{6.8}$$

which contains approximately one period of the signals. In equations (6.7)–(6.8), the measurement with the smallest available absolute time delay in the data set is considered the beginning of the interval a,0 and the number of measurements taken into account for the MAPE is ̃. A special case, that will be later used, is the evaluation of the FRM. Since it is a learning-based method, a distinction should be made between a global quality and a local quality when examining the performance. For the local quality, the MAPE includes only test measurements, whose absolute time delay is contained in the interval

$$\mathfrak{k}\_{\mathbf{t}} = \left[ \min(\tau\_{\mathbf{a},m}^{\mathrm{Train}}) \text{ , } \max(\tau\_{\mathbf{a},m}^{\mathrm{Train}}) \right] \, \mathsf{} \tag{6.9}$$

129

which is spanned by the absolute time delays contained in the used training data.

Additionally to the MAPE, which evaluates the estimation error on the entire test data set regardless of its origin, the standard deviation of the stochastic percentage error (STDSPE) is introduced as a metric to assess the stochastic error decoupled from the systematic error. To this end, after the systematic error is removed via subtraction of the lowpass filtered component, the STDSPE is defined as the sample standard deviation

$$\text{STDSPE} = \tilde{\sigma}\_m \left\{ E[m] - \mathcal{L} \mathcal{P} \{ E[m] \} \right\} \,. \tag{6.10}$$

Since the low-pass filtering should not result in a delay of the estimation error curve and should be robust against outliers, a moving median filter of order 41—where the odd order is advantageous to obtain a symmetric filter—was chosen. If this stochastic error is not expressed as a percentage, a theoretical lower bound, called the Cramér-Rao lower bound (CRLB), can be calculated. Although Carter [15] presented the equation to calculate the CRLB for TDE problems, the PSDs of the noise and target signals must be estimated with sufficient accuracy, which is not possible due to the measurement signals being too short. Therefore, the simplified formula according to Walker and Trahey [124]

$$\text{CRLB} = \sqrt{\frac{1 + 2 \text{SNR}}{\text{SNR}^2} \cdot \frac{3}{2\pi^2 t\_{\text{obs}}} \cdot \frac{1}{12\,\sigma\_\text{f} f\_0^2 + \sigma\_\text{f}^3}}\,\text{}\tag{6.11}$$

which assumes flat spectra with constant SNR and limited bandwidth, is applied to estimate the CRLB. Equation (6.11) is dependent on the observation time obs, the bandwidth <sup>f</sup> , the center frequency <sup>0</sup> and the SNR. Unfortunately, these quantities are also not directly known and must be estimated from the measurement signals. Consequently, the PSDs are approximated via Welch's method using the same hyperparameters as for the investigation of the target signals' spectral characteristics in Chapter 3. Based on the PSDs, the 3 dB bandwidth is obtained, i.e., the frequency range in which the PSD is above half its maximum value. Furthermore, the center frequency is chosen as the midpoint of the 3 dB frequency band. Since the AWGN is distributed over the entire time and frequency range, but the target signals are only present in a small time range and frequency band, the SNR is calculated as the ratio of

the signal energies in the considered time range and frequency band of the target signals. This has the advantage that only the part of the noise that interferes with the target signals is considered since the remaining part can theoretically be removed by filtering. It should be noted that the different methods each use a slightly different observation window and therefore the CRLB must be calculated separately for each method to obtain comparable results.

The last preliminary task is to find the time range of the target signals. This time range is only necessary for the SotA methods and the SDMs, because the training-based FRM implicitly finds its useful time ranges through the feature selection. For simplicity and since the evaluation is based on the time-delay differences, whose ground truth is given by (6.1) only for the diametral path, only the direct propagation will be used. Because the algorithm to detect the time range needs to be very robust, several conditions are imposed on the algorithm to ensure that only meaningful time ranges are returned. However, a complete description of the algorithm would distract from the focus of the work, which is why only the basic approach is discussed here. Further details can be found in [157]. The basic principles of the algorithm can be summarized by the following steps:


Finally, the local maximum, which is closest to the theoretical time delay calculated by the path length and the SoS, is taken to be the wave packet of the direct propagation path. Based on this rough determination, the upper limit of the time range is the next consecutive local minimum of the envelope, as this is considered the time, when the next wave packet starts. Either the local minimum before the determined local maximum, the time at which the envelope falls below a certain threshold or the minimum limit, which depends on the upper limit and the determined local maximum, is selected as the lower limit of the time range. The decision which lower limit to choose is made in such a way that the resulting time range is as small as possible.

# 6.4 State-of-the-art methods

In this section the baseline for the proposed TDE methods is determined. For each data set, the time-delay differences of the direct propagation path are estimated by using the time-based zero-crossing method (ZCM), and four different correlation-like methods, namely the cross-correlation (CC), the generalized cross-correlation (GCC) with the phase transform processor, the AMDF, and the ASDF. Since the time-delay difference has to be estimated with subsample precision and depends on the observed time range, the SotA methods are combined with the time range detection presented in the previous section and with an interpolation. For the interpolation of the zero crossing, the method proposed by Kupnik et al. [63] is applied. In contrast, the CC, the GCC, and the ASDF use a quadratic interpolation around the maximum of the correlation function. Among the correlation-like methods, only the AMDF method is an exception, since here the objective is to find a minimum of the AMDF, whose shape looks like the absolute value of a sine wave around its minimum. Therefore, a double linear interpolation is applied for the AMDF-based subsample TDE.

The results of the ZCM, the CC and the AMDF for the four transducer pairs of data set <sup>D</sup>1c are shown in Fig. 6.5. As already expected, the estimation error shows an oscillation with a period in the range 1− 1.5 µs, which shows a good agreement with the theory as a period duration of

Figure 6.5 Error curves of some selected SotA methods. The respective MAPE is given in brackets. Although the estimation error of all measurement signals contained in the data set <sup>D</sup>1c is shown here, the evaluation metric MAPE was only calculated on the set covering one period cycle according to (6.8).

1/<sup>0</sup> ≈ 1.4 µs was predicted theoretically. The different period durations for the different transducer pairs can be explained by the slightly varying material characteristics of each transducer. Only the local frequency of the target signals and the interfering signals in the evaluated time range are relevant. Therefore, the local frequencies may deviate from the excitation frequency. Furthermore, this figure shows that even the level of the interfering signals may differ from transducer pair to transducer pair, since, e.g., the amplitude of the oscillating estimation error [] for TP4 is significantly smaller than for TP3. Apart from that, all SotA


Table 6.2 MAPE results of the different SotA TDE methods for all data sets.

methods investigated here perform essentially the same with only slight differences. While the ZCM is best in terms of MAPE for two of the transducer pairs, the AMDF and the CC are better for the other two transducer pairs. Normally, the zero-crossing-based TDE would be more noisy, but using the approach of Kupnik et al. [63], the information of a whole period can be taken into account to get a zero crossing that is more robust against noise. For this reason, no obvious difference in the stochastic estimation error can be observed.

The MAPE results of the remaining SotA methods on all data sets are listed in Table 6.2. While the MAPEs of the respective best methods are highlighted in green, the average values over all transducer pairs are printed in bold. The best values of the respective data sets represent the baseline when, in the further course, the results of the SDMs or FRMs are given for individual transducer pairs or are averaged over an entire data set. It is noticeable that the damping mat has significantly reduced the interfering signals, because the MAPE values of the data set <sup>D</sup>2c are significantly lower compared to the data set without damping mat <sup>D</sup>1c.


Table 6.3 STDSPE results of the different SotA TDE methods for all data sets.

Furthermore, the ethanol data set <sup>D</sup>3,c seems to have the highest level of interfering signals, which can be explained by the worse acoustic coupling between the pipe wall and the liquid. Once again large differences can be observed between the different transducer pairs of the same data set. In summary, all SotA methods are in the same order of magnitude, with the GCC method showing slight advantages on average for all data set.

Complementary to the MAPE, which mainly evaluates the systematic error—except in the case when the systematic error is significantly smaller than the stochastic error—the STDSPEs for all data sets as a quality measure for the stochastic errors are given in Table 6.3. Here, it can be seen that, as already suspected from the error curves in Fig. 6.5, the stochastic errors of the methods are very similar, with the AMDF being slightly better than the other methods. Interestingly, the differencebased correlation of the AMDF can handle noise better, but against the interfering signals, it performs rather averagely.

# 6.5 Signal-dynamics method

The SDMs can be separated into the PCA-based, the B-spline(BS)-based and the joint-B-spline(JBS)-based point cloud processing, each of which can be further separated into the direct TDE estimation approach, the local interfering signals compensation and the global interfering signals compensation. The difference between the latter two is that in the local case, the interfering signals are compensated for each measurement package individually, while in the global case, all estimated interfering signals are combined, freed from outliers, and then averaged to get an estimate that is used for the interfering signals compensation of all measurement packages equally. Firstly, the direct TDE methods on a single data set are investigated, examining both the expected properties and the dependency on the process conditions. In the next step, the concept of compensating the interfering signals locally and globally is proven. Furthermore, visual examples of the estimated interfering signals, the outlier detection, and the globally estimated interfering signals are shown. This section concludes by examining how to specify the hyperparameters. To this end, the approaches are applied to the water data sets using all hyperparameter combinations. Then, using the optimized hyperparameter set, the performances of the SDMs are averaged over all data sets and TPs and compared with the GCC method—the best SotA method in terms of MAPE.

As in the rest of the thesis, for the SDMs to be applicable, the measurement signals of the data sets were divided into packages with a signal dynamic of 50 ns, i.e., within the package there are measurement signals whose absolute time delays cover an interval of the width 50 ns. An exception to this is the investigation of the influence of the signal dynamics in Section 6.5.3. Furthermore, the packages were selected from the entire data set with a variable overlap such that they uniformly cover the entire data set.

#### 6.5.1 Direct time delay estimation

In the first evaluation, the direct TDE methods are applied to the water data set with constant flow <sup>D</sup>1,c|TP1 to check whether the approach can also be applied to real data. After the signals were separated into

Figure 6.6 Results of the direct TDE using the SDM for the data set <sup>D</sup>1c|TP1. The signal dynamics of the packages are set to Δ<sup>a</sup> = 50 µs. The median absolute time delay a, of each signal package was distributed uniformly on the whole process condition range.

packages and the relevant time ranges were extracted package-wise by the method described at the end of Sec. 6.3, the time-delay differences are estimated using the PCA-based, the BS-based, and the JBS-based point cloud processing, respectively. While for all methods the same signal dynamics (50 ns) were specified, the overlap for the PCA-based TDE was increased to get a densely sampled error curve comparable to the BS- and JBS-based SDMs. This is necessary since the PCA-based direct TDE can only estimate one average time-delay difference per package. The last hyperparameter to specify is the point cloud generation for the BS- and the JBS-based approaches. According to Equation (4.47), there are two possibilities: either the point clouds of the delay and reference signals are generated and processed separately, or the delay and reference signals are concatenated to get a combined point cloud. In this evaluation, the point clouds were concatenated.

The results are depicted in Fig. 6.6. Although all three methods basically estimate the time-delay difference well, there are slight differences, which can be explained by the different properties. The PCA-based SDM is less noisy. This is to be expected because the time-delay differences are averaged over the whole signal package, which can be interpreted as an individual estimation with subsequent moving average filtering. Furthermore, the joint estimation of the BSs for two consecutive time steps obviously also reduces the stochastic component of the TDE. However,

Figure 6.7 Results of the direct TDE using the SDM for the data set <sup>D</sup>1v|TP1 with varying time delay. The signal dynamics of the packages are set to Δ<sup>a</sup> = 50 µs. The performance of the PCA-based SDM deteriorates significantly because the requirement—a constant time-delay difference—is not met at all.

in terms of systematic error, the BS-based SDM shows better results than the PCA-based and the JBS-based SDMs. The larger systematic error of the PCA-based SDM can be explained by the fact that the constant time delay assumption does not perfectly hold, since the time-delay difference varies even in constant flow scenarios due to the varying SoS (see Fig. 6.3). Contrary to this, the JBS-based approach relies more on the assumption that the interfering signals are stationary, i.e., they do not vary with varying temperature or varying VoF, compared to the BS-based SDM. This requirement is also not quite met, which is why probably the JBS-based SDM shows a slightly higher systematic error.

If the methods are evaluated for a data set with varying flow <sup>D</sup>1,v|TP1, as shown by the results in Fig. 6.7, the situation is quite different. While the spline-based methods still work well, the PCA-based SDM is no longer applicable because the requirement—a constant time-delay difference is not met at all. Contrary to the PCA-based SDM, a visual comparison of the spline-based methods with each other is hardly possible in this presentation form. Only for high absolute time delays, corresponding to low temperatures in the water data sets, where air bubbles are more often present in the experimental setup, the JBS-based approach shows more outliers, but further evaluations are needed for a general statement.

Figure 6.8 Estimated interfering signals for the data set <sup>D</sup>1c|TP1. The different interfering signals result from different signal packages, which were created using a signal dynamic of 50 ns with an adaptive overlap as described in the introduction to Section 6.5.

#### 6.5.2 Interfering signals compensation

In the alternative approach, the interfering signals are first determined and removed, and then the time-delay differences are estimated using a state-of-the-art (SotA) method, instead of estimating the time-delay differences directly. As already mentioned in Section 4.2.3, the ZCM according to Kupnik et al. [63] is used for the subsequent TDE after the interfering signals compensation. In Figure 6.8 all estimated interfering signals are shown that result when the interfering signals are estimated separately for each of the signal packages. Equally to the evaluation of the direct TDE, the data set is <sup>D</sup>1c|TP1, the signal dynamics are set to 50 ns per package, the point clouds are concatenated for the splinebased approaches, and the PCA-based SDM uses the constant amplitude assumption. Although most signals are estimated to be quite similar for all methods, some differences occur that cannot be explained by the nonstationarity of the interfering signals. The JBS-based approach, which is expected to be the most robust method, shows the least variation of the estimated interfering signals. Therefore, it is likely that the reasons for the variation are the noise influence and outliers in the point clouds that deteriorate the individual BS-based shape fit of the point cloud. Furthermore, an incorrect assumption of the signal model, e.g., that the amplitude of the target signals does not depend on the temperature, could be the cause.

Figure 6.9 TDE results using the local interfering signals compensation for the undamped water data sets with constant and varying VoF. Overlapping TDEs between the signal packages are resolved by averaging.

If each signal package is compensated by the respective estimated interfering signals, application of the zero-crossing-based TDE yields the time-delay differences for all signals in the package—this is called the local interfering signals compensation. Any overlapping estimates resulting from the overlap of the signal packages can be resolved by averaging. The results for the data sets <sup>D</sup>1c|TP1, <sup>D</sup>1v|TP1 can be observed in Fig. 6.9(a) and Fig. 6.9(b), respectively. In Fig. 6.9(a) it can be seen that the PCA-based SDM continues to perform well for the constant flow data set. However, the BS-based point cloud processing does not work in combination with the local interfering signals compensation (red curve). The inconsistency at 69.7 µsis caused by a gap in the measurement signals, which results from the automatic outlier removal of signals degraded by bubbles in the experiment. Figure 6.9(b) shows that the PCA is still not comparable to the spline-based methods for varying flow data sets. This is due to the fact that the abrupt changes of the flow lead to noncompact point clouds, which cannot be correctly processed by the PCA. In contrast, the spline-based methods show that they are suitable for data sets with highly varying flow velocity because the curved form of the BSs is flexible enough to fit even point clouds with large gaps.

Figure 6.10 Automatic outlier detection to remove bad estimations of interfering signals from Fig. 6.8. For better visualization, an offset has been added to the results from the PCA and the JBS-based method. Left: the detected outliers are shown in dots. On the right, the final estimated interfering signals can be seen, resulting from averaging after outlier removal.

In the global interfering signals compensation, for each signal package the same interfering signals estimate is used for compensation. This can reduce the effect of bad estimates of the interfering signals, which result from outliers in some of the point clouds. To this end, the different estimates are collected and outlier signals are removed by applying the outlier detection algorithm proposed in Section 4.4. The remaining estimates for the interfering signals are averaged to obtain the globally valid estimate—in the following, using this globally valid estimate will be referred to as global interfering signals compensation. A visual example of this procedure can be observed in Fig. 6.10. On the left side, all estimates for the interfering signals are plotted for each point cloud processing method—an offset has been added for better visualization. While the remaining estimates after the outlier detection are drawn solid, all signals classified as outliers are drawn dotted. It is easy to observe that due to the adaptive histogram the limit that defines which signals are identified as outliers is dependent on the variability of the estimates. In the case of lower variability, such as for the JBS-based SDM, even small deviations lead to a classification as outlier, while a higher overall variability, such as for the PCA- or BS-based SDM, does not classify signals as outliers even with larger deviations. The averaged results are then depicted on the

Figure 6.11 TDE results of the global interfering signals compensation for the undamped water data sets with constant and varying VoF, resulting if all measurement signals are compensated by subtracting the same averaged interfering signal.

right side. It shows that the resulting averages of the BS- and JBS-based SDM are quite close, leading to the expectation that the subsequent TDE based on this estimate will perform similarly.

Application of the global interfering signals compensation to the TDE for the water data sets <sup>D</sup>1c|TP1, <sup>D</sup>1v|TP1 yields the estimation errors depicted in Fig. 6.11. Through the usage of a global estimate for the interfering signals, the SDM now produces similar results for the constant and varying flow data sets. Only slightly increased stochastic components at measurements with low time-delay differences can be observed (see the small ripples in Fig. 6.11(b)). Moreover, the expected similar behavior of the BS- and the JBS-based SDMs is confirmed. An important point to notice is that even a small phase shift in the estimated noise signals (Fig. 6.10 on the right) can lead to large estimation errors. This is shown by the larger errors of the PCA-based SDM in Fig. 6.11, whereas the interfering signals estimate exhibited by the PCA-based SDM is only slightly phase-shifted compared to the BS- and JBS-based SDM.

In summary, each of the SDMs has its own advantages and disadvantages. The PCA-based point cloud processing, regardless of whether direct TDE, local or global compensation is used, can only be applied with sufficient estimation quality to constant flow data sets. For data

sets with varying flow velocity, the spline-based methods are generally better. In addition, the spline-based SDM using the global interfering signals compensation has been shown to have the lowest estimation error, regardless of whether the method was applied to a constant or a varying flow data set.

### 6.5.3 Hyperparameter optimization and overall results

The previous two sections only represent a proof-of-concept of the SDMs with a standard set of hyperparameters using the water data sets as an example. While the results could be used to make simple statements about the different approaches, they were neither evaluated quantitatively, nor tested on all data sets with all hyperparameter combinations, nor compared to the SotA methods. Therefore, in this section, the SDMs are firstly tested for all hyperparameter combinations, where a focus lies on the available signal dynamics and their interaction with the available SNR. Finally, all SDMs in their best hyperparameter configurations are applied to the water and ethanol data sets, evaluated in terms of MAPE, and then compared to the best SotA method—the GCC.

To begin with, the different hyperparameter combinations are tested. Figure 6.12 shows the performance of all SDMs in the form of a star plot where the categorical hyperparameters are varied for the water data sets <sup>D</sup>1c|TP1, <sup>D</sup>1v|TP1. A list of the investigated hyperparameter sets are found in Table 6.4. Note that for the PCA-based SDMs only the individual point cloud generation is possible since the geometric


Table 6.4 Combinations of hyperparameters available in the SDMs.

Figure 6.12 MAPE results of all three major point cloud processing methods in direct, local compensation and global compensation form for both data sets <sup>D</sup>1c|TP1 and <sup>D</sup>1v|TP1. The MAPE is coded as the distance from the origin for <sup>D</sup>1c|TP1 and <sup>D</sup>1v|TP1 in green and blue, respectively. Each angle in the star graph represents a hyperparameter configuration of the respective processing method. The baseline in its best configuration (best SotA method) is drawn dashed. For better visualization MAPE values are capped to 7.5 %.

interfering signals estimation requires the PCA of two distinct point clouds to triangulate the interfering signals. Since the computational effort to calculate the hyperparameter combinations full factorial is too high, the signal dynamics are fixed to 50 ns in this first hyperparameter investigation. The subsequent signal dynamics evaluation will show that this setting is justified. Furthermore, the BS- and JBS-based SDMs share the same hyperparameter sets, denoted by BS 1/2, because they only differ in the way the splines and the misalignments are estimated, but not in the way the point clouds are generated. The results from Fig. 6.12 once again shows, that the constant flow data set is a lot easier to manage and that the PCA cannot be applied in the varying flow case with sufficient

Figure 6.13 Interaction between used signal dynamics Δ<sup>a</sup> and measurement noise for the data set <sup>D</sup>1c|TP1. Two different noise levels were examined: the original SNR (85.3 dB, solid lines) and the deteriorated SNR by adding artificial AWGN (62.9 dB, dashed lines).

accuracy. Nevertheless, the varying amplitude assumption ( PCA 3 ) shows slightly better results compared to the constant amplitude assumption, which is why it will be used in the following investigations for the PCAbased SDMs. For both spline-based approaches, the concatenated point clouds are preferable on average. However, even if the hyperparameter combination BS 1 is used, both spline-based methods are still better than the baseline, except for the direct estimation with individual point clouds.

Based on these hyperparameter configurations, the influence of the available signal dynamics is investigated. To this end, the evaluation of each method on the data set <sup>D</sup>1c|TP1 is repeated with different signal packaging strategies, where the package size was adapted such that the signal packages contain the specified signal dynamics. Starting from the minimum 20 ns, the signal dynamics were increased in steps of 5 ns, 10 ns, and 20 ns, depending on the required computational effort of the

respective method and the expected variation in the results—for higher signal dynamics the step size was consequently increased. Figure 6.13 depicts the results of this investigation. Since the graphic examines several aspects of the different SDMs, the results are interpreted step by step.

Firstly, the MAPE representing the systematic error is investigated. As can be easily seen, except for the BS-based SDM using the local compensation, the systematic errors are not significantly reduced further for signal dynamics greater than 50 ns. The JBS-based SDM would even allow lower signal dynamics with only minor degradation of the MAPE. To investigate the relationship to the available SNR, artificial AWGN is added to simulate a lower SNR. The MAPE curve for the lower SNR (drawn as dashed line) shows that the lower limit of the signal dynamics, below which the SDMs can no longer be used with sufficient quality, shifts towards higher signal dynamics. Furthermore, the lower robustness of direct TDE is revealed, which is concluded from the fact that the MAPE varies much more with different signal dynamics and requires larger signal dynamics to perform comparably. Summarizing the dependence of the systematic error on the available signal dynamics, it can be said that the influence is small as long as the signal dynamics are above the lower limit, which is SNR dependent.

Secondly, the STDSPE representing the stochastic error is investigated. Except for the direct estimation form of the SDMs, the stochastic error is again hardly dependent on the available signal dynamics. Similar to the systematic error, the lower limit of the signal dynamics, below which the SDMs can no longer be used, increases. However, additionally to the influence on the lower limit, a lower SNR leads to an increased offset of the stochastic error. The only exception to these statements is the direct estimation. Due to the averaging effect, the PCA-based direct estimation SDM can reduce the STDSPE with increasing signal dynamics, but as already mentioned this is bought by an increased response time of a measurement system based on this algorithm. The other spline-based SDMs do not show that they can reduce the stochastic error below the thresholds approached by the local, and global compensation methods. In summary, as long as the signal dynamics are higher than the SNRdependent limit, further increasing the signal dynamics provides no

Figure 6.14 MAPE performance of all SDMs averaged over the different TPs. The results achieved with the best SotA method is drawn as a dashed line. The error bars indicate the respective best and worst results received if the TPs are investigated individually.

advantage, except for the direct estimation methods, since they are less robust to noise.

Based on the hyperparameter influences presented above, the final overall evaluation of the SDMs is done using signal dynamics of 50 ns, the constant amplitude assumption for the PCA-based SDM in its interfering signals compensation variants, and the individual point cloud generation for the spline-based SDMs. Since the systematic error is more significant in the scope of this thesis due to its correlation with the level of the interfering signals, the results are compared in terms of MAPE in Fig. 6.14. Here, the averaged MAPE is depicted for the water and ethanol data sets without damping mat. Averaged MAPE means that the value is determined by averaging the MAPEs of the four transducer pairs per data set. Therefore, the basis for comparison—the performance of the GCC—is also averaged over all four respective transducer pairs. Note that the water data set with damping mat is not included in this evaluation since the temperature-dependent damping violates the assumption of stationary interfering signals. The results show that the PCA-based SDMs cannot be successfully applied to data sets with a strongly varying flow. In contrast, using the local or global compensation, the JBS-based SDM is significantly better than the best SotA for all data sets. Especially, the global compensation works well for both constant and varying flow data sets. Although the BS-based SDM is also often better than the GCC, some combinations, such as the direct estimation for <sup>D</sup>1c, or the local compensation for <sup>D</sup>1,v, perform worse. The last point worth mentioning here is the range of systematic errors between the transducer pair with the largest and smallest MAPE. Some SDMs can significantly reduce this range, e.g., the JBS-based global compensation reduces the initial MAPE range for the ethanol data set <sup>D</sup>3c from about 11 % to about 3 %.

In summary, except for the PCA-based and some variants of the BSbased SDM, all SDMs perform significantly better than the respective best SotA method in terms of averaged systematic error. Furthermore, the interfering signals compensation does not induce an increased stochastic error.

### 6.6 Feature-based regression method

Unlike the SDMs, the FRM requires a training data set, but it also allows to obtain an estimate for each signal individually without having to separate the signals into packages with some signal dynamics. Since the training set should contain varying time-delay differences and different process conditions, such as the temperature, the training data set for this approach is always chosen from the varying flow data sets. Accordingly, the corresponding constant flow data sets, whose time-delay differences are not included in the training data set, are used as test data sets to predict the time-delay differences of measurement signals that were not

seen by the model during the training steps. For a compact representation, a simple notation is introduced, to indicate which data sets are used as training and test data set, respectively. Exemplary, the notation <sup>D</sup>1v|TP1/D1c|TP1, denotes that the first TP of the varying flow data set <sup>D</sup>1v is used for training and the corresponding TP of the constant flow data <sup>D</sup>1c set is used to test the estimation performance.

However, due to the large number of options to select the features, the available regression models and their training strategies, the FRM requires the specification of much more hyperparameters than the SDM, where only the signal dynamics and the point cloud generation are important. A full factorial evaluation takes to much computational effort. For this reason, starting bottom-up from the regression model, the hyperparameters are investigated consecutively and then fixed to their respective best choices in the first two subsections.

Subsequently to the determination which regression model, training strategy, and feature selection method to use, the properties and performances of the FRM are evaluated in different scenarios. Firstly, the robustness against AWGN is tested, followed by an investigation if frames—in this case an overcomplete AWPT tree structure—can further improve the results. Finally, the overfitting and the generalizability are evaluated, since these properties are highly important in learning-based approaches.

All performances of the FRMs in this section are expressed in terms of the MAPE (6.7) for the systematic and the STDSPE (6.10) for the stochastic error in order to reduce the entire evaluation of a specific hyperparameter set to a single value.

#### 6.6.1 Selection of the best regression model

To begin with, the different regression models are investigated. However, since the evaluation of the FRM depends not only on the regression model but also on the feature selection, the regression models are evaluated according to how well they perform on average when using different feature selections. Thereby, not all feature selections can be considered in the average evaluation, because there are too many combinations of feature selections possible, i.e., the quality measure (F-test, PCC, …), the SS or the IFS, and the number of remaining features or thresholds can

Table 6.5 Average MAPEs using the three different regression models and their corresponding training strategies. The averaged MAPE values are obtained by using different numbers of selected features from <sup>s</sup> = 10 … 100. Here, the data sets <sup>D</sup>1v|TP1/D1c|TP1 are used for the training and the test.



be combined almost arbitrarily. Therefore, the type of feature selection is fixed to the ISS where the parameter of the remaining features <sup>s</sup> is varied.

According to sections 5.4 and 5.5, a complete hyperparameter set for the FRM contains the following parameters:


Additionally, in order to apply the method, the data sets for training and testing also need to be specified.

Table 6.5 shows the resulting average MAPEs of all available regression models and training strategies, if the FRM is applied multiple times with a differing number of selected features <sup>s</sup> and the individual MAPEs are averaged. Note that the parameters of the GPR cannot be trained using the least-squares or median-based approaches, so simple kriging is used here to fit the GPR. To simultaneously evaluate the dependence of the results on the number of features <sup>s</sup> , the standard deviation over the results of the different feature numbers is given in round brackets. Comparing the GPR with the linear regression approaches, it performs worst in the supervised case and only average in the unsupervised case. Therefore, and due to the fact that it requires the highest computational effort, this regression approach is not pursued further. Furthermore, it is noticeable that while the LS training yields the best results for the supervised FRM, the performance in the unsupervised FRM is significantly worse, suggesting that this type of training is less robust to errors in the labels. In contrast, the weighted training methods (WLS, WM) show similar behavior for both the true labels and the estimated labels. A comparison between the regression models indicates that the decision whether the model should include an offset or not is only relevant for the WM training. The application of this strategy leads on the one hand to the best results on average, if combined with the linear model with offset (5.35a), and on the other hand to nearly the worst results on average, if combined with the linear model without offset (5.35b). Observing the standard deviation shows that the dependence on the number of selected features is similar for all weighted training strategies, independently of the regression model. Only the LS method is significantly less dependent on <sup>s</sup> .

In summary, the use of Gaussian processes for regression is not appropriate for the regression of ultrasonic TDE in this algorithmic context. Since the WM method in combination with model (5.35a) performs best on average and the LS method has both the least dependence on the number of selected features <sup>s</sup> and the best results for the supervised FRM, these two strategies are further investigated. Here, the LS method is used without offset, because the model without offset physically makes more

Table 6.6 Evaluation of different methods to select the subset of features to be used for the supervised regression. The averaged MAPE values are obtained by using different numbers of selected features from <sup>s</sup> = 10 … 180. The regression model (5.35a) was chosen and trained using the WM strategy. As training set and test set, <sup>D</sup>1v|TP1 and <sup>D</sup>1c|TP1 were used, respectively.


sense when the features are designed to yield zero without time-delay difference.

#### 6.6.2 Feature selection evaluation

After setting the regression model to the linear model trained by the WM or LS method, the different feature selection methods can be quantitatively evaluated. Given the stable results for both supervised and unsupervised FRM, all available feature selection methods are tested by keeping the regression fixed to the linear model (5.35a) with the WM-trained parameters. Again, the different FRM setups are applied exemplarily to the water data sets <sup>D</sup>1v|TP1/D1c|TP1 and the results are averaged over different top-N values <sup>s</sup> to become invariant to this hyperparameter.

The results of this evaluation are listed in Tab. 6.6. For all methods, except for the ISS, where only the subspaces can be selected and not the feature individually, the MAPE is given once for the SS and once for the IFS. Among the tested methods, only the ISS, the NCA, and the Laplacian scores allow feature selection without knowledge of the labels. Therefore, these methods are preferable if the MAPEs are comparable.

Figure 6.15 Dependence of the best five feature selection methods on the number of selected features <sup>s</sup> , if the WM parametrized linear regression with offset is applied. The regression is trained with the ground truth for the labels.

Since the RReliefF, the wrapper method, and the Laplacian and F-test scores perform significantly worse on average, only the ISS, the PCC, the NCA, the LASSO, and the GPR are further investigated. An interesting fact is that, although the Gaussian processes were not suitable as a regression model, the learned length scales in the kernel function could be used well as a quality measure for the feature selection. The last point that can be seen from this study is that—for the feature selection in this specific learning-based TDE approach—the simple PCC is the best supervised method and the ISS proposed in this thesis is the best unsupervised method.

In order to determine the number of selected features that yields the best overall results, the performances of the five best feature selection methods with respect to MAPE and STDSPE are plotted against <sup>s</sup> in Fig. 6.15. While the IFS using the NCA or the LASSO becomes unstable at high numbers of features, the SS is stable for all methods and all <sup>s</sup> . However, even though the IFS is unstable for some methods, there are also combinations, such as the PCC or the LASSO for a low number of features, that result in the best MAPEs that can be achieved overall. Since the feature selection should yield a good robustness against noise, a low systematic error, and as few selected features as possible to save computational effort, the number of features will be fixed to <sup>s</sup> = 100 in the following studies. Figure 6.15 shows that using 100 features is a good compromise between the objective to have few features and high robustness against noise (see the subfigure at the bottom on the left side). The LASSO and the NCA selection methods are not considered further, because in some configurations their systematic error is too high. Furthermore, the GPR is also discarded due to the significantly larger computational effort with only comparable results.

#### 6.6.3 Robustness against additive white Gaussian noise

In the previous section, the many possible combinations of feature selection methods, regression models, training strategies and label choices for the training were evaluated and reduced to a few remaining possible combinations. The FRM with one of these remaining configurations can now be investigated regarding robustness against noise. Since the unsupervised FRM is most comparable to the SotA, which also assumes no known labels, the features are selected by the ISS (<sup>s</sup> = 100) and the linear regression model trained with estimated labels (and the WM strategy) is applied to the TDE. Although the present SNR cannot be further increased, the performance of the FRM can be investigated for decreased SNR by adding artificial AWGN to the training data set, the test data set, or both.

To this end, the water data sets <sup>D</sup>1v|TP1/D1c|TP1 are selected and deteriorated by adding noise. If the level of added AWGN is varied and the results of the FRM are plotted over the respective noise level, the diagram in Fig. 6.16 is obtained. Depending on which data set the AWGN is added to or if the data is smoothed by a prefilter, the different curves result. Note that the MAPE of the baseline (AMDF) is not drawn, since its

Figure 6.16 Impact of AWGN in the training and test data on the regression accuracy. Left: Influence on the systematic error evaluated by the MAPE. Right: Influence on the stochastic error evaluated by the STDSPE. The level of the added noise is specified by , where the subscripts (⋅)Tr and (⋅)Ts denote whether the noise is added to the training data set, the test data set, or both. For reference, the STDSPE of the best SotA method in terms of robustness against AWGN (the AMDF) is given.

systematic error is higher than the visible range. Some interesting effects can be observed in Fig. 6.16:


here, the systematic error of the AMDF method is almost independent of the added noise level, leading to the fact that the FRM is only better than the SotA if the SNR during training is sufficient.

#### 6.6.4 Application of frames in the feature generation

As the evaluation of the different feature selection methods and the dependence on the hyperparameter <sup>s</sup> has shown, the features, which are available for the regression, are an important influencing factor. Therefore, not only the selection process is relevant, but also the initial set of all features. Remembering Section 5.3, the features used in the regression are created by the absolute and phase differences of the coefficients that result from the AWPT of the measurement signals. However, the AWPT also requires the filter types and tree structure to be uniquely defined. Since the two preliminary studies [151, 159] have shown that the 6th level of the AWPT tree provides a good compromise between time and frequency resolution, all previous FRM evaluations are based on the coefficients of the 6th level in the tree. Nevertheless, the validation of this choice is verified again in this section. To this end, the set of available initial features is extended by using all coefficients of the 5th, 6th and 7th level in the AWPT to generate the features. This has the special effect that the subspaces covered by the coefficients are now overlapping in the time-frequency domain, which induces a redundancy in the representation—also called frame. Furthermore, the information is redundant, since, e.g., the coefficients from the 7th level contain all the information that is already contained in the coefficients from the 6th level.

The bar graph in Figure 6.17 visualizes the MAPEs—averaged over all TPs and the respective best and worst results—for the differently configured FRMs both when using only the 6th level coefficients and when using the redundant coefficients of the 5th, 6th and 7th level, which build a frame. In order to make the set of selected features comparable, the number of features is set to <sup>s</sup> = 100 for all selection methods. Except for the ISS in combination with the LS regression and estimated labels, using frames yields similar or worse results. There are several possible explanations for this behavior, which have to take into account the different characteristics of the feature selection methods, the choice

Figure 6.17 Average MAPEs over all TPs of the data sets <sup>D</sup>1v/D1c when using the 6th level AWPT coefficients (<sup>W</sup> = 6) to generate the features, and when using all coefficients of the 5th, 6th and 7th level (<sup>W</sup> ∈ {5, 6, 7}) to generate the features. The error bars indicate the respective best and worst result received if the TPs are investigated individually. For all feature selection methods, <sup>s</sup> = 100 applies.

of the labels, and the training strategy of the regression. Firstly, the PCC is considered, where the features are selected according to how well they are linearly correlated with the ground truth. Here, the results are very similar regardless of regression training or selection strategy, which in this context means, whether to use SS or IFS. This suggests that the selected features of the 6th level already contain the best features, confirming the validity of the initial choice of coefficients for the feature generation. In fact, a detailed examination of the resulting selection masks (not shown here due to the poor visual evaluability) reveals that 32 features between the initial feature set with <sup>W</sup> = 6 and with <sup>W</sup> ∈ {5, 6, 7} are equal.

Next, the ISS is considered, where after several iterative selection steps, the features with the largest variances are selected. Here, the application of frames for the feature generation leads to worse results when the WM regression is used, and to better or similar results when the LS regression is used (cf. the red and blue bars with the ISS labels in Fig. 6.17). Two interesting conclusions can be drawn from this behavior. Firstly, even though the LS regression is not very robust to errors in the labels, as

shown in Sec. 6.6.1, it can better balance different feature selections by suppressing features with low or poor influence in regression. This statement is based on the fact that the MAPEs of the LS-based regression with ISS and supervised training are significantly better than the MAPEs of the WM-based regression. Secondly, the ISS returns a worse performing selection of features when the initial feature set is larger due to the frames. Obviously, the objective to search for the maximum variance cannot perfectly represent the quality measure that as much information as possible should be contained in the selected subspaces.

The conclusion of this investigation is that in the problem at hand the use of frames does not improve the performance, since a sufficient number of good features can already be derived from the coefficients of the 6th level given the sampling rate 50 MHz.

### 6.6.5 Generalizability of the learning approach

Since it is the nature of learning-based methods that the quality of the training data set significantly affects the performance of the prediction, this section presents a quantitative assessment of the generalizability of the FRM. The effect that the systematic error on the training data set is much lower than that on the test data set is called overfitting. It indicates that the model may be too complex and, therefore, can perfectly interpolate the training data but hardly predict values not contained in the training data. The opposite of this, which means that the performance on the test data set is comparable to the performance on the training data set, is called generalizability.

For this reason, first, the performance on the test data set is compared with the performance on the training data set, and second, the resulting MAPEs are examined when the training data is restricted to include only a subset of the process conditions.

#### Performance on training data set vs. test data set

Figure 6.18 shows both the MAPE of the approximation of the trained FRM on the training set and the prediction error on the test set. In all selection and training scenarios (unsupervised/supervised), the performance on the training data set is worse than or similar to the performance

**WM regression with offset**

Figure 6.18 Average MAPEs over all TPs with the WM-trained linear regression model (5.35a). Blue bars: the MAPEs calculated on the training data set <sup>D</sup>1v. Red bars: the MAPEs calculated on the test data set <sup>D</sup>1c. From left to right, the hyperparameter combinations for feature selection and the used labels are plotted. The error bars indicate the respective best and worst results received if the TPs are investigated individually. For all feature selection methods, <sup>s</sup> = 100 applies.

on the test data set. While the MAPEs are quite similar for the PCC-based selection, the ISS yields a smaller systematic error on the test sets. From these results, two conclusions can be drawn. Firstly, the linear regression model is probably too simple to accurately approximate the time-delay differences on the training data set. This can be explained by the fact that the time-delay differences of the training data set are highly varying and have three clustering points due to the design of the experiment. Both properties make the training data set more complex than the test data set. Secondly, the PCC-based selections yield in general better feature sets, which can also be applied to the estimation of time delays that are closer to the boundaries of the process conditions that the FRM was trained with. However, this is to be expected since the PCC directly ranks the features according to their correlation with the labels and the ISS can only make use of quality measures that work without knowledge of the labels, such as the signal energy or the variance. Whether the ground truth is used for the training of the regression or the labels are estimated using the optimization approach seems to have no influence in this investigation.

#### Generalizability to different process conditions

In contrast to the investigation of the MAPE difference between the training and test data sets, the influence of constraining the training sets is investigated in Fig. 6.19. The aim is to evaluate the extent to which the FRM can be used for TDE if the process conditions of the test set are outside the process conditions used for training. To this end, the MAPEs are given for the subsets of the test data set, which only contain process conditions that are also included in the training data set, and for the entire test data set. In the following, the MAPEs on the subsets of the test set, which contain only the process conditions present in the training data set, are referred to as local MAPEs whereas the MAPEs on the entire test set are referred to as global MAPEs. Note that, if range{<sup>t</sup> }/range{<sup>g</sup> } = 1, the process conditions of the training and test data set are identical, resulting in the local and global MAPEs yielding the same values. Furthermore, the results show that the LS-trained regression is less generalizable to other process conditions. This can be inferred from the global MAPEs, which demonstrate degradation in performance relative to the respective best SotA method if the training data set is constrained to contain less than half of the available absolute time delay variations. In contrast, the local MAPEs are equally good for all regression methods. Note that the scale of the plots for the LS-based regressions goes up to 9 % due to the larger systematic errors. An interesting fact is that the unsupervised WM-trained regression shows little dependence on the available training set. This can be seen from the behavior of the average and standard deviation of the MAPEs, which show a similar magnitude regardless of the available training data set.

In summary, the supervised WM-trained regression generalizes best to different process conditions. Furthermore, even though the unsupervised WM-trained regression has a higher variance, it can be applied with only little degradation if the available training set contains less varying process conditions. However, note that WM-trained regression is better than the best SotA method—here the GCC—regardless of the available training set or whether the labels used for training are the ground truth or estimated by the approximation approach.

Figure 6.19 Performance of the FRM on the test data set <sup>D</sup>1c|TP1 (blue), if the range of a that is available during training is constrained. In red the performance on the local test data set <sup>D</sup>1cl = {<sup>D</sup> <sup>⊂</sup> <sup>D</sup>1c|TP1 | a, <sup>∈</sup> t}, which only contains process conditions that are contained in the constrained training data set, is given. In order to get the constrained training data sets, different subsets are sampled from the entire training data set. On the x-axis, the ranges of the absolute time delays a,, which are contained in the used subset of the training data set, normalized by the maximum range of the entire data set <sup>D</sup>1v|TP1 are given. Since not only the range of the available a, but also their location influences the result, subsets with equal range are sampled at ten different locations each. The result of each used subset is represented by a blue cross mark for the performance on the entire test data set and a red dot mark for the performance on the local test data set. The solid lines represent the respective average values and the shaded area represents the one-sigma range, both calculated based on the results of the ten subsets per given range. The LS- and the WM-trained regression are combined with the linear model without offset and with offset, respectively. For each method, the features are selected by the ISS with <sup>s</sup> = 100.

# 6.6.6 Domain adaption to different media and attenuation characteristics

In the previous section, the generalizability to other process conditions was discussed. However, another interesting question is whether the training using the data of a specific pipe and medium can also be transferred to other media or to other attenuation characteristics. For example, the medium can be changed to ethanol or a damping mat can be installed. In the field of machine learning, this process of transferring the trained model to other domains—different media or attenuation situations in this thesis—is called domain adaption. Usually, the trained parameters are adapted or fine-tuned to the new domain by considering a small set of labeled measurement signals of the target domain. Since this would require further calibration measurements, which are not desirable, this section investigates the extent to which an already trained FRM can be applied to other media and damping properties *without* fine-tuning or parameter adaptation.

To this end, Figure 6.20 shows the MAPEs of the unsupervised FRM that result when different combinations of the available training and test data sets are used. Note that for the training data sets, only the varyingflow data sets, and for the test data sets only the constant-flow data sets are used, since varying time-delay differences have to be present in the training data sets. Due to there being three varying-flow data sets and three constant-flow data sets, nine different combinations of training and test data sets exist. Because the previous investigation only considered the water data sets without a damping mat, the necessary initial feature set may deviate. Therefore, the results are given for different AWPT tree levels.

It is easy to realize that a transfer from water to ethanol and vice versa is not possible (see the bottom row and the right column of Fig. 6.20). In contrast to that, regardless of whether the FRM is trained on the water data set with or without a damping mat, it can be applied to all water data sets with similar quality. The explanation for this behavior is that the different SoS means that the learned time ranges cannot be transferred to other media. The fact that the unsupervised FRM still shows significantly lower MAPEs for ethanol than the best SotA method (10.8 %) shows that

Figure 6.20 Domain adaptability of the unsupervised FRM, using the ISS (<sup>s</sup> = 100) and the WM-trained linear regression model (5.35a). Here, the MAPE values of the different TPs using the initial feature sets of the 4th to 9th AWPT tree levels are investigated. In order to test the transferability of the training to situations with different media or attenuation characteristics, all available training data sets are combined with all available test data sets. After training, the FRM is *not* fine-tuned to adapt to the other domain, i.e., it is evaluated to which extent the parameters trained on a specific data set can be applied for the TDE of a data set with a different medium or attenuation characteristic. For the baseline, the MAPE values, when test and training are performed on the same medium and with the same attenuation characteristic, are also included and shown on the diagonal. The data set identifiers can be found in Table 6.1.

the proposed method also works for other media—provided it is also trained on the other medium.

Another interesting parameter to observe is the AWPT-tree level used to generate the initial feature set. For the present ultrasonic system with its sampling rate and bandwidth, the AWPT-tree level does not have too much influence on the results as long as it is chosen smaller than 9. If the level is chosen too high, the resulting basis function will be too extended in the time domain, which will prevent the resolution of different propagation paths.

The last point to notice is the MAPE of the combination <sup>D</sup>2v/D2c, i.e., using water as medium in a pipe with a damping mat installed. Comparison with the SotA shows that the FRM cannot further improve the systematic error. One possible explanation is the complexity of this data set. Even though the damping mat reduces the interfering signals, the temperature-dependent damping leads to a violation of the assumption that the interfering signals are stationary. Therefore, a regression with constant parameters cannot adequately estimate the delay differences over the wide temperature range of the test data set.

# 6.7 Estimator performances compared with the Cramér-Rao lower bound

The errors in TDE can be separated into systematic errors and stochastic errors. While the systematic errors arise due to wrong model assumptions, interfering signals, or calibration errors, the stochastic errors originate from measurement noise, jitter, or in this particular case, flow fluctuations since the flow is fully turbulent. The systematic errors, which in the considered UFM scenario are mainly caused by the interfering signals, have been extensively studied and compared to the SotA in the previous sections by determination of the MAPEs. Therefore, this section focuses on the stochastic errors of the SDM and FRM. To this end, the stochastic errors of the respective best SDM and FRM are compared to the CRLB—a theoretical lower bound for the variance of unbiased estimators. Note that the CRLB is a valid theoretical lower bound only if the estimators are unbiased and the signal properties such as PSD and observed time range are accurately known. Furthermore, it only represents

a lower limit for stochastic errors that arise due to measurement noise and not due to flow fluctuations. Nevertheless, the CRLB can be used as a reference point to evaluate the magnitude of the stochastic errors in the UFMs. For comparison, the obtained stochastic errors of two SotA methods are also examined.

As the input of the SDM and the FRM are measurement signals with the aim to estimate a time-delay difference, both methods can be considered time delay estimators. Although they are not unbiased due to the existence of interfering signals, the stochastic error can nevertheless be assessed by removing the systematic error beforehand. Similarly, the ZCM and the CC method also represent time delay estimators that are biased due to the interfering signals and whose stochastic errors can be compared to the CRLB. These two methods are specifically chosen because they are representative of a time-based TDE and a correlation-based TDE, respectively.

For the investigation of the stochastic error, artificial AWGN of different levels is added to the measurement signals of the data set <sup>D</sup>1c|TP1. The level of the artificial AWGN is gradually increased in 20 steps to reduce the initially available SNR stepwise from about 85 dB to about 63 dB. Note that the actual SNR slightly differs for each measurement, since only the amplification of the AWGN can be specified and therefore the actual signal energy of the AWGN is random around its specified mean value. After applying each method under test to the data set at each noise level, the stochastic errors are calculated via removing the systematic errors with the median-based lowpass similar to (6.10). The only difference is that the errors are not processed as percentages but as absolute errors <sup>a</sup> [] = ̂ − , leading to the standard deviation of the stochastic error ( ̂ − ).

The resulting stochastic errors depending on the SNR are depicted in Fig. 6.21 on the left side. The time ranges used in the ZCM and CC are calculated according to the algorithm described at the end of Sec. 6.3. For the FRM, which is trained on <sup>D</sup>1v|TP1, the hyperparameters are as follows: ISS (<sup>s</sup> = 100) for the feature selection, approximated labels for the training, WM-trained linear model with offset for the regression. Furthermore, the results of the SDM in its JBS-based global compensation form are plotted. It is easy to observe that the ZCM and the CC method

Figure 6.21 Considerations on the standard deviation of the proposed estimation methods compared with two SotA methods. As a reference, the results were plotted against the CRLB for the present TDE problem.

perform equally well at high SNR, but the CC method can still be used at low SNR without much degradation. Since the SDM is essentially the same as the ZCM only with a preprocessing step that removes the estimated interfering signals, the stochastic errors are determined by the stochastic components in the estimated interfering signals and the original stochastic errors of the ZCM. From the fact that the SDM and the ZCM perform equally it can be shown that the interfering signals compensation does not add any stochastic components to the measurement signals. The second proposed method, the FRM, can even significantly improve the stochastic error. This can be explained by two properties of the approach. On the one hand, the feature selection leads to a timefrequency selection mask that considers in sum a larger time range for the estimation of the time-delay differences. On the other hand, the combination of several time ranges also achieves an averaging effect over different propagation paths, which also reduces the influence of the flow fluctuation.

Since the different methods use different time ranges of the measurement signal, the CRLB is also different according to (6.11). Therefore, the results are shown again as a function of the CRLB in Fig. 6.21 on the right side. This plot underlines that the FRM not only performs best in terms of stochastic errors but also has the smallest distance to the CRLB, which

can be seen from the fact that for a fixed SNR, the absolute distance to the dashed line is the smallest. The other investigated methods have essentially the same Cramér-Rao efficiency.

# 6.8 Comparison and discussion

All presented methods for TDE (including the SotA) differ in their requirements, such as labeled or unlabeled training sets, signal packages with sufficient signal dynamics, or simply a reference and delay signal. Furthermore, the many hyperparameters of the methods, which all lead to varying results, make an objective and direct comparison of the methods difficult. Therefore, the methods are compared with respect to their requirements: the number of hyperparameters, the necessary preprocessing steps, the achievable measurement dynamics, the practicality, and the results. Finally, it is discussed which method is considered the best overall in the example application presented—the UFM scenario.

Firstly, the requirements are discussed. The SotA methods used as baselines in this dissertation usually only require a single measurement consisting of a reference and delay signal. In contrast to that, the SDM requires a signal package consisting of multiple measurements with varying signal dynamics, i.e., the absolute time delay, the time-delay difference, or both have to vary within the signal package. Then, the timedelay difference can be estimated for each signal within the package. Similarly, the FRM requires such a signal package for the training, for which even the ground truth for the time-delay differences must be known as labels in the case of the supervised FRM.

These requirements on the available measurement signals primarily influence the practicality and the measurement dynamics of a measurement system based on the respective algorithm. The SotA methods, the FRM and the SDMs in global compensation form only need one measurement for the actual TDE, and thus the measurement dynamics are not limited by the algorithm. In contrast, the SDMs in local compensation form or in direct estimation form can only provide estimates after all signals of a signal package have been recorded. This delays the availability of an estimation result on average by half of the time needed to record a signal package with sufficient signal dynamics. Since the signal dynamics

are purely process-driven, this can sometimes be very long. Although the FRM and the SDMs in global compensation form can estimate a timedelay difference individually for each measurement, and thus do not limit the measurement dynamics, they both require a training data set beforehand for training in the case of the FRM and for the globally valid estimation of the interfering signals in case of the SDM. Therefore, in practice, a calibration procedure would be performed before use. While the unsupervised FRM and the SDM can be calibrated without knowledge of the ground truth, the supervised FRM needs the ground truth, which can be provided either by another measuring device or by preparing the time-delay differences in a calibration environment. However, both calibration methods are costly, which is why the unsupervised FRM and the SDMs are preferable in practice.

The next distinguishing feature is the number of hyperparameters. The FRM has the most hyperparameters of all methods, because the AWPT, the feature selection, the regression model and its training strategy must be defined in order to uniquely define the FRM. Nevertheless, the investigations have shown that only certain combinations of hyperparameters lead to very good results, and many hyperparameters, such as the number of features have little effect on quality if they are chosen within certain limits. This can significantly reduce the number of possible hyperparameter combinations. In contrast, the ZCM has only two parameters, namely which zero crossing is to be used and the time range used for the linear approximation of the phase signal. Between the ZCM and the FRM, the rest of the methods fall in line, with the correlation-like SotA methods having more hyperparameters than the ZC method—namely the window type and the processor—and fewer than the SDMs. The SDMs can also be differentiated even more finely. Here, the JBS method has clearly more hyperparameters since the optimization for the spline fit is more complex.

Finally, the quality of the estimates is compared in terms of systematic and stochastic errors. For this purpose, it is assumed that the basic requirements of all methods are met. Then, the supervised FRM leads to the smallest systematic error on average, followed by the unsupervised FRM and the SDM in its global compensation form. The SotA methods all yield approximately similar systematic errors, which are significantly

worse than the SDMs and the FRMs on all data sets. However, this is to be expected, as they are not specifically designed to suppress or filter interfering signals. While the FRM is a separate learning-based approach, those SDMs based on the estimation and subtraction of the interfering signals need to be combined with the SotA methods to perform the TDE. Although the SDMs are combined only with the ZCM in this dissertation, the combination with the other correlation-like methods is also possible, but since the results are expected to be similar, this study was not performed. Regarding the stochastic error, the results show that, again, the FRM performs best and the SDM neither improves nor worsens the SotA methods.

In summary, the unsupervised FRM and the SDM with global compensation are considered to be the best methods because the training data can be collected during normal operation, the measurement dynamics is not limited and the interfering signals are significantly suppressed without the stochastic error becoming worse.

Here, it should be noted briefly that since the successful application of a learning-based approach to TDE, such as the FRM, suggests that TDE can also be accurately performed using approaches from deep learning, various network architectures were tested in a Bachelor thesis [150] to determine the extent to which they can be applied for the present problem. However, the thesis has shown that due to the lacking training data even the deep neural network trained in a supervised fashion is not comparable to the unsupervised FRM, unless a lot of time and effort is invested into developing an adapted network architecture and a targeted augmentation strategy. For these reasons, deep learning approaches were not considered in this dissertation.

# 7 Conclusion

## 7.1 Summary

In this thesis, two TDE techniques for ultrasonic measurement systems have been presented, which are specifically designed to filter or be robust against interfering signals in scenarios where very small time-delay differences have to be estimated with high accuracy. Since these challenges arise in transit-time UFM systems, where the objective is to accurately measure the VoF, the experimental evaluation of the proposed TDE methods is based on a transit-time UFM setup. For the quantitative evaluation, the VoFs in the experiments are converted into the time-delay differences that are to be estimated from the ultrasonic measurement signals. Based on these ground truths of the time-delay differences, the proposed methods can be evaluated in terms of systematic and stochastic error.

Firstly, a signal model has been presented that describes the TDE problem and the origin of the interfering signals in transit-time UFMs. Both presented approach classes—the SDM and the FRM—are based on this signal model, although their principles differ, as the FRM is a learning-based method for the interference-invariant estimation of the time-delay differences and the SDM is a method to estimate and remove the interfering signals before applying a SotA method for TDE.

The requirements of the approaches are that multiple measurements with varying time delays are available, where each measurement signal needs to have a small bandwidth. Furthermore, the additive interfering signals have to be stationary, i.e., they do not vary between consecutive measurements, and symmetric, i.e., they are additively superimposed equally on both the reference and delay signals. Evidence that these requirements are met has been provided by a system identification, which also quantitatively evaluates the extent to which the assumptions of stationarity and symmetry hold. It has been shown that the amplitude and the time delay of the interfering signals are slightly temperaturedependent, which limits the accuracy of the interfering signals estimation, and thus the achievable reduction of the systematic errors.

For the separation of the interfering signals, the SDM employs the principle that the time delays of the target signals vary when multiple measurements are considered, while the interfering signals are stationary. Therefore, this class of methods represents a generalization of the methods proposed by Roosnek [99], where it was assumed that the time delays in the considered measurement package uniformly cover one period of the fundamental oscillation. The proposed SDM can significantly mitigate the boundary condition, so that the time delays can be arbitrarily distributed within the measurement package and the variation of the time delays only needs to be large enough to outweigh the measurement noise. Within the scope of this work, different variants of the SDM have been presented, with the JBS-based global compensation performing the best in the exemplary UFM scenario. The JBS-based SDM in its global compensation form reduced the systematic error on all considered data sets to less than half compared to the best SotA methods. This leads to the conclusion that although the interfering signals could not be completely removed due to their minor non-stationarity, they can be significantly reduced using the proposed SDM.

The FRM uses the AWPT to get a complex-valued time-frequency representation of the measurement signals, which is subsequently used to generate features that correlate with the time delay. After an adapted feature selection, the remaining features are used as input for a regression to directly estimate the time delay differences. The experimental results have shown that the FRM works on different media and is always better than the best SotA method both in terms of systematic and stochastic error. Especially for the stochastic errors, the method is closest to the CRLB. With regard to the dependence on the available training data, the FRM has been shown to estimate the time-delay difference very well even outside the process conditions with which it was trained. In addition, an extension of the FRM has been proposed that also allows an unsupervised training, i.e., the ground truth of the labels for the training data set does not need to be known. Thus, the learning-based approach is directly comparable to the SotA method in terms of practicality, since

a measurement system can easily be adapted to store past measurement signals.

In summary, the objective to reduce the influence of interfering signals in ultrasonic TDEs using the UFM as the application example was achieved. The residual estimation errors of the proposed methods in the experimental results can be explained by the deviation of the signals from the assumed signal model in case of the SDM and by the constraints imposed on the regression model in the case of the FRM.

# 7.2 Outlook

The presented SDM puts a strong focus on the representation of the signal dynamics as point clouds and their processing. However, this limits the approach to the fact that only variable time delays can be used. Using an alternative approach as a Bayesian estimation problem, where the dependence of amplitudes and time delays on process conditions are learned as functions, the interfering signals estimation can then be generalized to determine those interfering signals which have the highest probability, given an observed signal package.

In contrast, the FRM is limited to small time delays due to the feature design—where small means that the transit time is small relative to the period duration of the ultrasonic transducer's operating frequency. This limitation can be overcome by defining other features that combine multiple scales in the AWPT tree to be unambiguous over larger transit times. This in turn requires abandoning the restriction to linear regressions, so other regression types have to be considered in the extended case.

Apart from improving the proposed methods, there are three directions for further research: the application of the proposed TDE algorithm to other applications such as the ultrasonic liquid level metering, the implementation of the improved path-specific TDE for flow profile imaging in UFMs and the extension of the FRM to other machine learning concepts.

While the application to other ultrasonic measurement tasks falls into the category of further validation of the methods, and flow profile imaging is an extension of UFM, from a signal processing perspective, the extension of the methodology to modern learning techniques is most

interesting. Here, different neural network architectures can be explored that can efficiently estimate a time delay on one hand and reduce the influence of interfering signals on the other hand. For this purpose, concepts have to be developed on how to reduce the requirements on the amount of training data as well as on how to design suitable augmentation strategies.

# Bibliography


# List of publications


# List of supervised theses


[160] Wilk, K. *Signalangepasste Zeit-Frequenz-Analyse von Ultraschallsignalen mittels Wavelet-Packets*. Bachelor thesis. 2019.

### **Forschungsberichte aus der Industriellen Informationstechnik (ISSN 2190-6629)**


Die Bände sind unter www.ksp.kit.edu als PDF frei verfügbar oder als Druckausgabe bestellbar.


Die Bände sind unter www.ksp.kit.edu als PDF frei verfügbar oder als Druckausgabe bestellbar.


Gedruckt auf FSC-zertifiziertem Papier

Model-based Filtering of Interfering Signals