**Mathematics of Planet Earth 11**

Bertrand Chapron · Dan Crisan · Darryl Holm · Etienne Mémin · Anna Radomska  *Editors*

# Stochastic Transport in Upper Ocean Dynamics II

STUOD 2022 Workshop, London, UK, September 26–29

# **Mathematics of Planet Earth**

# Volume 11

# **Series Editors**

Dan Crisan, Imperial College London, London, UK Ken Golden, University of Utah, Salt Lake City, UT, USA Darryl D. Holm, Imperial College London, London, UK Mark Lewis, University of Victoria, Victoria, BC, Canada Yasumasa Nishiura, Tohoku University, Sendai, Miyagi, Japan Joseph Tribbia, National Center for Atmospheric Research, Boulder, CO, USA Jorge Passamani Zubelli, Instituto de Matemática Pura e Aplicada, Rio de Janeiro, Brazil

This series provides a variety of well-written books of a variety of levels and styles, highlighting the fundamental role played by mathematics in a huge range of planetary contexts on a global scale. Climate, ecology, sustainability, public health, diseases and epidemics, management of resources and risk analysis are important elements. The mathematical sciences play a key role in these and many other processes relevant to Planet Earth, both as a fundamental discipline and as a key component of cross-disciplinary research. This creates the need, both in education and research, for books that are introductory to and abreast of these developments.

Springer's MoPE series will provide a variety of such books, including monographs, textbooks, contributed volumes and briefs suitable for users of mathematics, mathematicians doing research in related applications, and students interested in how mathematics interacts with the world around us. The series welcomes submissions on any topic of current relevance to the international Mathematics of Planet Earth effort, and particularly encourages surveys, tutorials and shorter communications in a lively tutorial style, offering a clear exposition of broad appeal.

#### **Responsible Editors:**

Martin Peters, Heidelberg (martin.peters@springer.com) Robinson dos Santos, São Paulo (robinson.dossantos@springer.com)

#### **Additional Editorial Contacts:**

Donna Chernyk, New York (donna.chernyk@springer.com) Masayuki Nakamura, Tokyo (masayuki.nakamura@springer.com)

*Editors*

Bertrand Chapron Ifremer – Institut Français de Recherche pour l'Exploitation de la Mer Plouzané, France

Darryl Holm Imperial College London London, UK

Anna Radomska Imperial College London London, UK

Dan Crisan Imperial College London London, UK

Etienne Mémin Campus Universitaire de Beaulieu Inria - Institut National de Recherche en Sciences et Technologies du Numérique Rennes, France

ISSN 2524-4264 ISSN 2524-4272 (electronic) Mathematics of Planet Earth ISBN 978-3-031-40093-3 ISBN 978-3-031-40094-0 (eBook) https://doi.org/10.1007/978-3-031-40094-0

Mathematics Subject Classification: 60Hxx, 60H17, 70L10, 35R60, 37M05, 37-11, 35Qxx, 65Pxx, 00B25

This work was supported by Horizon 2020 Framework Programme (856408)

© The Editor(s) (if applicable) and The Author(s) 2024. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Paper in this product is recyclable.

# **Preface**

This volume contains the Proceedings of the 3rd Stochastic Transport in Upper Ocean Dynamics Annual Workshop held on 26–29 September 2022. The workshop is part of the project "Stochastic Transport in Upper Ocean Dynamics" (STUOD) supported by a European Research Council Synergy Grant. The four principal investigators of the project, Prof. Dan Crisan (ICL), Prof. Bertrand Chapron (IFREMER), Prof. Darryl Holm (ICL) and Prof. Etienne Mémin (INRIA), were delighted to organize the annual workshop: the event was the largest to date, with an impressive range of inspiring talks, engaging theoretical and applied sessions, as well as networking opportunities.

The STUOD project is supported by an ERC Synergy Grant and led by three world-class institutions: Imperial College London (ICL), National Institute for Research in Digital Science and Technology (INRIA) and the French Research Institute for Exploitation of the Sea (IFREMER). The project aims to deliver new capabilities for assessing variability and uncertainty in upper ocean dynamics and to provide decision makers with a means of quantifying the effects of local patterns of sea level rise, heat uptake, carbon storage and change of oxygen content and pH in the ocean. The project will make use of multimodal data and will enhance the scientific understanding of marine debris transport, tracking of oil spills and accumulation of plastic in the sea.

As in the previous years, the 3rd STUOD Annual Workshop 2022 focused on a range of fundamental topical areas, including:


Each chapter in the present volume illustrates one or several of these topical areas. Many chapters offer new mathematical frameworks that are intended to enhance future research in the STUOD project. The workshop was held in the hybrid mode and brought together participants from countries such as: UK, France, USA, Italy, Germany, Saudi Arabia, the Netherlands and Switzerland. It was well attended by early-career academics, post-graduate students, senior members of the community and other invited guests.

The scientific program of the four-day conference was divided into five sessions covering Data Assimilation, Physics Models, Data, Numerics for Ocean Models, and Theoretical Analysis of SPDEs. Several members of the STUOD External Advisory Board, Prof Sebastian Reich (University of Potsdam), Prof Baylor Fox-Kemper (Brown University), Prof Rosemary Morrow (Laboratoire d'Études en Géophysique et Océanographie Spatiales), Prof Laurent Debreu (INRIA) and Prof Arnaud Debussche (Ecole Normale Supérieure de Rennes), gave invited talks. The programme also included individual presentations by the STUOD principal investigators and post-doctoral researchers and snapshot presentations by invited speakers such as Dr Ali Mashayekhi (Imperial College London) and Dr Hamza Ruzayqat (King Abdullah University of Science and Technology). The workshop also provided opportunities for investigators at an early stage of their career to have discussions with established scientists, fostering potential future research collaborations, networking as well as inclusion and training of the next generation of researchers.

Some of the in-person participants attending the event

The following is a brief description of the 15 contributions included in the proceedings:

The submitted manuscripts include the chapter by Adrien Bella, Noé Lahaye, and Gilles Tissot entitled "Internal Tides Energy Transfers and Interactions with the Mesoscale Circulation in Two Contrasted Areas of the North Atlantic". The chapter investigates the energy budget of the internal tide and its life cycle with a high-resolution numerical simulation and a vertical normal mode decomposition. Two areas of interest are considered: the Azores Islands over the mid-Atlantic ridge and the Gulf Stream offshore the North of the US East coast shelf break. Low mode (1 and 2) internal tides are found to propagate from 100 km (mode 2) to 1000 km (mode 1) away from their generation sites. Waves lose a significant portion of their energy as they propagate through the Gulf Stream, in contrast to the Azores domain. In the Gulf Stream domain, the mesoscale circulation is responsible for energy transfers from low- to high-mode internal tides, while the topographic scattering is dominant in the Azores area. This transfer of energy toward high modes favours energy dissipation. The mesoscale is significant in the energy budget of modes higher than mode 1 for both domains, and for all baroclinic modes in the Gulf Stream area. The internal tide is found to extract energy from or lose energy to the mesoscale circulation, but this accounts for less than 14% of the energy scattered from low internal tide modes to higher ones once summed over all contributions of the modal energy budget.

The contribution of Paolo Cifani, Sagy Ephrati and Milo Viviani entitled "Sparse-Stochastic Model Reduction for 2D Euler Equations" introduces a reduction technique for ideal fluids.

The 2D Euler equations are a simple but rich set of non-linear PDEs that describe the evolution of an ideal inviscid fluid, for which one dimension is negligible. Solving these equations numerically can be extremely demanding. Several techniques to obtain fast and accurate simulations have been developed during the last decades. In this chapter, the authors present a novel approach which combines recent developments in stochastic model reduction and conservative semi-discretization of the Euler equations. In particular, starting from the Zeitlin model on the 2-sphere, they derive reduced dynamics for large scales and close the equations either deterministically or with a suitable stochastic term. Numerical experiments show that, after an initial turbulent regime, the influence of small scales on large scales is negligible, even though a non-zero transfer of energy among different modes is present.

Franco Flandoli, Silvia Morlacchi, and Andrea Papini present in their work, "Effect of Transport Noise on Kelvin–Helmholtz Instability", a numerical investigation of the dissipation properties of very small-scale transport noise. As a test problem, the authors consider the Kelvin–Helmholtz instability and compare the inviscid case, the viscous one, both without noise, and the inviscid case perturbed by transport noise. They observe a partial similarity with the viscous case, namely a delay of instability.

The chapter by Daniel Goodair and Dan Crisan, entitled "On the 3D Navier-Stokes Equations with Stochastic Lie Transport", investigates the existence and uniqueness of maximal solutions to the 3D SALT Navier-Stokes Equation in velocity and vorticity form, on the torus and the bounded domain, respectively. The work partners an earlier paper of the authors as an application of the abstract framework presented there. They demonstrate the efficacy of the earlier work in showing well-posedness for both the velocity and vorticity form of the equation, as well as obtaining the first analytically strong existence result for a fluid equation perturbed by Lie transport noise on a bounded domain.

Darryl Holm, Ruiao Hu, and Oliver Street in their paper entitled "On the Interactions Between Mean Flows and Inertial Gravity Waves" derive a generalized Lagrangian mean (GLM) theory as a phase-averaged Eulerian Hamilton variational principle expressed as a composition of two smooth invertible maps. Following an earlier work of Holm, they consider 3D inertial gravity waves (IGWs) in the Euler-Boussinesq fluid approximation. The authors provide both deterministic and stochastic closure models for GLM IGWs at leading order in 3D complex vector WKB wave asymptotics. The chapter brings the earlier results of Holm at leading order into an easily assimilable short form and proposes a stochastic generalization of the wave mean flow interaction (WMFI) equations for IGWs.

The contribution of Quentin Jamet, Etienne Mémin, Franck Dumas, Long Li, and Pierre Garreau entitled "Toward a Stochastic Parameterization for Oceanic Deep Convection" investigates parametrizations for oceanic convection. Current climate models are known to systematically overestimate the rate of deep-water formation at high latitudes as a result of deep convection events that are too deep and too frequent. The authors investigate a misrepresentation of deep convection in Hydrostatic Primitive Equation (HPE) ocean and climate models due to the lack of constraints on vertical dynamics. They discuss the potential of the Location Uncertainty (LU) stochastic representation of geophysical flow dynamics to help re-introduce some degree of non-hydrostatic physics in HPE models through a pressure correction method. The authors then test these ideas with idealized Large Eddy Simulations (LES) of buoyancy-driven free convection with the CROCO modelling platform. Preliminary results at LES resolution show that the solution obtained with their Quasi-nonhydrostatic (Q-NH) model tends toward the reference non-hydrostatic (NH) model. Compared to a pure hydrostatic setting, the Q-NH solution exhibits vertical convective plumes with a larger horizontal structure, a better spatial organization, and a reduced intensity of their associated vertical velocities. The simulated Mixed Layer Depth (MLD) deepening rate is however too slow in the Q-NH experiment compared to the reference NH, in contrast to hydrostatic experiments, which produce too fast an MLD deepening rate. These preliminary results are encouraging, and support future efforts in the direction of enriching coarse-resolution, hydrostatic ocean and climate models with a stochastic representation of non-hydrostatic physics.

The contribution of Alexander Lobbe, Dan Crisan, Darryl Holm, Etienne Mémin, Oana Lang, and Bertrand Chapron, entitled "Comparison of Stochastic Parametrization Schemes Using Data Assimilation on Triad Models", investigates stochastic parametrization schemes. In recent years, stochastic parametrizations have been ubiquitous in modelling uncertainty in fluid dynamics models. One source of model uncertainty comes from the coarse-graining of the fine-scale data and is in common usage in computational simulations at coarser scales. In this chapter, the authors look at two such stochastic parametrizations: the Stochastic Advection by Lie Transport (SALT) parametrization and the Location Uncertainty (LU) parametrization. Whilst both parametrizations are available for full-scale models, the authors study their reduced order versions obtained by projecting them on a complex vector Fourier mode triad of eigenfunctions of the curl. Remarkably, these two parametrizations lead to the same reduced order model, termed the helicity-preserving stochastic triad (HST). This reduced order model is then compared with an alternative model which preserves the energy of the system, and which is termed the energy preserving stochastic triad (EST). These low-dimensional models are ideal benchmark models for testing new Data Assimilation algorithms: they are easy to implement, exhibit diverse behaviours depending on the choice of the coefficients, and come with natural physical properties such as the conservation of energy and helicity.

Erwin Luesink and Bernard Geurts consider in the chapter "An Explicit Method to Determine Casimirs in 2D Geophysical Flows" a new method to construct Casimirs for geophysical flows.

Conserved quantities in geophysical flows play an important role in the characterization of geophysical dynamics and aid the development of structure-preserving numerical methods. A significant family of conserved quantities is formed by the Casimirs: These are integral conservation laws that are in the kernel of the underlying Poisson bracket. The Casimirs hence determine the geometric structure of the geophysical fluid equations among which the enstrophy is well known. Often Casimirs are proposed on heuristic grounds and later verified to be part of the kernel of the Poisson bracket. In this work, the authors explicitly construct Casimirs by rewriting the Poisson bracket in vorticity-divergence coordinates thereby providing explicit construction of Casimirs for 2D geophysical flow dynamics.

The work of Igor Maingonnat, Gilles Tissot, and Noé Lahaye is entitled "Correlated Structures in a Balanced Motion Interacting with an Internal Wave" and investigates the correlations between a balanced motion and the incoherent part of a wave in an idealized configuration.

Characterizing the loss of coherence of an internal tide propagating through mesoscale turbulence has been a major challenge in oceanography, particularly due to its implications for the interpretation of satellite data. The authors introduce a new modal decomposition technique, named broad-band proper orthogonal decomposition (BBPOD), which consists in performing a proper orthogonal decomposition (POD) on complex demodulated variables. After connecting BBPOD to the standard SPOD, they show that BBPOD, coupled with the extended POD technique, enables them to associate the principal components of the incoherent field to the slow flow structures responsible for this loss of coherence through triadic interactions with the incident wave.

The chapter by Étienne Mémin, Long Li, Noé Lahaye, Gilles Tissot, and Bertrand Chapron, entitled "Linear Wave Solutions of a Stochastic Shallow Water Model" investigates the wave solutions of a stochastic rotating shallow water model. The approximate model covered in the chapter provides an interesting simple description of the interplay between waves and random forcing ensuing either from the wind or coming as the feedback of the ocean on the atmosphere and leading in a very fast way to the selection of some wavelength. This interwoven, yet simple, mechanism explains the emergence of typical wavelength associated with near-inertial waves. Ensemble-mean waves that are not in phase with the random forcing are damped at an exponential rate, whose magnitude depends on the random forcing variance. Geostrophic adjustment is also interpreted as a statistical homogenization process in which, in order to conserve potential vorticity, the small-scale component tends to align to the velocity fields to form a statistically homogeneous random field.

The contribution of Said Ouala, Bertrand Chapron, Fabrice Collard, Lucile Gaultier, and Ronan Fablet entitled "Analysis of Sea Surface Temperature Variability Using Machine Learning" analyses sea surface temperature (SST). SST is a critical factor in the global climate system and plays a key role in many marine processes. Understanding the variability of SST is therefore important for a range of applications, including weather and climate prediction, ocean circulation modelling, and marine resource management. In this study, the authors use machine learning techniques to analyse SST anomaly (SSTA) data from the Mediterranean Sea over a period of 33 years. The objective is to best explain the temporal variability of the SSTA extremes. These extremes are revealed to be well explained through a nonlinear interaction between multi-scale processes. The results help to reveal the factors influencing SSTA extremes and support the development of more accurate prediction models.

The contribution of Sebastian Reich entitled "Data Assimilation: A Dynamic Homotopy-Based Coupling Approach" covers a new approach for data assimilation based on homotopy. Homotopy approaches to Bayesian inference have found widespread use, especially if the Kullback–Leibler divergence between the prior and the posterior distribution is large. The author extends one of these homotopy approaches to include an underlying stochastic diffusion process. The underlying mathematical problem is closely related to the Schrödinger bridge problem for given marginal distributions. He shows that the proposed homotopy approach provides a computationally tractable approximation to the underlying bridge problem. In particular, the implementation builds upon the widely used ensemble Kalman filter methodology and extends it to Schrödinger bridge problems within the context of sequential data assimilation.

Valentin Resseguier, Yicun Zhen, and Bertrand Chapron show in their work entitled "Constrained Random Diffeomorphisms for Data Assimilation" that both the Stochastic Advection by Lie Transport (SALT) and the Location Uncertainty (LU) equations can be recovered using a prescribed definition of a random diffeomorphism used to perturb the physical space. However, unlike the SALT and LU settings, they propose a perturbation scheme that does not directly rely on a particular physics. Hence, the random mapping is more flexible and can be applied to any PDE.

The work of Gilles Tissot, Étienne Mémin, and Quentin Jamet entitled "Stochastic Compressible Navier-Stokes Equations Under Location Uncertainty" provides a stochastic version under location uncertainty of the compressible Navier-Stokes equations. To that end, some clarifications of the stochastic Reynolds transport theorem are given when stochastic source terms are present in the right-hand side. The authors apply this conservation theorem to density, momentum, and total energy in order to obtain a transport equation of the primitive variables, i.e. density, velocity, and temperature. They show that applying low Mach and Boussinesq approximations to this more general set of equations allows one to recover the known incompressible stochastic Navier-Stokes equations and the stochastic Boussinesq equations, respectively. Finally, they provide some research directions for using this general set of equations with a view to relaxing the Boussinesq and hydrostatic assumptions for ocean modelling.

Francesco Tucciarone, Étienne Mémin, and Long Li present in their work entitled "Data-Driven Stochastic Primitive Equations with Dynamic Modes Decomposition" an implementation of a stochastic version of the primitive equations into the NEMO community ocean model to assess the capability of the so-called Location Uncertainty framework in representing the small scales of the ocean flows. The work is important as planetary flows are characterized by interaction of phenomena in a huge range of scales, and it is unaffordable today to resolve numerically the complete ocean dynamics.

Finally, the STUOD Organizing Committee would like to acknowledge the financial and in-kind support received from several sources: the European Research Council (ERC) under the European Union's Horizon 2020 Research and Innovation Programme (ERC, Grant Agreement No 856408)—for providing funds to cover the travel expenses of the invited speakers, catering costs, and administrative support; Imperial College London—for offering the conference venue.

May 2023

Bertrand Chapron, Plouzané, France

Dan Crisan, London, UK

Darryl Holm, London, UK

Etienne Mémin, Rennes, France

Anna Radomska, London, UK



# **Internal Tides Energy Transfers and Interactions with the Mesoscale Circulation in Two Contrasted Areas of the North Atlantic**

**Adrien Bella, Noé Lahaye, and Gilles Tissot**

**Abstract** The energy budget of the internal tide and its life cycle is investigated with a high-resolution numerical simulation and a vertical normal mode decomposition. Two areas of interest are considered: the Azores Islands over the mid-Atlantic ridge and the Gulf Stream offshore the North of the US East coast shelf break. Low mode (1 and 2) internal tides are found to propagate from 100 km (mode 2) to 1000 km (mode 1) away from their generation sites. Waves lose a significant portion of their energy as they propagate through the Gulf Stream, in contrast to the Azores domain. In the Gulf Stream domain, the mesoscale circulation is responsible for energy transfers from low- to high-mode internal tides, while the topographic scattering is dominant in the Azores area. This transfer of energy toward high modes favours energy dissipation. The mesoscale is significant in the energy budget of modes higher than mode 1 for both domains, and for all baroclinic modes in the Gulf Stream area. The internal tide is found to extract energy from or lose energy to the mesoscale circulation, but this accounts for less than 14% of the energy scattered from low internal tide modes to higher ones once summed over all contributions of the modal energy budget.

**Keywords** Internal tide · Mesoscale flow · Energy budget · Vertical modal decomposition · High resolution numerical simulation

# **1 Introduction**

Internal tides are a category of oceanic internal inertia-gravity waves that are generated when the astronomical tide interacts with topographic features such as shelf breaks, ridges or seamounts. They are encountered in many areas of the ocean [Zaron et al. 2022, Zhao et al. 2016, among others] and are of crucial importance for the mixing of deep water and the closure of the Meridional Overturning Circulation [Munk and Wunsch 1998]. The role of mesoscale currents in the energy pathway starting from the barotropic tide input and leading to this mixing is not yet fully understood. As a different thread of motivation, internal tides exhibit the same spatial scale as (sub-)mesoscale features, which makes it challenging to estimate the geostrophic velocities and to disentangle the two types of motion from satellite altimetry [Ponte and Klein 2015]. This problem is exacerbated by the ability of the new SWOT mission to obtain a resolution of 20 km to 30 km, compared to the approximately 100 km of current altimeter products [Ballarotta et al. 2019]. A better understanding of the dynamics of the internal tide is therefore needed if we hope to disentangle it from the geostrophic field, for instance via data assimilation.

A. Bella (✉) · N. Lahaye · G. Tissot

INRIA Rennes Bretagne Atlantique and IRMAR – UMR CNRS 6625, Rennes, France e-mail: adrien.bella@inria.fr

© The Author(s) 2024

B. Chapron et al. (eds.), *Stochastic Transport in Upper Ocean Dynamics II*, Mathematics of Planet Earth 11, https://doi.org/10.1007/978-3-031-40094-0\_1

For a decade or so [e.g. Arbic et al. 2010], global and regional realistic simulations have had sufficient resolution to represent both internal tides and mesoscale features, opening the way to studying their interactions. Most studies focusing on these interactions use the vertical modal decomposition framework [Gill 1982], which provides a clear separation of the barotropic and baroclinic tides and alleviates the computational cost associated with processing 3D simulation outputs.

The importance of the mesoscale circulation and buoyancy field for the energy budget of the first baroclinic mode (denoted mode 1) has been shown by Kelly and Lermusiaux [2016] in a realistic configuration over the Gulf Stream area. They quantified the energy transfers between mode 1 and the mesoscale circulation as well as their relative importance compared to topographic effects, showing that the mesoscale explains the first-order interaction patterns visible in the mode 1 energy flux divergence. Similarly, Pan et al. [2021] have analysed the energy budget of mode 1 in a realistic setup including the mesoscale circulation and heterogeneous buoyancy field in the Gulf Stream area and the Mallau Island, leading to the same conclusion concerning the importance of mesoscale-internal tide interactions. Both studies have also considered realistic simulations with baroclinic currents, and have used a modal decomposition approach to define their energy budget. The deviation of the internal tide energy flux has been analysed by Duda et al. [2018], indicating a refracting behaviour of a Gulf Stream-like current on beams of internal tide energy flux. Dunphy and Lamb [2014] have shown in an idealised setup that the interactions of the first baroclinic internal tide mode with the barotropic eddy and the first baroclinic eddy mode lead to a modification of the energy flux propagation and a scattering of energy toward higher modes, respectively.

In the present paper, our focus will be on the energy budget yielded by a modal analysis. In particular, we will extend the work of other studies to the first 10 baroclinic modes and study the couplings between modes.

The contribution of the mesoscale flow and buoyancy field will be taken into account and the relative importance of the coupling between modes as well as their spatial pattern will be quantified in an area where topographic features are prominent, and another area with a strong mesoscale circulation.

The document is organised as follows: Sect. 2 will explain the theoretical framework with the modal decomposition and the governing dynamical equations. Section 3 will detail the dataset used and computational techniques. Section 4 will present results. Section 5 closes this article with a conclusion.

## **2 Governing Equations and Energy Budget**

To investigate the modal internal tide energy budget, we rely on linearized equations of motion projected on vertical normal modes. The derivation follows previous studies (e.g. Kelly and Lermusiaux [2016]) and is only briefly outlined here. It starts from the hydrostatic primitive equations under the Boussinesq approximation with a free surface [e.g. Vallis 2017]:

$$
\partial_t \mathbf{u}_h + \mathbf{u} \cdot \nabla \mathbf{u}_h + f \mathbf{e}_z \wedge \mathbf{u}_h = -\nabla_h p - \nabla \Pi_{\text{tide}},\tag{1a}
$$

$$
\partial_z p - b = 0,\tag{1b}
$$

$$
\partial_t b + \mathbf{u} \cdot \nabla b = 0,\tag{1c}
$$

$$
\nabla \cdot \mathbf{u} = 0,\tag{1d}
$$

where $\mathbf{u}$ is the velocity, $b$ the buoyancy, $p$ the reduced pressure (divided by $\rho_0$, the reference density), and the index $h$ denotes the horizontal component of a 3D vector. Other variable names follow standard notations. Here, the forcing and dissipation terms have been omitted on purpose, except for the tidal potential term $\Pi_{\text{tide}}$.

Next, the flow is decomposed into a low-frequency mesoscale flow and a high-frequency component that includes the internal tide: i.e., for the velocity, we split $\mathbf{u} = \mathbf{U} + \tilde{\mathbf{u}}$. We will further decompose the low-frequency part of the buoyancy into a long-time mean and a slowly varying part: $B = \bar{B} + B'$ ($\bar{\cdot}$ denotes a long-time average), and introduce the Brunt-Väisälä frequency: $N^2 = \partial_z B$ and $\bar{N}^2 = \partial_z \bar{B}$, so that $N^2 = \bar{N}^2 + N'^2$ with $N'^2 = \partial_z B'$. Subtracting the low-frequency part of equations (1) from the full system and further assuming that nonlinearities amongst the high-frequency flow are negligible, one obtains the following linear system of equations:

$$
\partial_t \tilde{\mathbf{u}} + \mathbf{U} \cdot \nabla \tilde{\mathbf{u}} + \tilde{\mathbf{u}} \cdot \nabla \mathbf{U} + f \mathbf{e}_z \wedge \tilde{\mathbf{u}} = -\nabla \tilde{p} - \nabla \Pi_{\text{tide}},\tag{2a}
$$

$$
\partial_z \tilde{p} - \tilde{b} = 0,\tag{2b}
$$

$$
\partial_t \tilde{b} + \mathbf{U} \cdot \nabla \tilde{b} + \tilde{\mathbf{u}}_h \cdot \nabla_h (\bar{B} + B') + \tilde{w} (\bar{N}^2 + N'^2) = 0,\tag{2c}
$$

$$
\nabla \cdot \tilde{\mathbf{u}} = 0.\tag{2d}
$$

Note that, since the above system of equations is linear in the high-frequency variables, and assuming that most of the mesoscale variability has a timescale longer than or equal to the cutoff period used in the complex demodulation of the internal tide (3 days), band-pass filtering these equations leaves the system unchanged. Hence, in our study, we focus on the semidiurnal internal tide (including the M2 and S2 frequencies), with a period centred on 12.2 h, relying on the above set of equations where the high-frequency variables describe this frequency band. These equations are completed by the usual linearized boundary conditions at the mean surface *z* = *η̄* and at the bottom *z* = −*H*:

$$
\tilde{p}(\overline{\eta}) = g\,\tilde{\eta},\tag{3}
$$

$$
\tilde{w}(\overline{\eta}) = \partial_t \tilde{\eta} + \mathbf{U}_h(\overline{\eta}) \cdot \nabla_h \tilde{\eta}, \tag{4}
$$

$$
\tilde{w}(-H) = -\tilde{\mathbf{u}}_h(-H) \cdot \nabla H. \tag{5}
$$

We thus have the internal tide part of the primitive equations, which is now ready to be projected onto a set of vertical modes. The latter are given locally by the standard Sturm-Liouville problem for internal waves in a rotating stratified ocean [e.g. Gill 1982, Vallis 2017], assuming a flat bottom, a horizontally homogeneous stratification profile and no background flow:

$$\left(\frac{\Phi_n^{\prime}}{\bar{N}^{2}}\right)^{\prime} + \frac{\Phi_n}{c_n^{2}} = 0,\text{ with } \Phi_n^{\prime} = 0 \text{ at } z = -H, \text{ and } g\Phi_n^{\prime} + \bar{N}^{2}\Phi_n = 0 \text{ at } z = \overline{\eta},\tag{6}$$

where prime denotes vertical derivative. The obtained set of vertical modes *Φn* is complemented by another set of vertical functions which obey:

$$
\varphi_n^{\prime} = \Phi_n,\qquad \Phi_n^{\prime} = -\frac{\bar{N}^2}{c_n^2}\, \varphi_n,\tag{7}
$$

and these modes follow the orthogonality conditions:

$$\int_{-H}^{\overline{\eta}} \Phi_m \Phi_n \, \mathrm{d}z = \int_{-H}^{\overline{\eta}} \frac{\bar{N}^2}{c_n^2}\, \varphi_m \varphi_n \, \mathrm{d}z + \frac{g}{c_n^2}\, \varphi_m(\overline{\eta})\, \varphi_n(\overline{\eta}) = H \, \delta_{mn}.\tag{8}$$
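As a rough illustration of how problem (6) can be discretised, the sketch below solves it with a finite-volume scheme on a uniform vertical grid. It is not the solver used in this study: it assumes a rigid lid (*Φ*′ = 0 at the surface, so that the free-surface term of (8) drops out and mode 0 has an infinite phase speed), and the function name `vertical_modes`, the grid and the constant-stratification test case are all hypothetical choices made for the example.

```python
import numpy as np

def vertical_modes(N2_w, dz, n_modes):
    """Rigid-lid vertical modes on a uniform grid of nz cells (illustrative only).

    N2_w : buoyancy frequency squared at the nz-1 interior interfaces (s^-2)
    dz   : uniform layer thickness (m)
    Returns Phi (nz, n_modes+1), normalised so that sum(Phi_m*Phi_n)*dz = H*delta_mn,
    and the modal speeds c, with c[0] = inf for the rigid-lid barotropic mode.
    """
    nz = N2_w.size + 1
    H = nz * dz
    k = 1.0 / (N2_w * dz**2)            # interface coefficients of -d/dz(1/N^2 d/dz)
    j = np.arange(nz - 1)
    A = np.zeros((nz, nz))              # symmetric matrix with zero-flux boundaries
    A[j, j] += k
    A[j + 1, j + 1] += k
    A[j, j + 1] -= k
    A[j + 1, j] -= k
    lam, vec = np.linalg.eigh(A)        # eigenvalues 1/c_n^2 in ascending order
    Phi = vec[:, : n_modes + 1] * np.sqrt(H / dz)
    c = np.full(n_modes + 1, np.inf)
    c[1:] = 1.0 / np.sqrt(lam[1 : n_modes + 1])
    return Phi, c

# Hypothetical test case: constant stratification, for which c_n = N0*H/(n*pi).
H, nz, N0 = 4000.0, 200, 2.0e-3
dz = H / nz
Phi, c = vertical_modes(np.full(nz - 1, N0**2), dz, n_modes=3)
c_exact = N0 * H / (np.pi * np.arange(1, 4))
```

With a constant stratification the computed speeds can be compared with the analytical values in `c_exact`, and the product `Phi.T @ Phi * dz` gives a discrete check of the normalisation in (8).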

Physical fields can then be expanded over these bases through:

$$[\mathbf{u}_h, p] = \sum_n [\mathbf{u}_n, p_n]\, \Phi_n,\tag{9}$$

$$[w, b] = \sum_{n} [w_n, \bar{N}^2 b_n]\, \varphi_n,\tag{10}$$

and the modal projection coefficients are obtained as follows:

$$\left[\mathbf{u}_n,\,p_n\right] = \left\langle [\mathbf{u}_h,\,p],\,\Phi_n\right\rangle,\tag{11}$$

$$w_n = \frac{1}{c_n^2} \left( \left\langle \varphi_n, w \bar{N}^2 \right\rangle + \frac{g}{H}\, w(\overline{\eta})\, \varphi_n(\overline{\eta}) \right),\tag{12}$$

$$b_n = \frac{1}{c_n^2} \left( \langle \varphi_n, b \rangle + \frac{g}{H} \frac{b(\overline{\eta})}{\bar{N}^2(\overline{\eta})}\, \varphi_n(\overline{\eta}) \right), \tag{13}$$

with the inner product defined as

$$\langle f, g \rangle = \frac{1}{H} \int_{-H}^{\overline{\eta}} f(z)\, g(z)\, \mathrm{d}z.$$

Note that the two bases have a parametric dependency on the horizontal coordinates because of the varying topography *H*(*x*, *y*) and stratification profile *N̄*<sup>2</sup>(*z*; *x*, *y*). In this paper, we consider modes 0 to 10, mode 0 being the barotropic tide and all others the baroclinic internal tide.
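The projection (11) can be illustrated on a synthetic profile. The sketch below is only an example under simplifying assumptions: it uses the analytical rigid-lid modes of a constant stratification (cosines) rather than the modes of the simulation, a midpoint rule for the depth integral, and hypothetical helper names `inner` and `project`.

```python
import numpy as np

def inner(f, g, dz, H):
    """Discrete <f, g> = (1/H) * integral of f*g over depth (midpoint rule)."""
    return np.sum(f * g, axis=-1) * dz / H

def project(field, Phi, dz, H):
    """Modal amplitudes of a vertical profile on the rows of Phi, as in Eq. (11)."""
    return inner(Phi, field[None, :], dz, H)

# Hypothetical grid and basis: Phi_0 = 1, Phi_n = sqrt(2) cos(n pi (z + H) / H).
H, nz = 4000.0, 200
dz = H / nz
z = -H + (np.arange(nz) + 0.5) * dz
n = np.arange(4)[:, None]
Phi = np.where(n == 0, 1.0, np.sqrt(2.0)) * np.cos(n * np.pi * (z + H) / H)

# Synthetic velocity profile built from known amplitudes, then projected back.
truth = np.array([0.10, 0.30, 0.00, -0.05])
u = truth @ Phi
coeffs = project(u, Phi, dz, H)
```

Because the basis is orthonormal for this inner product, `coeffs` reproduces the amplitudes stored in `truth`.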

The dynamical equations are then obtained by projecting the tidal momentum and continuity equations onto a mode *Φ<sub>m</sub>*, and the buoyancy equation onto the mode *ϕ<sub>m</sub>*, using the relations given above together with integration by parts and the Leibniz formula, and by substituting the pressure anomaly for the buoyancy modal amplitude (after projection of the hydrostatic balance onto *ϕ<sub>m</sub>*) [Kelly and Lermusiaux 2016, Kelly 2016, Pan et al. 2021]. Omitting the tilde notation from now on, one obtains, after a tedious but straightforward derivation:

$$\begin{split} \partial_t \mathbf{u}_m + \nabla_h p_m + f \vec{e}_z \wedge \mathbf{u}_m ={}& -\sum_n U_{mn} \cdot \nabla \mathbf{u}_n - \sum_n \mathbf{u}_n U_{mn}^{\Phi} \\ & - \sum_n \mathbf{u}_n \cdot U_{mn}^{\nabla} - \sum_n w_n U_{mn}^{z} \\ & - \sum_n p_n T_{mn} - \frac{1}{H} \nabla \Pi_{\text{tide}} \left(\varphi_m(\overline{\eta}) - \varphi_m(-H)\right), \end{split} \tag{14a}$$

$$\begin{split} \partial_t p_m - c_m^2 w_m ={}& -\frac{g}{H}\, \varphi_m(\overline{\eta})\, \mathbf{U}_h(\overline{\eta}) \cdot \nabla\eta - \sum_n U_{mn}^{p} \cdot \nabla p_n \\ & - \sum_n p_n \left\langle \varphi_m, \mathbf{U}_h \cdot \nabla_h \left(\frac{\bar{N}^2}{c_n^2}\, \varphi_n\right) \right\rangle \\ & + \sum_n \mathbf{u}_n \cdot (B_{mn} - \bar{B}_{mn}) + \sum_n w_n \left\langle \varphi_m, \varphi_n N^2 \right\rangle, \end{split} \tag{14b}$$

$$\nabla_h \cdot (H \mathbf{u}_m) + H\, w_m = H \sum_n \mathbf{u}_n \cdot T_{mn},\tag{14c}$$

with:

$$\begin{aligned} T_{mn} &= \left\langle \Phi_m, \nabla_h \Phi_n \right\rangle, & U_{mn} &= \left\langle \Phi_m, \mathbf{U}_h \Phi_n \right\rangle, & U_{mn}^{\Phi} &= \left\langle \Phi_m, \mathbf{U}_h \cdot \nabla_h \Phi_n \right\rangle, \\ U_{mn}^{\nabla} &= \left\langle \Phi_m, \Phi_n \nabla \mathbf{U}_h \right\rangle, & U_{mn}^{z} &= \left\langle \Phi_m, \varphi_n \partial_z \mathbf{U}_h \right\rangle, & U_{mn}^{p} &= \left\langle \varphi_m, \mathbf{U}_h \frac{\bar{N}^2}{c_n^2}\, \varphi_n \right\rangle, \\ B_{mn} &= \left\langle \varphi_m, \Phi_n \nabla B \right\rangle, & \bar{B}_{mn} &= \left\langle \varphi_m, \Phi_n \nabla_h \bar{B} \right\rangle. \end{aligned}$$

We then take the scalar product of the projected momentum equation (14a) with ***u****<sub>m</sub>*, multiply the pressure equation (14b) by *p<sub>m</sub>*/*c<sub>m</sub>*<sup>2</sup> and substitute *w<sub>m</sub>* using the continuity equation (14c) to obtain the energy budget of the internal tide mode *m* in the presence of a slowly varying mesoscale circulation and buoyancy field:

$$
\begin{aligned}
\partial_t\left(\frac{\mathbf{u}_m^{2}}{2}+\frac{p_m^{2}}{2c_m^{2}}\right) + \underbrace{\frac{1}{H}\nabla_h\cdot\left(H p_m \mathbf{u}_m\right)}_{\text{divergence of energy flux}}
={}& -\underbrace{\sum_n\left[\mathbf{u}_m\cdot\left(U_{mn}\cdot\nabla\mathbf{u}_n\right) + \mathbf{u}_m\cdot\mathbf{u}_n\,U_{mn}^{\Phi} + \frac{p_m}{c_m^{2}}\,U_{mn}^{p}\cdot\nabla p_n\right]}_{\text{advection by the background flow}} \\
& -\underbrace{\sum_n \mathbf{u}_m\cdot\left(\mathbf{u}_n\cdot U_{mn}^{\nabla}\right)}_{\text{horizontal shear}} - \underbrace{\sum_n w_n\,\mathbf{u}_m\cdot U_{mn}^{z}}_{\text{vertical shear}} \\
& + \underbrace{\frac{p_m}{c_m^{2}}\sum_n \mathbf{u}_n\cdot\left(B_{mn}-\bar{B}_{mn}\right)}_{\text{horizontal buoyancy gradient}} + \underbrace{\frac{p_m}{c_m^{2}}\sum_n w_n\left\langle\varphi_m,\varphi_n N^{2}\right\rangle}_{\text{slowly variable stratification}} \\
& + \underbrace{\sum_n C_{mn}}_{\text{topographic scattering}} + R_m,
\end{aligned}
\tag{15}
$$

with

$$
\begin{split}
R_m ={}& -\underbrace{\frac{p_m}{H c_m^{2}}\, g\, \varphi_m(\overline{\eta})\, \mathbf{U}_h(\overline{\eta}) \cdot \nabla \eta}_{\text{advection of ssh by the background flow}} \\
& -\underbrace{\frac{p_m}{c_m^{2}} \sum_n p_n \left\langle \varphi_m, \mathbf{U}_h \cdot \nabla_h \left( \frac{\bar{N}^{2}}{c_n^{2}}\, \varphi_n \right) \right\rangle}_{\text{advection of mean stratification by the background flow}} \\
& -\underbrace{\frac{1}{H}\,\mathbf{u}_m \cdot \nabla \Pi_{\text{tide}} \left( \varphi_m(\overline{\eta}) - \varphi_m(-H) \right)}_{\text{tidal potential work}}.
\end{split}
\tag{16}
$$

Anticipating the results shown in Fig. 3 (Sect. 4.2), the term *R<sub>m</sub>* gathers all contributions that are not of primary importance and will not be discussed further in this paper.
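One step of the derivation of the energy budget can be made explicit. As a minimal sketch, assuming that *H* varies only horizontally and leaving aside the *T<sub>mn</sub>* coupling terms: the pressure-gradient term of (14a) dotted with the modal velocity, and the divergence term that appears once *w<sub>m</sub>* is eliminated from (14b) through (14c), combine through the product rule into a single flux divergence,

$$
\mathbf{u}_m \cdot \nabla_h p_m + \frac{p_m}{H}\,\nabla_h \cdot (H \mathbf{u}_m) = \frac{1}{H}\,\nabla_h \cdot (H\, p_m \mathbf{u}_m),
$$

which suggests reading *H p<sub>m</sub>* ***u****<sub>m</sub>* as a depth-integrated modal energy flux under these assumptions.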

This equation describes the different energy exchanges (coupling) between modes as well as energy sources and sinks. Most coupling terms can be represented with coupling matrices, denoted *K<sub>mn</sub>* here, which are shown in Sect. 4.2 (Fig. 2). Generally, such a coupling matrix contains an anti-symmetric part, describing net exchanges of energy between modes, and a symmetric part (diagonal included), describing a gain or loss of energy for the internal tide.
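The split of a coupling matrix into these two parts can be sketched in a few lines. The matrix below is random and purely illustrative, and the name `split_coupling` is hypothetical; the point is only that the anti-symmetric part sums to zero over all mode pairs (it redistributes energy), while the symmetric part carries the net gain or loss.

```python
import numpy as np

def split_coupling(K):
    """Return the symmetric and anti-symmetric parts of a square coupling matrix."""
    sym = 0.5 * (K + K.T)
    anti = 0.5 * (K - K.T)
    return sym, anti

# Illustrative 5x5 matrix (e.g. modes 0 to 3 plus one group of higher modes).
rng = np.random.default_rng(0)
K = rng.normal(size=(5, 5))
sym, anti = split_coupling(K)

net_exchange = anti.sum()   # zero up to round-off: pure redistribution between modes
net_source = sym.sum()      # equals K.sum(): net gain or loss for the internal tide
```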

The first three terms on the right-hand side (RHS) arise from the presence of a mesoscale flow, while the next two include contributions from the associated buoyancy perturbations as well as seasonal variations. The term labelled as slowly variable stratification arises in the derivation of the CSW buoyancy equation (14b) because we construct the modal bases from a stationary stratification profile in problem (6). If the vertical modal bases were instead defined with the instantaneous stratification profile, this term would vanish. Indeed, for the linear propagation of an internal wave without topography, mesoscale flow or dissipation, the vertical modes diagonalise the operator and thus decouple the modal components from each other. When the stratification profile is time-averaged, the associated vertical mode basis no longer achieves this decoupling, and the slowly variable stratification term accounts for the resulting coupling between modes in the energy equation. This term can therefore be interpreted as an exchange of energy between modes during such an idealised linear propagation, and we associate it with the inadequacy of the basis (built from the time-averaged stratification profile) to represent the internal wave dynamics. We do not expect, however, that using a basis defined with a slowly time-varying stratification would significantly change the magnitude of the dominant interaction terms discussed in this paper (see e.g. Sect. 4.2), in particular the mesoscale and topographic contributions.

Finally, *Cmn* = −*Cnm* is the topographic scattering, which includes the generation of baroclinic tide from the barotropic tide (*Cm*0). Note that this term has no symmetric part—it can only redistribute energy amongst the vertical modes.

Once averaged in time over a sufficiently long period (typically, one month), the time-variation of the energy vanishes and the above modal energy equation reduces to the following, simplified form:

$$\nabla_h \cdot \mathbf{F}_m = -\sum_n K_{mn} + R_m,\tag{17}$$

where **F**<sub>*m*</sub> denotes the modal energy flux, *K<sub>mn</sub>* gathers the energy transfers, sources and sinks that are not negligible, and *R<sub>m</sub>* gathers all the neglected terms. The latter include a contribution from the free surface and one from the tidal potential, both of which would vanish under a rigid-lid assumption for the baroclinic modes, in agreement with their negligible magnitude in our diagnostics (cf. Sect. 4). The remaining term in *R<sub>m</sub>* involves horizontal variations of the modes and of the associated background stratification, and is small since a yearly average is used.

## **3 Data and Method**

## *3.1 eNATL60 Simulation*

The high-resolution realistic simulation of the North Atlantic Ocean eNATL60 [Brodeau et al. 2020] is used to diagnose the energy budget of the internal tide following Eq. (15). It uses the Nucleus for European Modelling of the Ocean (NEMO) model, which solves the primitive equations under the Boussinesq and hydrostatic approximations on an Arakawa C grid with z-coordinates and partial steps [Madec and the NEMO team 2008]. The eNATL60 run includes astronomical tidal forcing at the M2, S2, N2, O1 and K1 frequencies. In addition, surface forcing from the 3-hourly ERA-Interim (ECMWF) reanalysis is used, which enables the simulation to develop a realistic mesoscale field. The horizontal resolution of the simulation is 1/60° (about 1.5 km at mid-latitudes), and the vertical grid has 300 levels whose thickness ranges from less than 1 m at the top of the ocean to 100 m at the bottom. We processed the hourly outputs of the horizontal velocity *u*, sea level, temperature and salinity, from which we computed the pressure and stratification.

We use the month of October 2009 over two subdomains representative of different dynamical regimes of the ocean. The first one is centred over the Azores Islands and the northern Mid-Atlantic Ridge. The second one is located offshore of the northern part of the US East Coast and includes a portion of the Gulf Stream. The Azores domain is characterised by the predominance of topographic features such as the Mid-Atlantic Ridge, which exhibits low-amplitude topographic variations over scales of 100 km, and a group of seamounts with steep slopes and a typical scale of 10 km. Compared to the Gulf Stream area, the mesoscale flow is weak. The Gulf Stream domain is marked by strong mesoscale activity and a large continental slope leading to a flat abyssal plain with a few surrounding seamounts (see Fig. 1, discussed below).

## *3.2 Filtering and Computing Methods*

**Fig. 1** One-month average of the horizontal energy flux divergence for mode 1 (left column) and mode 3 (right column), for the Azores domain (top) and the Gulf Stream domain (bottom). A positive value indicates a gain of energy for the mode. Black arrows show the square root of the corresponding modal horizontal energy flux. A spatial Gaussian filter with a kernel of half a modal wavelength has been applied to these fields in order to remove the local alternation of positive and negative values and make the plot clearer. Green streamlines indicate, qualitatively, the mesoscale horizontal currents at 52 m depth, and isobaths at 1000, 2000, and 3000 m are superimposed in grey contours

In order to obtain the vertical normal mode bases, the mean stratification *N̄*<sup>2</sup> is computed from the time average of temperature and salinity using the nonlinear equation of state from TEOS-10. The Sturm-Liouville problem (6) is then discretised and solved on the staggered vertical grid in each horizontal cell, giving the two modal bases on the T grid, at the centre (in the horizontal) of each cell. From there, the modal amplitudes *u<sub>n</sub>*, *v<sub>n</sub>* and *p<sub>n</sub>* are obtained by projecting the corresponding fields *u* and *p*, following (11)–(13). Since the horizontal grid is staggered, the horizontal velocity is located at the edges of each cell. We thus interpolate the basis *Φ<sub>n</sub>* before projecting the velocities. Unfortunately, this step induces a loss of orthogonality for the newly interpolated basis. The coefficients determined by the inner product between the horizontal velocity fields and the interpolated basis are therefore corrected *a posteriori* by inverting the cross-correlation matrix between the interpolated basis and the local basis obtained from the Sturm-Liouville problem (6) on the *u*, *v*-grid. This allows us to obtain the projection coefficients *u<sub>n</sub>* and *v<sub>n</sub>* defined at the *u*, *v*-points. The vertical velocity is then computed using the continuity equation formulated with the discrete scheme employed in the NEMO code [Madec and the NEMO team 2008].
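The *a posteriori* correction can be pictured as follows. This is an idealised sketch, not the code used in the study: the local basis is taken to be the rigid-lid cosine modes of a constant stratification, the interpolated basis is mimicked by a small random mixing of those modes (the matrix `mix` is a hypothetical stand-in), and the velocity profile lies exactly in the span of the retained modes.

```python
import numpy as np

H, nz, n_modes = 4000.0, 200, 4
dz = H / nz
z = -H + (np.arange(nz) + 0.5) * dz
n = np.arange(n_modes)[:, None]

# Local orthonormal basis at the velocity point (assumed analytical cosine modes).
Phi_loc = np.where(n == 0, 1.0, np.sqrt(2.0)) * np.cos(n * np.pi * (z + H) / H)

# Stand-in for the interpolated basis: slightly mixed, hence no longer orthogonal.
rng = np.random.default_rng(1)
mix = np.eye(n_modes) + 0.05 * rng.normal(size=(n_modes, n_modes))
Phi_int = mix @ Phi_loc

def inner(f, g):
    """Discrete <f, g> = (1/H) * integral of f*g over depth (midpoint rule)."""
    return np.sum(f * g, axis=-1) * dz / H

truth = np.array([0.10, 0.30, -0.08, 0.02])
u = truth @ Phi_loc

raw = inner(Phi_int, u[None, :])                      # biased by the lost orthogonality
M = inner(Phi_int[:, None, :], Phi_loc[None, :, :])   # M[m, n] = <Phi_int_m, Phi_loc_n>
corrected = np.linalg.solve(M, raw)                   # inverts the cross-correlation matrix
```

In this idealised setting `corrected` recovers `truth` exactly, whereas `raw` does not; with a real interpolated basis the recovery would only be approximate.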

The next step consists in separating the mesoscale contribution from the semidiurnal internal tide. The latter is extracted by means of a complex demodulation at a central semidiurnal frequency *ω* = 1.415 × 10<sup>−4</sup> rad s<sup>−1</sup>, with a low-pass cutoff period of 3 days. The mesoscale flow is directly low-pass filtered with the same cutoff period. The slowly variable mesoscale buoyancy field is computed from the daily-averaged temperature and salinity, and the corresponding stratification is then estimated from this buoyancy field and a mean sound speed profile [using a standard formula, see e.g. Vallis 2017].
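Complex demodulation can be sketched as follows on a synthetic hourly time series. The frequency and the 3-day cutoff are those quoted above; everything else is an assumption made for the example, in particular the low-pass filter, which is taken here to be a Hann-weighted running mean spanning 3 days since the text does not specify the filter, and the function name `complex_demodulate`.

```python
import numpy as np

OMEGA = 1.415e-4        # rad/s, central semidiurnal frequency quoted in the text
DT = 3600.0             # s, hourly outputs
CUTOFF = 3 * 86400.0    # s, low-pass cutoff period quoted in the text

def complex_demodulate(x, t, omega=OMEGA, cutoff=CUTOFF, dt=DT):
    """Complex amplitude a(t) such that x is approximately Re[a(t) exp(i omega t)]."""
    m = int(round(cutoff / dt)) + 1          # window length in samples
    w = np.hanning(m)
    w /= w.sum()                             # unit gain at zero frequency
    shifted = x * np.exp(-1j * omega * t)    # move the tidal band to zero frequency
    amp = 2.0 * np.convolve(shifted, w, mode="valid")
    t_valid = t[m // 2 : m // 2 + amp.size]  # times at the window centres
    return amp, t_valid

# Synthetic signal: a semidiurnal wave plus a slow, mesoscale-like oscillation.
t = np.arange(0.0, 30 * 86400.0, DT)
a0, phi0 = 0.2, 0.7
x = a0 * np.cos(OMEGA * t + phi0) + 0.1 * np.sin(2 * np.pi * t / (20 * 86400.0))
amp, t_valid = complex_demodulate(x, t)
```

The modulus and argument of `amp` should stay close to `a0` and `phi0`, while the slow component is rejected by the low-pass step.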

Since our modal decomposition is inaccurate in areas of shallow water, a mask is applied to remove all locations shallower than 300 m in the Azores domain, and 250 m in the Gulf Stream domain. The mask is slightly shallower in the Gulf Stream domain in order to retain the topographic generation occurring at the shelf break.

Since high modes are expected to propagate over shorter distances from their generation site compared to low modes, we usually group the modes 4–10 together, and interpret them as part of dissipative processes in the energy budget. Finally, all terms of Eq. (15) display small scale patterns, particularly pronounced when the mesoscale and internal tides interact. These patterns are not of particular physical significance for the diagnostics discussed in this paper. Therefore, we apply a Gaussian spatial filter before displaying any term in (15), with a kernel width equal to 25 grid points (a little less than 40 km). This procedure is only applied for map plotting.
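The smoothing applied before plotting can be mimicked with a separable Gaussian convolution. The sketch below is illustrative only: the text gives the kernel width (25 grid points, i.e. about 25 × 1.5 km ≈ 37.5 km) but not the standard deviation, so the choice `sigma = width / 6` is an assumption, as are the function names and the edge renormalisation.

```python
import numpy as np

def gaussian_kernel(width):
    """1-D Gaussian weights over `width` points; sigma = width / 6 is an assumption."""
    x = np.arange(width) - (width - 1) / 2.0
    w = np.exp(-0.5 * (x / (width / 6.0)) ** 2)
    return w / w.sum()

def smooth2d(field, width=25):
    """Separable Gaussian smoothing of a 2-D map (both dimensions must be >= width)."""
    w = gaussian_kernel(width)

    def conv(v):
        # Renormalise near the edges so that a constant field is left unchanged.
        return np.convolve(v, w, mode="same") / np.convolve(np.ones_like(v), w, mode="same")

    out = np.apply_along_axis(conv, 0, field)
    return np.apply_along_axis(conv, 1, out)

# Grid-scale alternation of positive and negative values, removed by the filter.
i, j = np.indices((60, 80))
checkerboard = (-1.0) ** (i + j)
smoothed = smooth2d(checkerboard)
```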

## **4 Results**

The energy budgets of the internal tide in both domains are examined in this section, first by considering the geographical distribution of its generation, propagation and sinks, and then by quantifying the spatially integrated and temporally averaged energy budget.

## *4.1 Life Cycle of the Internal Tide*

Both the Azores Islands and Gulf Stream domains are sites of powerful internal tide generation, exposed to contrasting background conditions. The divergence of the energy flux averaged over one month (October 2009) is shown in Fig. 1, along with the mesoscale currents at 52 m depth and the topography. In the Azores area, the mode 1 energy flux originates from the seamounts and propagates far away (at least 1000 km), indicating a moderate energy loss during the propagation (Fig. 1, top left panel). Modes 2 (not shown) and 3 propagate over shorter distances. All dominant sources and sinks are located around the topography (visible in Fig. 1). For the third mode, we find a local loss of energy right next to the generation site at the seamount, of the same order of magnitude as the generation but smaller. This property holds for higher modes as well (not shown). In the Gulf Stream domain, in contrast, the input of energy for mode 1 is located at the shelf break, producing a strong offshore beam. This beam undergoes a strong energy loss as it propagates, as can be seen from the negative patch in the mode 1 energy flux divergence (Fig. 1, bottom left panel). A first hot spot of sinks for mode 1 lies directly next to the shelf break, and another where the beam encounters the strong north-eastward current. This beam is also refracted by the Gulf Stream, a behaviour previously reported in Duda et al. [2018]. As we will show in the next section (cf. Figs. 2 and 3), this loss of energy is mainly caused by the advection terms in Eq. (15) and is largely converted into mode 2 energy. Mode 3 (Fig. 1, bottom right panel) displays a more complex distribution of sources and sinks, with an overall gain of energy at and near the shelf break and an overall loss of energy further away from the land, where it encounters the current. The sources and sinks of energy for modes 1 and 3 are therefore much less localised around the prominent topographic features than in the Azores domain, and clearly exhibit strong interactions with the mesoscale currents.

## *4.2 Importance of the Different Contributions in the Energy Transfers*

We now focus on quantifying the causes of the multiple interactions of the internal tide with its surroundings, which result in the patterns of energy flux divergence described above (Fig. 1).

#### **4.2.1 Detailed View of Coupling Terms**

To this aim, the modal energy budget (15) is integrated over the two domains considered and averaged over the month of October, and the different contributions are displayed in matrix form in Fig. 2. The matrices are decomposed into their anti-symmetric part (upper triangle), denoting a transfer of energy from mode to mode, hereafter referred to as scattering, and their symmetric part (diagonal plus lower triangle), showing sources or sinks for the internal tide. We have only plotted the terms of (15) that are of first order for at least one mode in one domain.

Amongst these terms, the topographic contribution is at least one order of magnitude larger than all others for the barotropic mode in both domains and for the first baroclinic mode in the Azores domain. However, in the Gulf Stream area, the advection coupling becomes of similar magnitude for the first mode and exceeds the topographic scattering for modes 2 and 3. In the Azores domain, the advection coupling is non-negligible for modes 2 and 3 but never exceeds the topographic contribution. In this region, the second largest contribution for the baroclinic modes is the variable stratification coupling. In the Gulf Stream area, the mesoscale vertical shear production and the horizontal buoyancy gradient loss are also of first order. However, as already found by Kelly and Lermusiaux [2016], these two terms seem to compensate each other, leaving a near-zero energy gain for mode 1. Approximately the same occurs for higher modes (cf. Fig. 2): the lower triangles of the buoyancy and shear matrices cancel out.

These dominant processes are dominated by their anti-symmetric part. The topographic contribution is purely anti-symmetric by construction, while, for the advection coupling, the sum of the symmetric part for a given mode is always smaller than 10% of the sum of the anti-symmetric part whenever this term is significant compared with the topographic scattering. This ratio never exceeds 20% for the variable stratification coupling in the Azores domain. In the Gulf Stream domain, however, it is close to one for the first baroclinic mode, indicating that the symmetric and anti-symmetric parts have comparable magnitudes. The background flow shear and buoyancy gradient couplings have a dominant symmetric part, but these two terms cancel each other. Last, if we sum all the contributions to the modal energy budget (15), the symmetric transfer of energy is found to amount to at most 14% for modes 1 and 3 in both domains and for mode 2 in the Azores domain. Therefore, the interaction between the mesoscale circulation and the internal tide mainly contributes to a scattering between modes, similar to the one produced by the topography, without net exchanges between the internal tide and the mesoscale circulation and buoyancy field.

**Fig. 2** Matrices *K<sub>nm</sub>* summarising the mean energy transfer, integrated over the whole subdomain, between a mode *n* (row) and a mode *m* (column). The upper-right triangle is the anti-symmetric part, and the lower-left triangle (diagonal included) the symmetric part. The sign convention follows Eq. (15): a positive value indicates a gain of energy for the receiving mode *m* from mode *n*, and conversely. Modes 4 to 10 are grouped together. From top to bottom: the coupling caused by the topography, the advection of the internal tide by the background flow, the terms involving the horizontal and vertical shear of the background flow, and the term involving the gradient of the buoyancy field (both variable and annual mean). Note the different range of values in the colorbar for the topographic coupling terms compared to the other ones

As a last comment, Fig. 2 shows that topographic scattering is the only process that significantly transfers energy between non-neighbouring modes. For the other dominant processes, the energy transfer toward non-neighbouring modes is one order of magnitude lower than the one toward a neighbouring mode, except for the variable stratification coupling in the Gulf Stream area. For the mesoscale contributions, this implies that only the third mode is able to transfer energy toward what we consider as part of dissipative processes.

#### **4.2.2 Modal Energy Budget**

We now turn to a more aggregated view of (15) with Fig. 3, which shows the modal energy budget of modes 1, 2 and 3 for both domains. Here we have separated the energy transfers according to the following decomposition:

$$
\begin{aligned}
&\text{Topographic scattering:} & C_{nm} &= p_m \mathbf{u}_n \cdot T_{nm} - p_n \mathbf{u}_m \cdot T_{mn}, \\
&\text{Advection of the internal tide:} & A_{nm} &= \mathbf{u}_m \cdot \left(U_{mn} \cdot \nabla \mathbf{u}_n\right) - \mathbf{u}_m \cdot \mathbf{u}_n\, U_{mn}^{\Phi} + \frac{p_m}{c_m^2}\, U_{mn}^{p} \cdot \nabla p_n, \\
&\text{Shear of background flow:} & \nabla U_{nm} &= \mathbf{u}_n \cdot U_{mn}^{\nabla} \cdot \mathbf{u}_m + \mathbf{u}_m \cdot U_{mn}^{z}\, w_n, \\
&\text{Horizontal gradient of buoyancy field:} & \nabla B_{nm} &= \frac{p_m}{H c_m^2}\, \mathbf{u}_n \cdot \left(\bar{B}_{mn} + B_{mn}\right), \\
&\text{Surface contribution and advection of the mean stratification profile:} & S_{nm} &= -\frac{p_m}{H c_m^2}\, g\, \varphi_m(\overline{\eta})\, \mathbf{U}_h(\overline{\eta}) \cdot \nabla \tilde{\eta} - \frac{p_m}{c_m^2} \sum_n p_n \left\langle \varphi_m, \mathbf{U}_h \cdot \nabla_h \left(\frac{\bar{N}^2}{c_n^2}\, \varphi_n\right) \right\rangle, \\
&\text{Variable stratification:} & N_{nm} &= \frac{p_m}{c_m^2}\, w_n \left\langle \varphi_m, N^2 \varphi_n \right\rangle.
\end{aligned}
$$

We have also separated the symmetric part (\*) from the anti-symmetric one, and have further divided the latter into an energy exchange with lower modes and an energy exchange with higher modes (\*\*). The flux coming from lower modes is again divided into an energy transfer coming from the barotropic mode (\*\*\*\*) and transfers coming from the baroclinic modes (\*\*\*). A positive value means a transfer of energy toward the mode of interest.

If *i* denotes the index of the mode of interest and *A<sub>nm</sub>* the energy transfer matrix already separated into symmetric and anti-symmetric parts as in Fig. 2, then we have:

$$\begin{array}{l} (\ast\ast\ast\ast) = A_{0i}, \\ (\ast\ast\ast) = \sum_{n=1}^{i-1} A_{ni}, \\ (\ast\ast) = -\sum_{n=i+1}^{4} A_{in}, \\ (\ast) = \sum_{n=0}^{i} A_{in} - \sum_{n=i+1}^{4} A_{ni}. \end{array}$$
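The anti-symmetric part of this bookkeeping can be sketched as follows, assuming a packed 5 × 5 matrix laid out as in Fig. 2 (rows for the giving mode *n*, columns for the receiving mode *m*, anti-symmetric part in the upper triangle, index 4 standing for the grouped modes 4 to 10). The function name and the numerical values are hypothetical, and the symmetric part is deliberately left out of the example.

```python
import numpy as np

def antisymmetric_gains(A, i):
    """Anti-symmetric gains of mode i (1 <= i <= 3) read from the upper triangle of A.

    Returns the transfers from the barotropic mode, from the lower baroclinic
    modes, and the exchange with the higher modes (positive = gain for mode i).
    """
    if not 1 <= i <= 3:
        raise ValueError("this sketch only covers the baroclinic modes 1 to 3")
    from_barotropic = A[0, i]
    from_lower_baroclinic = A[1:i, i].sum()
    from_higher = -A[i, i + 1:].sum()
    return from_barotropic, from_lower_baroclinic, from_higher

# Arbitrary illustrative values; only the entries above the diagonal are used here.
A = np.arange(25, dtype=float).reshape(5, 5)
gains_mode2 = antisymmetric_gains(A, 2)
gains_mode1 = antisymmetric_gains(A, 1)
```

For mode 2, the three returned values are read from `A[0, 2]`, from `A[1, 2]`, and from minus the sum of `A[2, 3]` and `A[2, 4]`, respectively.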

Comparing the integrated values of the energy flux divergence, the Gulf Stream domain is overall less energetic than the Azores domain: the mode 1 divergence is around 4 times larger in the Azores domain than in the Gulf Stream domain, and modes 2 and 3 are also more energetic in the Azores area. In the results displayed in Fig. 3, mode 1 in the Gulf Stream domain is the only one with an overall negative energy flux divergence.

**Fig. 3** Mean modal budget for modes 1 (top), 2 (middle row) and 3 (bottom) for the Azores (left) and Gulf Stream (right) domains, as described by Eq. (15). The divergence of the modal flux is plotted in red, while the other contributions are separated into the symmetric part (green) and the anti-symmetric part, where interactions with lower modes and with higher modes (orange) are separated. The anti-symmetric part coming from lower modes is further separated into fluxes coming from the barotropic mode (blue) and fluxes coming from the baroclinic modes (fuchsia). A positive value means a gain of energy for the mode. Processes taken into account are Topo: *C<sub>nm</sub>*, Adv: *A<sub>nm</sub>*, shear: ∇*U<sub>nm</sub>*, ssh + Ns: *S<sub>nm</sub>*, Buo: ∇*B<sub>nm</sub>*, Strat: *N<sub>nm</sub>*

The barotropic mode matters in the energy budget of modes 1 to 3 only through the topographic scattering, and is dominant in the energy budget of mode 1 for both areas. It becomes less important for modes 2 and 3: its contribution to the topographic scattering toward mode 3 is smaller than that of modes 1 and 2, but still significant.

Among the first-order energy transfers shown in the panels of Fig. 3, none transfers energy from small scales to large scales. Mesoscale and topographic couplings therefore create a forward energy cascade.

Last, Fig. 3 makes it possible to identify the terms of the modal energy budget (15) that are of negligible importance. The advection of ssh and of the stationary stratification are negligible, which justifies gathering these terms in Eq. (15) into a small residual term. Similarly, the astronomical tidal potential (not included in Fig. 3) only projects weakly onto the baroclinic modes. The mean temporal variations of potential and kinetic energy are also small: one month is enough to average out the two-week periodic variation introduced by the spring-neap cycle.

## **5 Conclusion**

The semidiurnal internal tide energy budget was characterised based on a vertical mode decomposition, allowing a detailed investigation of the energy transfers amongst the different vertical scales (and associated horizontal scales) of the internal tide field. We showed that the Azores area, a region of weak mesoscale activity and strong topographic features, is essentially dominated by the topographically induced energy transfer from low to higher modes. While this effect is dominant for low modes, variable stratification and, to a lesser extent, advection of the internal tide by the mesoscale flow become of similar importance for higher modes (mode 2 and above). Some questions remain concerning the precise role of the variable stratification, in relation to the definition of the modal basis and to how this definition would change the energy exchanges between modes. In the Gulf Stream area, the importance of the advection of the internal tide by the background flow is comparable to that of the topographic scattering, and even greater for modes 2 and higher. The variable stratification term is less important. However, the net impact of the mesoscale circulation on the energetics of the internal tide is mainly limited to a scattering of energy from low-order to high-order modes. In comparison, exchanges of energy between the internal tide and the mesoscale fields are negligible: the energy exchanged by modes 1 to 3 with the background is always less than 14% of the total energy transfer for the respective mode.

The type of modal analysis conducted in the present paper on a high resolution primitive equation simulation is able to give some insights on the energetic impact of mesoscale circulation and its associated buoyancy field on the internal tide and the transfer of energy from the astronomical tide toward dissipation. This work could be extended by investigating the deep water mixing caused by these interactions. We have in particular overlooked for now the internal tide dissipation, but it is worth investigating in the near future.

Finally, the temporal variability of the interactions between the mesoscale circulation and the internal tide will also be studied in the near future.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Sparse-Stochastic Model Reduction for 2D Euler Equations**

**Paolo Cifani, Sagy Ephrati, and Milo Viviani**

P. Cifani · S. Ephrati

University of Twente, Enschede, The Netherlands e-mail: p.cifani@utwente.nl; s.r.ephrati@utwente.nl

M. Viviani (✉) Scuola Normale Superiore of Pisa, Pisa, Italy e-mail: milo.viviani@sns.it

**Supplementary Information** The online version contains supplementary material available at https://github.com/cifanip/GLIFS.

**Abstract** The 2D Euler equations are a simple but rich set of non-linear PDEs that describe the evolution of an ideal inviscid fluid, for which one dimension is negligible. Solving these equations numerically can be extremely demanding. Several techniques to obtain fast and accurate simulations have been developed during the last decades. In this paper, we present a novel approach that combines recent developments in stochastic model reduction and conservative semi-discretization of the Euler equations. In particular, starting from the Zeitlin model on the 2-sphere, we derive reduced dynamics for large scales and we close the equations either deterministically or with a suitable stochastic term. Numerical experiments show that, after an initial turbulent regime, the influence of small scales on large scales is negligible, even though a non-zero transfer of energy among different modes is present.

# **1 Introduction**

The 2D Euler equations are a fundamental model for ideal fluids [10]. During the last two centuries, these equations have stimulated intense activity in both mathematics and physics (see for example the seminal works of Helmholtz and Arnol'd [14, 2]). In computational science and numerical analysis, retaining at a discrete level the rich non-trivial structure of these equations is still a challenging problem [1, 19]. One main computational issue is the "curse of dimensionality". Indeed, turbulent phenomena vary over different spatial and temporal scales, and the distribution of energy over a vast range of scales of motion makes it computationally infeasible to fully resolve the flow in a numerical simulation. A well-established technique to mitigate large computational costs is large-eddy simulation (LES), where a spatial filter is applied to the governing equations, after which only the large scales of motion are resolved [13, 21]. The filter may be defined explicitly, as a smoothing function where the filter width determines the level of detail left in the filtered solution and is chosen by the user, or implicitly, by coarsening the discretization operator. Either approach requires a closure model to represent the effect of the filtered scales on the resolved scales.

In this paper, we apply an explicit spectral cut-off filter to the 2D Euler equations, where the cut-off frequency is based on observed intrinsic scale separation. This filter leads to a discrete problem formulation in which matrix sparsity can be exploited to reduce computational costs. At the same time, this formulation still allows for an explicit representation of the effects of small scales on large scales. The equations are closed by a deterministic or stochastic model term, based on high-resolution measurements. Here, we choose to model small-scale flow features using a stochastic term mimicking high-resolution numerical data. Subsequently, we analyze the model performance by means of energy fluxes between the scales of motion. The proposed stochastic closures and the assessment of energy fluxes in the high-resolution data can serve as a point of departure for further development of stochastic closure models.

A peculiar aspect of 2D ideal fluids is the presence of infinitely many conservation laws. In particular, as first described by Kraichnan [18], the conservation of energy and enstrophy (the *L*<sup>2</sup> norm of the curl of the velocity field) implies a double cascade phenomenon: the energy tends to move from small scales to large scales, whereas the enstrophy tends to move in the opposite direction. Hence, in terms of the curl of velocity, or vorticity, it is possible to clearly separate two regimes: one slowly evolving at large scales and one fast at small scales. This was shown to hold numerically in [20] for the Euler–Zeitlin model on the sphere. Theoretically, the study of non-deterministic fluid models for different regimes has gained interest in the SPDE community [12]. The equations studied in [12], and the results proved therein, show a precise connection between different space-time regimes and a reduced model for large scales. Indeed, it is shown that a suitable model for large scales is given by the so-called *Stochastic Advection by Lie Transport* (SALT) equations [15], in which a transport noise term models the infinitesimal action of the small scales on the large ones. Several numerical tests have shown the usefulness of the SALT equations as a powerful tool for model reduction [6, 9].

However, defining precisely what large and small scales are is still an open problem. In this paper, we present a criterion for defining large scales in terms of a truncation of the Fourier expansion. We point out that other choices and interpretations of large and small scales are possible (see for example [20]). Let us first introduce the governing equations for the vorticity field *ω*, defined on the 2-sphere S<sup>2</sup> embedded in R<sup>3</sup>:

$$\begin{aligned} \dot{\omega} &= \{ \psi, \omega \} \\ \Delta \psi &= \omega. \end{aligned} \tag{1}$$

The Poisson bracket is defined as

$$\{\psi, \omega\} := \nabla \psi \cdot \nabla^{\perp} \omega$$

and the Laplacian is the Laplace–Beltrami operator on S<sup>2</sup>. As mentioned above, equations (1) have infinitely many first integrals: energy $H(\omega) = \frac{1}{2}\int_{\mathbb{S}^2} \psi\omega$, Casimirs $C_n(\omega) = \int_{\mathbb{S}^2} \omega^n$, for $n \ge 1$, and angular momentum. Understanding the role played by these invariants is still an open problem, especially for the long-time evolution of the fluid [7].

In order to gain numerical insight on this question, V. Zeitlin proposed a spatial discretization of (1), which retains many of the first integrals above [22, 23]. The Euler–Zeitlin equations are defined as follows:

$$\begin{aligned} \dot{W} &= [P, W] \\ \Delta_N P &= W. \end{aligned} \tag{2}$$

Here *W* is an *N* × *N* skew-Hermitian matrix with zero trace, that is, an element of the Lie algebra su(*N*). The bracket [*P*, *W*] is the usual matrix commutator and the discrete Laplacian $\Delta_N$ is defined such that its spectrum is a truncation of the spectrum of $\Delta$ [17]. As mentioned above, the Euler–Zeitlin equations possess the following integrals of motion: energy $H(W) = \frac{1}{2}\mathrm{Tr}(PW)$, Casimirs $C_n(W) = \mathrm{Tr}(W^n)$, for $n = 2, \dots, N$, and angular momentum. The core of the Zeitlin model is how the original vorticity *ω* and the discrete one *W* are linked. Indeed, the representation theory of *SU*(2) provides a deep connection between the discrete Laplacian $\Delta_N$ and a particular basis $\{T_{lm}\}$ of su(*N*), for $l = 1, \dots, N-1$ and $m = -l, \dots, l$ [17, 4]:

$$\Delta_N T_{lm} = -l(l+1)\, T_{lm}.$$


The classical way to determine large and small scales is to choose a wave number $\overline{l}$ as a threshold for the large scales (see for example [3, 6]). In this work, we propose the following criterion to set the threshold $\overline{l}$. Consider a time scale on which the fluid's energy spectrum profile has reached a stationary state. Then, typically (that is, out of equilibrium), the spectrum exhibits a double slope, which determines a kink at a certain wave number $\overline{l}$. We then define the large scales $\overline{W}$ as the filtered vorticity with modes up to $\overline{l}$, obtaining a banded matrix. We propose three possible ways, both deterministic and stochastic, of closing the equations for $\overline{W}$, by choosing different interactions with the small scales. Finally, we provide numerical tests to assess the different models introduced.

## **2 Sparse-Stochastic Model Reduction**

The Euler–Zeitlin equations (2) allow one to study some typical features of 2D fluids in the language of matrices. In this section, we propose a way to reduce the complexity of Eq. (2), by defining from *W* a sparse matrix $\overline{W}$ which retains the relevant large-scale information. Then, we show different ways of closing the equations for $\overline{W}$, by adding a suitable stochastic term.

In the Zeitlin model, the basis element $T_{lm}$ of su(*N*) has non-zero entries only on the lower and upper ±*m* diagonals. If we look at the anti-diagonals, instead, we are looking at the components determining the value of the vorticity field within a certain latitude band on S<sup>2</sup>, as shown in Fig. 1.

The large scales are typically chosen to be the modes such that *l* is smaller than a threshold level $\overline{l}$. In the Euler–Zeitlin model, this corresponds to considering the banded matrices restricted to the diagonals $|m| \le \overline{l}$ and then removing the components corresponding to $l > \overline{l}$. The Poisson equation which defines the stream matrix *P* preserves this sparsity structure, since the basis elements $T_{lm}$, the eigenvectors of the Laplacian, are themselves sparse. However, the Lie bracket does not restrict to this space. Hence, at each time step we have to project the vector field back onto this space.
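The need for a projection after each bracket evaluation can be illustrated with a small sketch. It only implements the first half of the filter, namely the restriction to the diagonals within the band; removing the components with *l* above the threshold inside the band would additionally require the basis $T_{lm}$, which is not constructed here. The function names, the toy integer matrices and the use of plain Python lists are assumptions made for illustration and are not taken from the authors' code.

```python
def commutator(A, B):
    """Matrix commutator [A, B] = AB - BA for square matrices stored as nested lists."""
    n = len(A)

    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def band_restrict(A, l_bar):
    """Zero every entry outside the diagonals |i - j| <= l_bar."""
    n = len(A)
    return [[A[i][j] if abs(i - j) <= l_bar else 0 for j in range(n)]
            for i in range(n)]


def bandwidth(A):
    """Largest |i - j| carrying a non-zero entry (0 for a diagonal or zero matrix)."""
    n = len(A)
    return max((abs(i - j) for i in range(n) for j in range(n) if A[i][j] != 0),
               default=0)


# Hypothetical toy data: two 6 x 6 integer matrices restricted to |i - j| <= 1.
n, l_bar = 6, 1
A = band_restrict([[i + 2 * j + 1 for j in range(n)] for i in range(n)], l_bar)
B = band_restrict([[(i * j) % 5 + 1 for j in range(n)] for i in range(n)], l_bar)

C = commutator(A, B)
# The bracket of two banded matrices generally has twice the bandwidth ...
print(bandwidth(A), bandwidth(B), bandwidth(C))
# ... so it has to be restricted again before it can drive the banded dynamics.
print(bandwidth(band_restrict(C, l_bar)))
```

In this toy case the bracket of two tridiagonal matrices is pentadiagonal, which is the matrix-level reason why the bracket does not close on the banded space.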

Usually, we have no way of guessing the contribution of the small scales to the evolution of the large ones. However, we expect that, after an initial turbulent transition, the fluid exhibits two clearly separated spatial scales. Evidence for such a scenario comes from several numerical simulations of the Euler–Zeitlin equations [3, 20]. Eventually, the energy profile reaches a fixed configuration with two slopes. The first part of the spectrum represents the distribution of energy at large scales, whereas the second part represents the distribution of energy at small scales. Typically, the separation between large and small scales occurs at a wave number $\overline{l} \approx \sqrt{N}$.

$$W = \begin{pmatrix} \text{NORTH},\ m = 0 & \dots & \dots & \dots & |m| = N - 1 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \downarrow & \ddots & \searrow & \ddots & \downarrow \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \vdots & \swarrow & \ddots & \ddots & \vdots \\ |m| = N - 1 & \dots & \dots & \dots & \text{SOUTH},\ m = 0 \end{pmatrix}$$

**Fig. 1** Structure of the discrete vorticity *W* in the Zeitlin model

$$
W = \begin{pmatrix} \omega_{11} & \dots & \omega_{1N} \\ \vdots & \ddots & \vdots \\ \omega_{N1} & \dots & \omega_{NN} \end{pmatrix}
\;\longmapsto\;
\overline{W} := \begin{pmatrix}
\overline{\omega}_{11} & \dots & \overline{\omega}_{1\overline{l}} & 0 & \dots & 0 \\
\vdots & \ddots & & \ddots & \ddots & \vdots \\
\overline{\omega}_{\overline{l}1} & & \ddots & & \ddots & 0 \\
0 & \ddots & & \ddots & & \vdots \\
\vdots & \ddots & \ddots & & \ddots & \vdots \\
0 & \dots & 0 & \dots & \dots & \overline{\omega}_{NN}
\end{pmatrix},
\qquad
\widetilde{W} := \sum_{l=\overline{l}+1}^{N-1} \sum_{m=-l}^{l} \beta^{lm} T_{lm}
$$

**Fig. 2** Filtering of the large scales and definition of the random small-scale vorticity $\widetilde{W}$ via the independent Brownian motions $\beta^{lm}$

For wave numbers larger than $\overline{l}$ the energy spectrum has the characteristic slope $l^{-1}$, which is that of white noise, see Fig. 3. The universal nature of the small scales suggests a model reduction in terms of a large-scale evolution combined with a stochastic term. In Fig. 2, we show the procedure to obtain the two new fields $\overline{W}$ and $\widetilde{W}$. To define $\overline{W}$, we introduce the orthogonal projection $\pi$ onto the modes $l \le \overline{l}$. The small-scale field $\widetilde{W}$ is defined as the linear combination of the basis elements $T_{lm}$, for $l > \overline{l}$, with coefficients $\beta^{lm}$ given by independent Brownian motions, with mean and variance obtained from the high-resolution direct numerical simulation (DNS). This choice of coefficients ensures that the random field yields the same mean energy spectrum at the small scales as measured from the DNS. Additionally, applying the Kolmogorov–Smirnov and Anderson–Darling tests for normality to the high-resolution data suggests that the distribution of the basis coefficients for $T_{lm}$, for $l > \overline{l}$, is Gaussian.
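The calibration step described above might be sketched as follows. This is only one plausible reading, under the loudly stated assumption that the measured mean and variance of each coefficient enter as drift and variance per unit time of Gaussian increments; the text does not spell out this detail. The mode labels, the sample values and all function names are hypothetical.

```python
import math
import random
import statistics


def mode_statistics(samples):
    """Sample mean and unbiased sample variance of one basis coefficient over DNS snapshots."""
    return statistics.fmean(samples), statistics.variance(samples)


def brownian_increment(mean, variance, dt, rng):
    """Gaussian increment over a time step dt.

    Assumption of this sketch: `mean` acts as a drift and `variance` as a
    variance rate, so the increment is N(mean * dt, variance * dt).
    """
    return mean * dt + math.sqrt(variance * dt) * rng.gauss(0.0, 1.0)


def sample_small_scale_increments(stats, dt, rng):
    """Independent increments for every mode (l, m) listed in the dictionary `stats`."""
    return {mode: brownian_increment(mu, var, dt, rng)
            for mode, (mu, var) in stats.items()}


# Hypothetical DNS time series for two small-scale modes (l, m).
dns_series = {
    (15, 0): [0.10, -0.20, 0.05, 0.15],
    (15, 1): [0.00, 0.30, -0.10, -0.20],
}
stats = {mode: mode_statistics(series) for mode, series in dns_series.items()}
rng = random.Random(0)  # fixed seed so that the sketch is reproducible
increments = sample_small_scale_increments(stats, dt=0.25, rng=rng)
print(sorted(increments))
```

Each mode receives its own independent draw, mirroring the independence of the Brownian motions; the random small-scale field is then the linear combination of these coefficients with the corresponding basis elements.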

Hence, we define $\overline{W} := \pi W$ and $\widetilde{W} := \sum_{l=\overline{l}+1}^{N-1} \sum_{m=-l}^{l} \beta^{lm} T_{lm}$. With these new fields, we essentially have three possible choices. The first one consists of a deterministic closure, obtained simply by projecting the vector field onto the large scales:

$$
\begin{aligned}
\dot{\overline{W}} &= \pi \left[ \overline{P}, \overline{W} \right] \\
\Delta_N \overline{P} &= \overline{W},
\end{aligned}
\tag{3}
$$

which we refer to as the no-model closure. The second model is the large-scale enstrophy-preserving stochastic closure, which is, up to the projection $\pi$, a type of SALT equation (see [15]):

$$\begin{cases} d\overline{W} = \pi [\overline{P}, \overline{W}] dt + \sum_{l=\overline{l}+1}^{N-1} \sum_{m=-l}^{l} \frac{1}{-l(l+1)} \pi [T_{lm}, \overline{W}] \circ d\beta^{lm} \\ \Delta_N \overline{P} = \overline{W}. \end{cases} \tag{4}$$

We recall that the symbol ◦ denotes the Stratonovich integral and indeed we find that the large scale enstrophy is conserved, via

$$\begin{split}d\frac{1}{2}\text{Tr}(\overline{W}^2) &= \text{Tr}(\overline{W}d\overline{W}) = \text{Tr}(\overline{W}\pi[\overline{P},\overline{W}])dt\\ &+\sum_{l=\overline{l}+1}^{N-1}\sum_{m=-l}^{l}\frac{1}{-l(l+1)}\text{Tr}(\overline{W}\pi[T_{lm},\overline{W}]) \circ d\beta^{lm} = 0,\end{split}$$

since $\pi\overline{W} = \overline{W}$ and $[\overline{W}, \overline{W}] = 0$. Notice that the other large-scale Casimirs are not preserved, since in general $\pi(\overline{W}^k) \neq (\pi\overline{W})^k$, for $k > 1$.
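The vanishing of both traces rests on two standard facts that can be made explicit: $\pi$ is an orthogonal projection, hence self-adjoint with respect to the trace pairing, and the trace is cyclic. As a sketch, writing *A* for either $\overline{P}$ or $T_{lm}$, one has

$$
\begin{aligned}
\mathrm{Tr}\big(\overline{W}\,\pi[A,\overline{W}]\big)
&= \mathrm{Tr}\big(\pi(\overline{W})\,[A,\overline{W}]\big)
= \mathrm{Tr}\big(\overline{W}\,[A,\overline{W}]\big) \\
&= \mathrm{Tr}\big(\overline{W}A\overline{W}\big) - \mathrm{Tr}\big(\overline{W}\,\overline{W}A\big)
= \mathrm{Tr}\big(A\,[\overline{W},\overline{W}]\big) = 0 .
\end{aligned}
$$

The first equality moves $\pi$ onto $\overline{W}$, the second uses $\pi\overline{W} = \overline{W}$, and the last line is the cyclic property of the trace. For a higher Casimir the same manipulation would require moving $\pi$ onto a power of $\overline{W}$, which in general is no longer banded, so the argument breaks down there.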

Finally, the third model is a large-scale energy-preserving stochastic closure (see [11] for its analysis and [16, 8] for more recent applications). We note that [16] introduces this stochastic closure as *Stochastic Forcing by Lie Transport* (SFLT) and provides its general definition. Here, we refer to this closure as energy-preserving noise (EPN) and adopt the following definition:

$$\begin{cases} d\overline{W} = \pi[\overline{P}, \overline{W}]dt + \sum_{l=\overline{l}+1}^{N-1} \sum_{m=-l}^{l} \pi[\overline{P}, T_{lm}] \circ d\beta^{lm} \\ \Delta_N \overline{P} = \overline{W}. \end{cases} \tag{5}$$

The proof of conservation of the large-scale energy for (5) is identical to that of large-scale enstrophy conservation, once one notices that $\pi$ commutes with $\Delta_N$. The benefit of the stochastic models (4) and (5) compared to the no-model closure is that the stochastic models provide an explicit representation of the small-scale flow features and, in doing so, aim to faithfully represent their effect on the evolution of the large scales of motion. No such representation is included in the no-model closure. In the next section, we perform a numerical test for the three different models (3), (4), (5), comparing them with the high-resolution DNS.
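For completeness, the energy argument for (5) can be sketched along the same lines. It is assumed here, as is standard for the Zeitlin model, that $\Delta_N$ is symmetric with respect to the trace pairing, so that the differential of the energy is $dH(\overline{W}) = \mathrm{Tr}(\overline{P} \circ d\overline{W})$. Since $\pi$ commutes with $\Delta_N$, the stream matrix satisfies $\pi\overline{P} = \overline{P}$, and for *X* equal to either $\overline{W}$ or $T_{lm}$

$$
\mathrm{Tr}\big(\overline{P}\,\pi[\overline{P},X]\big)
= \mathrm{Tr}\big(\pi(\overline{P})\,[\overline{P},X]\big)
= \mathrm{Tr}\big(\overline{P}\,[\overline{P},X]\big)
= \mathrm{Tr}\big([\overline{P},\overline{P}]\,X\big) = 0 .
$$

Hence both the drift and every noise term in (5) leave $\frac{1}{2}\mathrm{Tr}(\overline{P}\,\overline{W})$ unchanged.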

## **3 Numerical Simulations**

In this section, we carry out a numerical experiment to study the performance of the models proposed in the previous section. The numerical experiment is conducted as follows. We set the high-resolution level at *N* = 128. Then we generate a random initial condition and run a high-resolution DNS. We stop the simulation once a stationary energy profile is reached (see Fig. 3). The solution at this point in time defines the initial condition of the reference solution and the ensuing model simulations.

**Fig. 3** Initial vorticity obtained via high-resolution DNS. Top left: the field $W$; top right: the filtered field $\overline{W}$; bottom left: $W - \overline{W}$; bottom right: the energy spectrum of $W$. Note the change of slope in the energy profile at $l \approx \sqrt{N}$

From the DNS we select the large-scale threshold as the wave number $\overline{l} \approx \sqrt{N}$ at which the kink in the energy spectrum appears. In our numerical simulation the kink is found to be at $\overline{l} = 14$.

**Remark 1** The kink in the energy spectrum must depend on the truncation level *N*. Indeed, it was shown that the tail of the energy distribution at small scales for conservative schemes, like the Euler–Zeitlin one, has a characteristic slope of $k^{-1}$. This would imply unbounded energy for $N \to \infty$, which contradicts the fact that we only consider vorticities with bounded energy. Therefore, the kink wave number between the two slopes in the energy profile must increase with *N*. Numerically, we have observed that it grows like $\sqrt{N}$.

Then, we define our large-scale field as $\overline{W} := \pi W$, where $\pi$ denotes the orthogonal projection onto the modes with $l \le \overline{l}$. The projection consists of two steps: first we extract the components up to $\overline{l}$ and then we generate the field $\overline{W}$. The cost of calculating each component is $O(N)$ and, since we need to repeat this operation $\overline{l}^2 - 1 \approx N$ times, the total cost of extracting the components is $O(N^2)$. Clearly, to construct the field $\overline{W}$ we have to perform $O(N^2)$ operations. Hence, the total computational cost of the projection $\pi$ is $O(N^2)$. Consequently, given two matrices *A*, *B* ∈ su(*N*), the cost of evaluating $\pi[\pi A, \pi B]$ is given by the cost of evaluating $\pi$ plus the cost of multiplying $\pi A$ and $\pi B$. Since we are interested only in the $\pm\overline{l}$ diagonals, we need to perform $O(N\overline{l})$ vector-vector multiplications of cost $O(\overline{l})$, which implies a total cost of $O(N\overline{l}^2) \approx O(N^2)$. The same cost $O(N^2)$ holds for the stochastic term, since we can actually consider only $m = -\overline{l}, \dots, \overline{l}$.

We also define $\widetilde{W}$, as explained in the previous section. Finally, we define the reference solution as the high-resolution numerical simulation using the initial condition previously defined. The large-scale field obtained by projecting this initial condition serves as a starting point for the simulations where the small scales are modeled as described in Eqs. (3), (4), (5). All simulations are run for 250 time units from the initial condition. In our numerical simulations, the time integration is done via a Heun-type scheme adapted to SDEs, with time step *h* = 0.25. In the following, we perform an ensemble of realizations for the stochastic closures. Each ensemble consists of 25 realizations, which is sufficient to show the qualitative difference between the proposed models.
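To make the time-stepping and ensemble procedure concrete, the following sketch shows the textbook stochastic Heun (predictor-corrector) step for a scalar Stratonovich SDE, together with a small ensemble loop. It is not the authors' integrator for the matrix equations (4) and (5), whose details are not given here. The scalar toy problem, the number of steps and all function names are illustrative assumptions; only the time step 0.25 and the ensemble size 25 are taken from the text.

```python
import math
import random


def stochastic_heun_step(x, drift, diffusion, h, dbeta):
    """One Heun step for the Stratonovich SDE dX = drift(X) dt + diffusion(X) o dbeta."""
    # Predictor: explicit Euler step using the sampled Brownian increment.
    x_pred = x + drift(x) * h + diffusion(x) * dbeta
    # Corrector: average drift and diffusion between current and predicted states.
    return (x
            + 0.5 * (drift(x) + drift(x_pred)) * h
            + 0.5 * (diffusion(x) + diffusion(x_pred)) * dbeta)


def ensemble(x0, drift, diffusion, h, n_steps, n_realizations, seed=0):
    """Final states of independent realizations, each driven by its own Brownian path."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_realizations):
        x = x0
        for _ in range(n_steps):
            dbeta = rng.gauss(0.0, math.sqrt(h))  # increment with variance h
            x = stochastic_heun_step(x, drift, diffusion, h, dbeta)
        finals.append(x)
    return finals


# Hypothetical scalar toy problem: linear damping with small additive noise.
finals = ensemble(x0=1.0, drift=lambda x: -x, diffusion=lambda x: 0.1,
                  h=0.25, n_steps=40, n_realizations=25)
mean = sum(finals) / len(finals)
print(len(finals), mean)
```

Averaging the diffusion coefficient between the current and the predicted state is what makes the scheme consistent with the Stratonovich interpretation used in (4) and (5).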

We notice from Figs. 4 and 5 that the no-model solution and the mean SALT solution perform well compared to the reference solution, in terms of the qualitative vorticity evolution as well as the kinetic energy spectrum. On the contrary, the energy-preserving scheme loses accuracy and a cascade of energy to lower wave numbers occurs. We attribute the observed difference between SALT and EPN to two causes. Firstly, the orders of magnitude of the stochastic terms entering the equations differ between the methods. The stochastic forcing types are parametrizations of components of the effect of small scales on large scales, respectively given by $\pi[\widetilde{P}, \overline{W}]$ and $\pi[\overline{P}, \widetilde{W}]$, where $\widetilde{P}$ denotes the stream matrix of the small scales, $\Delta_N \widetilde{P} = \widetilde{W}$. Figure 6 shows these quantities for a high-resolution snapshot, which illustrates that the term parametrized by the EPN is significantly larger than the term that SALT parametrizes. Therefore, it is reasonable to expect that the proposed use of EPN leads to more substantial deviations from the no-model simulation than the use of SALT. In fact, the proposed use of SALT only leads to very small changes compared to the no-model setting. Secondly, in the energy-preserving scheme, no energy can leave the large scales. Hence, if the transfer of energy between different modes is non-zero, the conservation of the large-scale energy prevents the energy from flowing from large scales to small scales, causing an extra accumulation of energy near $l \approx \overline{l}$.

**Fig. 4** Evolution of the large scales over 250 time units, via the different models proposed and the high-resolution simulation. No model corresponds to (3), SALT to (4) and EPN to (5). The shown results using SALT and EPN are the mean of an ensemble each consisting of 25 independent realizations

**Fig. 5** Energy spectra at *t* = 250 using SALT (left) and EPN (right), both computed from an ensemble of 25 independent realizations and compared to the reference and no-model energy spectra. Note that the standard deviation of the ensemble obtained using SALT is too small to discern in this figure

**Fig. 6** Components of the evolution of the reference vorticity projected onto the large scales, generated from the reference vorticity field at *t* = 250. Note that the ranges of the color bars vary per field, which highlights the difference in magnitudes between the fields

In order to check this hypothesis, we compute the energy transfer among different modes in the high-resolution DNS. Let us consider the energy at a level *l*, where $\alpha_{lm}$ denotes the coefficient of *W* along the basis element $T_{lm}$:

$$E(l) = \frac{1}{2} \sum_{m=-l}^{l} \frac{\alpha_{lm}^2}{l(l+1)}.$$

Then, the energy variation in time is given by

$$\frac{dE(l)}{dt} = \sum\_{m=-l}^{l} \frac{\alpha\_{lm} [P, W]\_{lm}}{l(l+1)}.$$

Let $F(l) := \left| \frac{dE(l)}{dt} \right|$ be the absolute value of the energy transfer due to the nonlinearity of the vector field [*P*, *W*]. In Fig. 7, we plot the energy transfer contributions of the four possible couplings of large and small scales. We notice that the transfer of energy between large and small scales is non-zero. In particular, the main driver of the energy for the components of $\overline{W}$ is the vector field $[\overline{P}, \overline{W}]$, whereas for small scales it is $[\overline{P}, \widetilde{W}]$.
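The two formulas above translate directly into a few lines of code. The sketch below assumes real coefficients stored per level as plain lists, with one entry for each *m* from −*l* to *l*; this storage layout, the function names and the sample numbers are hypothetical. By bilinearity of the bracket, the same routine can be applied separately to the components of each of the four couplings to obtain their individual contributions.

```python
def level_energy(alpha_l, l):
    """E(l) = 1/2 * sum_m alpha_lm^2 / (l (l + 1)) for the 2l + 1 coefficients of level l."""
    return 0.5 * sum(a * a for a in alpha_l) / (l * (l + 1))


def level_transfer(alpha_l, bracket_l, l):
    """F(l) = |dE(l)/dt| with dE(l)/dt = sum_m alpha_lm * [P, W]_lm / (l (l + 1))."""
    return abs(sum(a * b for a, b in zip(alpha_l, bracket_l))) / (l * (l + 1))


# Hypothetical coefficients for levels l = 1 and l = 2 (lists of length 2l + 1).
alpha = {1: [1.0, 2.0, 2.0], 2: [0.5, 0.0, -0.5, 1.0, 0.0]}
bracket = {1: [1.0, -1.0, 0.0], 2: [0.2, 0.1, 0.0, -0.4, 0.3]}

for l in sorted(alpha):
    print(l, level_energy(alpha[l], l), level_transfer(alpha[l], bracket[l], l))
```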

The plots in Fig. 7 show that the term $[\overline{P}, \widetilde{W}]$ becomes more and more relevant in the energy flux at large scales as the threshold level $\overline{l}$ is approached. Therefore, one might expect the stochastic model (5), which takes this term into account too, to be more accurate than (3) or (4). However, from Figs. 4 and 5 this does not seem to be the case. Indeed, the way we model the term $[\overline{P}, \widetilde{W}]$ in (5) prevents the energy from flowing from large scales to small scales. However, in (3) there is no energy flow from large scales to small scales either. Therefore, we suggest that at large scales the term $[\overline{P}, \widetilde{W}]$ is responsible for redistributing the energy from lower to higher frequencies and that, in the absence of a dissipation mechanism, the rearranged energy diverges from the correct spectrum.

The fact that (3) and (4) perform equally well is quite surprising and indicates that the small scales do not affect the large scales much once a stationary energy profile is reached. However, for much longer times than those we have run, it is possible that the effect of small scales on large scales becomes more relevant and (4) becomes more accurate than (3) in terms of large-scale dynamics. Further investigations of this aspect are ongoing and will be presented in future work.

**Fig. 7** Energy transfer among different modes computed from a snapshot of the high-resolution simulation. Left: energy transfer among modes at large scales; right: at small scales

## **4 Conclusions and Outlook**

In this paper, we have presented a possible strategy to reduce the complexity of the Euler–Zeitlin model, while performing long-time simulations. Numerical evidence shows that the Euler–Zeitlin equations exhibit a clear separation of scales such that the large-scale dynamics are quite robust to different couplings with small scales, either deterministic or stochastic. Interestingly, the energy-preserving scheme we have defined shows that the energy at large scales cannot be exactly conserved. This means that large and small scales are never completely decoupled, even when the energy spectrum profile reaches a stationary regime. This indicates that for very long times a non-zero transfer of energy among different scales is present.

The Zeitlin model has been criticized for the unrealistic conservation of enstrophy and other Casimirs at a finite level of truncation *N*. Our results suggest that this issue can be understood as follows: the Euler–Zeitlin equations are quite robust and precise in describing the large scales, that is, wave numbers up to $l \approx \sqrt{N}$. On the other hand, the remaining modes are themselves a model for the small scales and correctly mimic the energy flux among different modes.

In conclusion, we have shown that the Zeitlin model can be a useful tool for simulating long-time large-scale dynamics. In future work, we aim to perform more systematic simulations using the parallelized code developed in [5] and available at https://github.com/cifanip/GLIFS. Additionally, further analysis of energy and enstrophy transfers between the scales of motion may serve to derive tailored data-driven stochastic closure models for the Euler–Zeitlin equations.

## **References**


two-dimensional euler equation. *SIAM/ASA Journal on Uncertainty Quantification*, 8(4): 1446–1492, 2020.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Effect of Transport Noise on Kelvin–Helmholtz Instability**

**Franco Flandoli, Silvia Morlacchi, and Andrea Papini**

F. Flandoli (✉) · S. Morlacchi · A. Papini

Scuola Normale Superiore, Pisa, Italy

e-mail: franco.flandoli@sns.it; silvia.morlacchi@sns.it; andrea.papini@sns.it

© The Author(s) 2024

B. Chapron et al. (eds.), *Stochastic Transport in Upper Ocean Dynamics II*, Mathematics of Planet Earth 11, https://doi.org/10.1007/978-3-031-40094-0\_3

**Abstract** The effect of transport noise on a 2D fluid may depend on the space-scale of the noise. We investigate numerically the dissipation properties of very small-scale transport noise. As a test problem we consider the Kelvin-Helmholtz instability and we compare the inviscid case, the viscous one, both without noise, and the inviscid case perturbed by transport noise. We observe a partial similarity with the viscous case, namely a delay of the instability.

# **1 Introduction**

Stochastic transport is a new fundamental perspective on fluid dynamics, see e.g. [29, 39, 13, 8] and [22]. A transport-type noise in a fluid dynamic model may be seen, loosely speaking, as a simplified description of small-medium space-scales of motion. In numerical simulations (see for instance [8]) we may observe the way it perturbs large-scale motion; in general, this perturbation destabilizes large scales, producing smaller eddies.

This has been observed for homogeneous noise, where small-scale perturbations with possibly local backscattering of energy are observed, while for inhomogeneous noise there are cases in which transport noise has been shown to also trigger secondary circulation at large scales [3].

In this note we want to explore a way in which transport noise may affect large scales that is somewhat opposite to the one mentioned above. The key difference is the assumption that the noise is very-small-space-scale. The noise used below is made of very small, low-intensity vortex structures. In such a case it may happen that the transport noise acts as a dissipation, an additional viscosity.

It corresponds to Joseph Boussinesq's intuition [6] that "turbulent small scales may be dissipative on the mean flow". The physical intuition, beyond the specific mathematical derivation, is that fluid particles move so erratically as to produce effects similar to those of molecular motion. In a sense, it is like a macroscopic version of molecular dynamics: as macroscopic kinetic energy transfers to molecular kinetic energy (heat), reducing the macroscopic motion, similarly large-scale kinetic energy moves to small-scale turbulence and reduces the intensity of the mean flow. There have been many attempts to prove the validity of this picture, see for instance [5, 16, 21, 32, 30, 44], but a final conclusion is not clear, also because the empirical validity is sometimes moderate [32, 41]; in particular, the inverse cascade in 2D seems to break the viscosity effect due to small-scale turbulence [14].

However, this idea is certainly very useful for numerical simulations of fluid dynamics models, being the basis of the LES method [5] and, in the case of vorticity equations, of the vortex blob method [11].

Theoretically, this kind of turbulent-small-scale transport noise has been investigated in rigorous works. This line of research was initiated in [23] and developed in several works, see e.g. [16, 19] and many others (see [22]).

However, its stabilizing power has not been tested numerically yet. Here we observe its action on one of the strongest and most common instabilities: the Kelvin-Helmholtz one. This is an instability that occurs when a velocity difference across the interface between two fluids is present, producing macroscopic vortex-like structures. For a description of the Kelvin-Helmholtz instability see [31] or [36], Chapter 6, and our description below; as a typical instability of shear flows, see also [14].

We compare the inviscid case, the viscous one, both without transport noise, and the inviscid case perturbed by transport noise. The results are described in Sect. 4. Since we approximate the fluid dynamic equations by vortex methods, which fit quite well and in a unified way with the three cases analyzed here (inviscid, viscous and stochastic transport), we describe some preliminaries on this topic in Sects. 2 and 3.

## **2 Model Formulation**

We consider the two-dimensional flow of an incompressible fluid on a 2D domain, either the full plane or T<sup>2</sup>, the 2D torus with periodic boundary conditions.

As usual, the equations for the motion of the fluid are the conservation of mass and linear momentum, see e.g. [2], expressed through the Navier-Stokes equations in the null-divergence formulation. The main interest for our numerical simulations is the equation for the evolution of vorticity *ω(t, x)*, which can be derived from the Navier-Stokes equations:

$$
\partial_t \omega + \boldsymbol{u} \cdot \nabla \omega = \nu \, \Delta \omega. \tag{1}
$$

Here *ν* is the kinematic viscosity, and *u* is the velocity of the fluid solving the Navier-Stokes equations. In the 2D case, we can express *ω* := ∇ × *u* as the curl of the velocity field, a vector normal to the flow plane. We denote by *x* = (*x*<sub>1</sub>, *x*<sub>2</sub>) any point in the domain.

## *2.1 Point Vortex Method for Inviscid Flows*

In order to state the problem we investigate numerically, we first recall several well-known facts; for a general introduction, see e.g. [36, 37, 43]. For the sake of simplicity, we focus on an inviscid fluid; the vorticity equation (1) reduces to

$$
\partial_t \omega + \boldsymbol{u} \cdot \nabla \omega = 0. \tag{2}
$$

From the null-divergence hypothesis, if *u* has zero average we can define a stream function *ψ*(*t*, *x*) associated with the fluid, see [2], such that the velocity *u* is given by *u* = −∇<sup>⊥</sup>*ψ*. We obtain the stream function by solving the Poisson equation

$$
\Delta \psi = -\omega.\tag{3}
$$

To compute the solution, we write it as a convolution with the Green's function; for flows in the full plane R<sup>2</sup> this reads

$$
\psi(t, \mathbf{x}) = \int G(\mathbf{x} - \mathbf{y}) \, \omega(t, \mathbf{y}) \, d\mathbf{y} := G \ast \omega,\tag{4}
$$

where *G* is the Green's function, the fundamental solution of the Laplace equation. The explicit form of *G* in the full plane R<sup>2</sup> is

$$G(\mathbf{x}, \mathbf{y}) = \frac{1}{2\pi} \log(|\mathbf{x} - \mathbf{y}|). \tag{5}$$

Using (4) we obtain the velocity field

$$
\boldsymbol{u}(t, \mathbf{x}) = \int K(\mathbf{x} - \mathbf{y}) \, \omega(t, \mathbf{y}) \, d\mathbf{y} := K \ast \omega, \tag{6}
$$

where *K* is given by

$$K(\mathbf{x}) = \nabla^{\perp} G(\mathbf{x}).\tag{7}$$

Equations (6) and (7) for the velocity are known as the Biot-Savart law, and *K* is the Biot-Savart kernel. Note that for flows over bounded or periodic domains we must correct this velocity to satisfy the boundary conditions (see Sect. 4.1 for details); that is, we have to change the Green's function of the Poisson equation accordingly.

Consider now a fluid particle $X_t$ moving in the velocity field; from (2), the path of the particle satisfies

$$\frac{d}{dt}X_t = \boldsymbol{u}(t, X_t),$$

and the vorticity is constant along it:

$$\omega(t, X_t) = \omega_0(X_0).$$

Therefore, considering the fluid as distinct "fluid particles" of constant vorticity, these abstract objects' motion determines the scalar field's evolution. This is the premise of the point vortex method to compute (2), (see [36, 37] and [43] for details).

To this end, consider *N* point vortices, idealizing a 2D inviscid fluid and occupying positions $X_t^1, \dots, X_t^N$, with intensities (circulations) $\Gamma_1, \dots, \Gamma_N$ respectively. They move according to the following set of ordinary differential equations, derived from the previous computations:

$$\frac{dX\_t^i}{dt} = \sum\_{j \neq i} \Gamma\_j K\left(X\_t^i, X\_t^j\right) \tag{8}$$

where the vector-valued kernel $K(x, y)$ is the Biot-Savart kernel, equal to $\frac{1}{2\pi} \frac{(x-y)^{\perp}}{|x-y|^{2}}$ in full space and suitably modified on a torus or in a bounded domain (see Sect. 4.1). We refer to [15] for well-posedness of the ODE system for almost all initial configurations (including the case when the system is perturbed by additive noise). One can prove that the empirical measure

$$\omega(t, \cdot) := \sum_{i=1}^N \Gamma_i \delta_{X_t^i}$$

is a weak solution of the 2D Euler equations in vorticity form, suitably interpreted for distributional fields as in [42] (see [24] and [26, 27] for a discussion of well-posedness in this context):

$$\begin{aligned} \partial\_t \boldsymbol{\omega} + \boldsymbol{u} \cdot \nabla \boldsymbol{\omega} &= 0 \\ \boldsymbol{u} \left( t, \mathbf{x} \right) &= \int \boldsymbol{K} \left( \mathbf{x}, \mathbf{y} \right) \boldsymbol{\omega} \left( t, \mathbf{y} \right) d\mathbf{y} \\ \boldsymbol{\omega}|\_{t=0} &= \boldsymbol{\omega}\_0 \end{aligned} \tag{9}$$

with $\omega_0 := \sum_{i=1}^N \Gamma_i \delta_{X_0^i}$. Here *ω* is the (scalar) vorticity and *u* is the (vector) velocity.
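As an illustration of the ODE system (8), the following is a minimal sketch of its right-hand side with the full-plane kernel, together with one explicit midpoint (second-order Runge-Kutta) step. The function names, the array layout and the sign convention $(a, b)^{\perp} = (-b, a)$ are assumptions made for this example; it is not the authors' code, and the periodic correction of the kernel is not included.

```python
import numpy as np

def biot_savart_velocities(positions, circulations):
    """Right-hand side of the point vortex ODE (8), full-plane kernel.

    positions: (N, 2) array, circulations: (N,) array.
    Uses K(x, y) = (x - y)^perp / (2*pi*|x - y|^2) with (a, b)^perp = (-b, a),
    which is an assumed sign convention.
    """
    diff = positions[:, None, :] - positions[None, :, :]   # diff[i, j] = X^i - X^j
    r2 = np.sum(diff**2, axis=-1)
    np.fill_diagonal(r2, np.inf)                            # removes the j == i term
    perp = np.stack([-diff[..., 1], diff[..., 0]], axis=-1)
    kernel = perp / (2.0 * np.pi * r2[..., None])
    return np.einsum("j,ijk->ik", circulations, kernel)     # sum over j of Gamma_j K

def rk2_step(positions, circulations, dt):
    """One explicit midpoint (second-order Runge-Kutta) step for (8)."""
    mid = positions + 0.5 * dt * biot_savart_velocities(positions, circulations)
    return positions + dt * biot_savart_velocities(mid, circulations)
```

With this convention, two vortices of equal positive circulation rotate counter-clockwise around their midpoint, which gives a quick sanity check of the implementation.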

Moreover, consider the empirical measure parametrized by *N* and rescaled by 1*/N*:

$$\omega_N(t, \cdot) = \sum_{i=1}^N \frac{\Gamma_i}{N} \delta_{X_t^i}.$$

In [36] a convergence result to the Euler equations was proved under some restrictions (specifically, in that case $X_t^i$ solves a vortex dynamics with mollified Biot-Savart kernel, with vanishing mollification as *N* → ∞ and with rates of convergence of the parameters rather far from optimal). The best convergence result to the Euler equations in the literature was then [42]; see also the references therein.

Notice that the velocity field associated to the distributional vorticity $\omega_N(t, \cdot)$ is

$$u_N(t, x) := \frac{1}{N} \sum_{i=1}^N K\left(x, X_t^i\right).$$

This is a well-defined vector field, of class $L^p_{\mathrm{loc}}$ for every *p* < 2 but not for *p* = 2.

## *2.2 Point Vortex Method for Viscous Flows*

To investigate viscous flows, following [10, 33, 38], we modify the previous scheme by adding independent 2D Brownian motions $W_t^1, \dots, W_t^N$ to the equations of the point vortices:

$$dX_t^i = \sum_{j \neq i} \Gamma_j K\left(X_t^i, X_t^j\right) dt + \sqrt{2\nu} \, dW_t^i. \tag{10}$$

Then, the empirical measure $\omega_N(t, \cdot)$ converges weakly, in probability, to the unique solution of the 2D Navier-Stokes equations in vorticity form

$$\begin{aligned} \partial_t \omega + \boldsymbol{u} \cdot \nabla \omega &= \nu \Delta \omega \\ \boldsymbol{u}(t, \mathbf{x}) &= \int K(\mathbf{x}, \mathbf{y}) \, \omega(t, \mathbf{y}) \, d\mathbf{y} \\ \omega|_{t=0} &= \omega_0 \end{aligned} \tag{11}$$

We refer to [7] for convergence to the 2D Navier-Stokes equations on the full plane, and to [33, 38, 25] for results in the case of domains with boundary.
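The stochastic system (10) can be sketched with a plain Euler-Maruyama step, in which every vortex receives its own independent Brownian increment. This is only an illustrative discretization under stated assumptions (full-plane kernel, hypothetical function names, sign convention $(a, b)^{\perp} = (-b, a)$); the simulations reported later use a second-order scheme.

```python
import numpy as np

def drift(positions, circulations):
    """Biot-Savart drift of (10), full-plane kernel, (a, b)^perp = (-b, a) assumed."""
    diff = positions[:, None, :] - positions[None, :, :]
    r2 = np.sum(diff**2, axis=-1)
    np.fill_diagonal(r2, np.inf)                 # no self-interaction
    perp = np.stack([-diff[..., 1], diff[..., 0]], axis=-1)
    return np.einsum("j,ijk->ik", circulations, perp / (2.0 * np.pi * r2[..., None]))

def euler_maruyama_step(positions, circulations, nu, dt, rng):
    """One Euler-Maruyama step of (10).

    Each vortex gets an independent 2D Gaussian increment with variance dt
    per component, scaled by sqrt(2 * nu).
    """
    dW = rng.normal(0.0, np.sqrt(dt), size=positions.shape)
    return positions + drift(positions, circulations) * dt + np.sqrt(2.0 * nu) * dW
```

Setting `nu = 0` recovers an explicit Euler step of the deterministic system (8).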

## **3 Point Vortex Method with Environmental Noise**

As mentioned above in the Introduction, several works, e.g. [29, 39, 13, 28, 9, 8] and [22], indicated an interest in the following stochastic modification of the Euler equations

$$d\omega + \boldsymbol{u} \cdot \nabla \omega \, dt = \sum_{k \in K} \sigma_k \cdot \nabla \omega \circ dB_t^k \tag{12}$$

where $\sigma_k = \sigma_k(x)$ are given vector fields, which we assume divergence-free, $(B_t^k)_{k \in K}$ are independent 1D Brownian motions and the stochastic operation ◦ stands for the Stratonovich integral. Due to this, formally, vorticity is conserved (it is transported randomly by the field $u \, dt + \sum_{k \in K} \sigma_k \circ dB_t^k$).

This model bears similarities with the viscous flows, and the objective of this paper is to show differences and similarities of the elliptic operator obtained from such a transport-advection noise.

## *3.1 Transport Noise and Deterministic Scaling Limit*

The point vortex dynamics associated to the model (12) is then given by the following expression:

$$dX_t^i = \frac{1}{N} \sum_{j \neq i} K\left(X_t^i, X_t^j\right) dt + \sum_{k \in K} \sigma_k\left(X_t^i\right) \circ dB_t^k. \tag{13}$$

Notice that this is a model of common noise (also called environmental noise): the Brownian motions $B_t^k$ are the same for all particles, in contrast to the model (10), where each particle $X_t^i$ is affected by an independent Brownian motion $W_t^i$. See [18] for an example of theoretical results on this model. For models similar to this one, it has been proved (see e.g. [12]) that the empirical measure converges to the solution of the SPDE (12). At the same time, following [23] and subsequent works, if the noise is parameterized in such a way that it acts on smaller and smaller scales, the SPDE (12) converges to the deterministic equation with additional viscosity (see Sect. 3.2 for details on possible noise selection)

$$
\partial_t \omega + \boldsymbol{u} \cdot \nabla \omega = \nu \Delta \omega. \tag{14}
$$

Inspired by [20], we consider a sort of mixed scaling limit: we take the point vortex dynamics with common noise (13), which for given fields *σk* would converge to the SPDE (12), and we choose more and more small scale coefficients *σk* in order to be close to the deterministic equation (14).

The scheme just described has two parallel interpretations. If we look at the point vortex dynamics just as a numerical discretization of the Euler equation, the transport noise in the point vortex dynamics (13) is just like the transport noise in the Euler vorticity dynamics (12). Therefore, the scaling limit in which the coefficients $\sigma_k$ become more concentrated is a numerical realization of the theoretical scaling limit investigated in [16], from the stochastic Euler equation to the deterministic Navier-Stokes equation; as such, it is a form of proof of the validity of the Boussinesq hypothesis that small-scale turbulence enhances the viscosity. That is the first interpretation. The second one is more related to particle systems and noise. Consider the particle system (13), which has common noise, and compare it with the more classical particle system with independent noise acting on each particle (Sect. 2.2 above). In the scaling limit of $\sigma_k$ described above, the common noise becomes more and more spatially de-correlated, close to independent noise at each space position, and thus it acts almost independently on each particle. Since point vortices with independent noise converge to the Navier-Stokes equations, it is natural to expect that, also with common noise but very concentrated $\sigma_k$, the empirical measure of the particle system is close to the solution of the Navier-Stokes equations. This viewpoint has been investigated theoretically in [20], while the link between the turbulent stresses and the mean flow, which is in essence the Boussinesq idea, was explored in Appendices A and B of [40] and theoretically in [17], providing a link between eddy viscosity and an environmental small-scale noise.

## *3.2 A Digression on the Theoretical Selection of the Noise*

In this section, in the same spirit as [20, 22, 40], we explore some properties of the environmental noise that we use, in a simplified way, in our numerical simulations. Following the works on modeling of passive scalars [34], when considering the scaling limit of (13) to the solution $\omega(\mathbf{x})$ of the viscous equation (14), we consider a model of noise in the fluid which is delta-correlated in time, namely a white noise with a precise space dependence:

$$\mathbf{W}(t, \mathbf{x}) \, dt = \sum_{k \in K} \sigma_k(\mathbf{x}) \, dB^k_t \tag{15}$$

where $(\sigma_k(\mathbf{x}))_{k \in K}$ is a family of smooth divergence-free vector fields on the 2D domain of the equation, and $B_t^k$ are independent one-dimensional Brownian motions; *K* is usually a finite index set, but under suitable assumptions we could also consider a countable family of smooth fields.

In this case, the term $\mathbf{W}(t, \mathbf{x}) \cdot \nabla \omega(\mathbf{x})$ obtained in the convergence result for the point vortex empirical measure must be interpreted as a Stratonovich integral

$$\sum_{k \in K} \sigma_k(\mathbf{x}) \cdot \nabla \omega(\mathbf{x}) \circ dB_t^k. \tag{16}$$

Assume that the solution is sufficiently smooth so that the Stratonovich integral makes sense; then it is given by an Itô-Stratonovich corrector plus an Itô integral. Precisely, it is given by:

$$-\frac{1}{2} \sum_{k \in K} \sigma_k(\mathbf{x}) \cdot \nabla \left( \sigma_k(\mathbf{x}) \cdot \nabla \omega(\mathbf{x}) \right) dt + dM(t, \mathbf{x})$$

where $M(t, \mathbf{x})$ is a (local) martingale. It follows that the Itô-Stratonovich corrector takes the form of an elliptic operator:

$$-\frac{1}{2}\operatorname{div}\left(C\left(\mathbf{x},\mathbf{x}\right)\nabla\boldsymbol{\omega}\left(\mathbf{x}\right)\right)dt$$

where *C (***x***,* **y***)* is the space-covariance function of the noise

$$C\left(\mathbf{x},\mathbf{y}\right) = \sum\_{k \in K} \sigma\_k\left(\mathbf{x}\right) \otimes \sigma\_k\left(\mathbf{y}\right).$$
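For the reader's convenience, the passage from the double-gradient form of the corrector to the divergence form can be checked directly. The only ingredients are the identity $(\sigma_k \otimes \sigma_k) \nabla \omega = \sigma_k \, (\sigma_k \cdot \nabla \omega)$ and the assumption $\operatorname{div} \sigma_k = 0$; sufficient smoothness of $\sigma_k$ and $\omega$ is assumed throughout:

$$
\begin{aligned}
\operatorname{div}\left( C(\mathbf{x}, \mathbf{x}) \nabla \omega \right)
&= \sum_{k \in K} \operatorname{div}\left( \sigma_k \, (\sigma_k \cdot \nabla \omega) \right) \\
&= \sum_{k \in K} (\operatorname{div} \sigma_k) \, (\sigma_k \cdot \nabla \omega) + \sum_{k \in K} \sigma_k \cdot \nabla \left( \sigma_k \cdot \nabla \omega \right) \\
&= \sum_{k \in K} \sigma_k \cdot \nabla \left( \sigma_k \cdot \nabla \omega \right).
\end{aligned}
$$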

As an example, we take the noise of [34], which is relevant to our numerical investigation through the choice of the divergence-free fields in the point vortex model. For simplicity, assume the domain to be R<sup>2</sup>; modifications on T<sup>2</sup> are possible, see for example [16, 20].

Its covariance function is space-homogeneous, i.e. *C (***x***,* **y***)* = *C (***x** − **y***)*, with the form

$$C(\mathbf{z}) = \nu k_0^{\zeta} \int_{k_0 \le |\mathbf{k}| < k_1} \frac{1}{|\mathbf{k}|^{d + \zeta}} e^{i \mathbf{k} \cdot \mathbf{z}} \left( I - \frac{\mathbf{k} \otimes \mathbf{k}}{|\mathbf{k}|^2} \right) d\mathbf{k}.$$

The famous Kolmogorov 41 case follows if we take *ζ* = 4/3. Taking $k_1 = +\infty$, a change of variables $\mathbf{k} \to k_0 \mathbf{k}$ gives $C(\mathbf{0}) = K\nu$, where the constant *K* is given by

$$K = \int_{1 \le |\mathbf{k}| < \infty} \frac{1}{|\mathbf{k}|^{d+\zeta}} \left( I - \frac{\mathbf{k} \otimes \mathbf{k}}{|\mathbf{k}|^2} \right) d\mathbf{k}.$$

We consider small-scale turbulent velocity fields depending on a scaling parameter and taking the scaling limit in (12), as in [20, 23]. In the case of [34] we have

$$k\_0 = k\_0^N \to \infty$$

The result *C (***0***)* = *Kν* is independent of *N*, so that the Itô-Stratonovich corrector becomes equal to

$$\nu \Delta \omega\left(\mathbf{x}\right),$$

and simultaneously, we may have that the Itô term goes to zero, hence recovering (14).

Let us sketch the argument (see [16] for details) which explains why the Itô term may go to zero, in spite of the convergence to a finite non-zero limit of the Itô-Stratonovich corrector. Let *φ* be a smooth test function. One has

$$\mathbb{E}\left[\left(\sum_{k\in K}\int_0^T \langle \sigma_k \cdot \nabla \omega_t, \phi \rangle_{L^2} \, dB^k_t\right)^2\right] = \mathbb{E}\left[\sum_{k\in K}\int_0^T \langle \sigma_k \cdot \nabla \omega_t, \phi \rangle_{L^2}^2 \, dt\right]$$

by the isometry formula of Itô integrals,

$$= \mathbb{E}\left[\sum_{k \in K} \int_0^T \langle \omega_t, \sigma_k \cdot \nabla \phi \rangle_{L^2}^2 \, dt\right]$$

since div $\sigma_k$ = 0,

$$\begin{aligned} &= \mathbb{E}\left[\int_0^T \int \int \sum_{k \in K} \sigma_k(\mathbf{x}) \cdot \nabla \phi(\mathbf{x}) \, \sigma_k(\mathbf{y}) \cdot \nabla \phi(\mathbf{y}) \, \omega(t, \mathbf{x}) \, \omega(t, \mathbf{y}) \, d\mathbf{x} \, d\mathbf{y} \, dt \right] \\ &= \mathbb{E}\left[\int_0^T \int \int \nabla \phi(\mathbf{y})^T \cdot C(\mathbf{x}, \mathbf{y}) \cdot \nabla \phi(\mathbf{x}) \, \omega(t, \mathbf{x}) \, \omega(t, \mathbf{y}) \, d\mathbf{x} \, d\mathbf{y} \, dt \right] \\ &= \mathbb{E} \int_0^T \langle \mathbf{C} \theta_t, \theta_t \rangle_{L^2} \, dt \end{aligned}$$

where $\mathbf{C}$ is the linear operator on vector fields with kernel $C(x, y)$ and $\theta_t(x) = \nabla \phi(x) \, \omega(t, x)$,

$$\leq \|\mathbf{C}\|_{L^{2} \to L^{2}} \, \mathbb{E} \int_{0}^{T} \|\theta_t\|_{L^{2}}^{2} \, dt.$$

Now, one can prove uniform bounds on $\mathbb{E} \int_0^T \|\theta_t\|_{L^2}^2 \, dt$ with respect to the scaling of the noise, and one can choose a noise such that $\|\mathbf{C}\|_{L^2 \to L^2}$ goes to zero. Notice that in the Itô-Stratonovich corrector only the diagonal $C(x, x)$ counts, while the smallness of $\|\mathbf{C}\|_{L^2 \to L^2}$ is related to the smallness of $C(x, y)$ when $x \neq y$.

## **4 Numerical Results**

## *4.1 Setting: Kelvin–Helmholtz Instability*

In this section, we investigate classical results on the shear flow model in the setting of point vortices, analyzing the Kelvin–Helmholtz instability and the possibility of delaying the structure formation. In this way, we can both validate our point vortex models and, at the same time, set a benchmark against which we will show delayed instability. In order to test the point vortex model in the classical cases (8) and (10), we choose the particular fluid configuration of a shear flow because of its fundamental property: *developing instability without viscosity and delaying it when viscosity is present* [36].

We work on the torus $\mathbb{T}^2 = [-1, 1]^2 / \sim$, with coordinates $x = (x_1, x_2)$ and identified boundaries at $x_1, x_2 = \pm 1$; all fields are periodic in the $x_1$- and $x_2$-directions. We take an initial velocity $u^0$ of the form

$$
u^0(x_1, x_2) = \left(u^0_1(x_2), 0\right),
$$

and corresponding vorticity $\omega^0 = \partial_{x_2} u_1^0(x_2)$. We choose, in particular, the function

$$u_1^0(x_2) = \begin{cases} -1 & \text{if } x_2 \le -\delta, \\ \dfrac{x_2}{\delta} & \text{if } -\delta \le x_2 \le \delta, \\ 1 & \text{if } \delta \le x_2, \end{cases} \tag{17}$$

where we fix *δ* = 0*.*02 in our numerical simulations.

To compute the vorticity measure of our point vortices, we use vortex blobs, obtained by spreading the circulation of a point vortex over a chosen small area, the vortex core (see e.g. [43]). In this formulation, the vorticity field is approximated by

$$\omega_{\varepsilon}^{N}(\mathbf{x}, t) = \sum_{i} \Gamma_{i} \phi_{\varepsilon}(\mathbf{x} - X_{t}^{i}), \tag{18}$$

where the mollifier $\phi_\varepsilon$ (in our numerical simulations a Gaussian kernel whose width is set by *ε*, the characteristic size of the vortex core) describes the vorticity distribution in the vortex core. Following standard numerical techniques (see e.g. [4, 1]), the core size *ε* of the vortices has to be much larger than the average spacing *d* between the vortices; the core size is usually taken to be $\varepsilon = d^q$, with $q \ll 1$.
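The blob approximation (18) can be sketched as follows for a Gaussian mollifier. The normalization of the Gaussian, the function name and the neglect of periodic images are assumptions of this example, chosen so that each blob integrates to the circulation of its point vortex.

```python
import numpy as np

def blob_vorticity(grid_points, positions, circulations, eps):
    """Evaluate the blob vorticity (18) at grid_points, shape (G, 2).

    The assumed mollifier is phi_eps(x) = exp(-|x|^2 / (2 eps^2)) / (2 pi eps^2),
    which integrates to one over the plane, so each blob carries exactly the
    circulation of its point vortex. Periodic images are ignored, which is
    reasonable only when eps is small compared with the domain size.
    """
    diff = grid_points[:, None, :] - positions[None, :, :]
    r2 = np.sum(diff**2, axis=-1)
    phi = np.exp(-r2 / (2.0 * eps**2)) / (2.0 * np.pi * eps**2)
    return phi @ circulations          # sum over vortices i of Gamma_i * phi_eps
```

Summing the returned field over a uniform grid and multiplying by the cell area should reproduce the total circulation, up to the truncation of the Gaussian tails.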

In (18), the vorticity distribution at any time depends on the point vortices $X_t^i$ through the vortex blobs. In our numerical simulations, we take $N \sim 10^4$ point vortices; following a mean-field approach, the initial circulation of every point vortex is derived from $u_1^0$ and is equal to $\Gamma_0^i = \frac{1}{2\delta N}$. We solved the point vortex model (8) using a second-order Runge-Kutta scheme coupled with a Heun technique for the noise, for a second-order time-discrete approximation; the time step for our simulations was selected to be $\Delta t \sim 10^{-3}$ to ensure a trade-off between the stability of our method and the generation of vortex-like structures in the shear flow model.

As usual, we recall that in our numerical framework the kernel *K* in (13) is the Biot-Savart kernel: $K = \nabla^{\perp} G = (\partial_2 G, -\partial_1 G)$, where *G* is the Green function on $\mathbb{T}^2$. In the whole plane we have the simple expression $G_{\mathbb{R}^2}(x) = \frac{1}{2\pi} \log |x|$, while for our domain we know that

$$G(\mathbf{x}) = \frac{1}{2\pi} \log|\mathbf{x}| + s(\mathbf{x}), \ \forall \mathbf{x} \in \mathbb{T}^2 \ \backslash \{0\}, \tag{19}$$

and $s(x)$ is a smooth function on $\mathbb{T}^2$. Thus, *K* is divergence-free, smooth away from the origin, and symmetrical; moreover, it has the following behaviour:

$$|K(\mathbf{x})| \sim 1/|\mathbf{x}|, \text{ as } |\mathbf{x}| \to 0,$$

which we extensively use to approximate our kernel with $K_{\mathbb{R}^2}(x - y) = \frac{1}{2\pi} \frac{(x-y)^{\perp}}{|x-y|^{2}}$ throughout the numerical simulations. From here on, without ambiguity, we take the horizontal and vertical axes as our reference frame and call them the x-axis and y-axis, as usual.

#### **4.1.1 The Role of Intrinsic Instability**

We know that, at least formally, the vector field $u = u^0$ is a solution of the Euler equation (14) with *ν* = 0. This system is unstable: small perturbations rapidly develop vortex blobs. As such, we consider the system of point vortices $(X_t^i)_i$ with initial vorticity derived from (17) and expressed as

$$\omega_0(x_1, x_2) := \frac{1}{N} \sum_{i} \frac{1}{2\delta} \delta_{X_0^i}(x_1, x_2), \tag{20}$$

where the circulation of each point vortex is equal to $\frac{1}{2\delta N}$ and the initial positions $X_0^i$, $i = 1, \dots, N$, are uniformly distributed on the strip $[-1, 1] \times [-\delta, \delta]$. The randomly generated initial condition represents small perturbations in the system and is responsible for the different pattern formation.
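The random initial configuration behind (20) can be generated along the following lines. The function name and the use of NumPy's `Generator` are choices made for this sketch; the strip and the circulation per vortex are those stated in the text.

```python
import numpy as np

def initial_vortices(n, delta, rng):
    """Sample n point vortices uniformly on the strip [-1, 1] x [-delta, delta].

    Returns the positions, shape (n, 2), and the equal circulations
    1 / (2 * delta * n) used in (20).
    """
    x1 = rng.uniform(-1.0, 1.0, size=n)
    x2 = rng.uniform(-delta, delta, size=n)
    circulations = np.full(n, 1.0 / (2.0 * delta * n))
    return np.column_stack([x1, x2]), circulations

# Example with the strip half-width of the paper and a reduced number of vortices.
positions, circulations = initial_vortices(1000, 0.02, np.random.default_rng(0))
```

Different seeds give different small perturbations of the same strip, which is what selects the number and position of the vortex-like structures that later develop.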

The measure $\omega_t^N := \frac{1}{N} \sum_i \delta_{X_t^i}$ converges, in distribution, to the scalar vorticity field solving the Euler equation (14) with *ν* = 0; in analogy with the continuous case, we see in Fig. 1a, c the development of instability in the form of macroscopic vortex-like structures on the boundary between the two fluid layers.

Note that the number of such macroscopic vortex-like structures and their position is entirely dependent on the initial condition: small perturbations on the randomly generated point vortices can produce entirely different macroscopic vortex-like structures, hence the instability of the two laminar fluids' profile.

#### **4.1.2 The Role of Viscosity and Stability Restoration**

The exact solution of the Navier-Stokes equation (14), with *ν >* 0 and initial condition *u*0, is given by

$$
u(t, x_1, x_2) = \left(u_1(t, x_2), 0\right),
$$

where *u*<sup>1</sup> *(t,x*2*)* solves the heat equation,

$$
\partial_t u_1 = \nu \partial_{x_2}^2 u_1, \tag{21}
$$

$$
u_1(0, x_2) = u_1^0(x_2).
$$

Due to the spreading of the profile $u_1^0$, the solution becomes more stable; namely, the development of macroscopic vortex-like structures is delayed. In our numerical simulations, we reproduce this phenomenon by perturbing the system (8) through independent Brownian motions $B_t^i$, $i = 1, \dots, N$, with variance linked to the viscosity parameter: $\mathrm{Var}(B_t^i) \sim \sqrt{\nu}$. This system converges to the exact solution as *N* → ∞; however, in our numerical study we are dealing with a finite system. For this reason, the profile of the strip remains quite stable for short times, with just a spread along the y-axis. We report in Fig. 2a, b a single configuration at two different times, *t* = 50 and *t* = 100; we take $\sqrt{\nu} = 0.095$ to better focus on the stability restoration. Our results are in agreement with the theory: from a comparison with Fig. 1b, c, we see that when *ν >* 0, the profile is much more stable and diffused than in the deterministic case, and macroscopic vortex-like structures appear only at large times.

## *4.2 Numerical Results on Environmental Noise*

#### **4.2.1 Selection of Divergence Free Field**

Starting with the same initial condition (17), we consider *N* + *M* point vortices, each of which we associate with a position in T<sup>2</sup> in the following way:

$$X\_t^1, \ldots, X\_t^N, Y^1, \ldots, Y^M.$$

Here, the vortices $Y^i$, $i = 1, \dots, M$, do not move; when their interactions with the point vortices become non-negligible, they represent the feedback of small-scale turbulence acting on the large scales of the fluid itself.

**Fig. 2** *ν >* 0. (**a**): iteration *t* = 50, preservation of strip profile; (**b**): iteration *t* = 100, first development of the strip profile's instability, formation of large rotating structures

The new simulated vortex dynamics for $X_t^i$ in T<sup>2</sup>, as in (13), reads

$$dX_t^i = \frac{1}{N} \sum_{i' \neq i} \Gamma_{i'} K\left(X_t^i - X_t^{i'}\right) dt + \sum_j \sigma_j(X_t^i) \circ dW_t^j$$

with periodic boundary conditions, where $W_t^j$ are independent one-dimensional Brownian motions acting simultaneously on all the particles $i = 1, \dots, N$. The environmental noise follows the Stratonovich integral prescription, automatically implemented in Heun's method [35].

We choose the divergence-free vector fields *σj* as

$$\sigma_j(X_t^i) := a_j^{N,M} K\left(X_t^i - Y^j\right), \quad j = 1, \ldots, M$$

following the theoretical analysis performed in [22, 20]. Here, the intensities $a_j^{N,M}$ are linked to the scaling limit process, which produces a viscosity term on the large scales, and *K*, the Biot-Savart kernel, simulates the action of such small vortices. The idea behind such a selection is that we want to exploit the same features of the vortex model, with the formation of small-scale vortex structures generating feedback on the entire configuration. In the limit, the dynamics of such small structures, modulated through a Brownian motion, rebound on large scales, perturbing their motion with their dissipative properties and delaying the formation of the instability.
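To make the role of the common noise concrete, here is a hedged sketch of a stochastic Heun (predictor-corrector) step for the noise term only, with $\sigma_j(x) = a \, K(x - Y^j)$ and a single intensity `a` for all fixed vortices. The drift is omitted for brevity, the full-plane kernel with $(a, b)^{\perp} = (-b, a)$ is assumed, and all names are hypothetical; the point is that one vector of increments `dW` is shared by every moving vortex.

```python
import numpy as np

def kernel(diff):
    """Full-plane Biot-Savart kernel applied to displacements of shape (..., 2)."""
    r2 = np.sum(diff**2, axis=-1, keepdims=True)
    perp = np.stack([-diff[..., 1], diff[..., 0]], axis=-1)
    return perp / (2.0 * np.pi * r2)

def noise_increment(x, y_fixed, a, dW):
    """Sum over j of a * K(x^i - Y^j) * dW_j for every moving vortex i."""
    diff = x[:, None, :] - y_fixed[None, :, :]
    return a * np.einsum("j,ijk->ik", dW, kernel(diff))

def heun_noise_step(x, y_fixed, a, dt, rng):
    """Heun step for dX = sum_j sigma_j(X) o dW^j (Stratonovich), no drift.

    The same increments dW (one per fixed vortex) act on all moving vortices.
    """
    dW = rng.normal(0.0, np.sqrt(dt), size=len(y_fixed))
    first = noise_increment(x, y_fixed, a, dW)
    predictor = x + first
    return x + 0.5 * (first + noise_increment(predictor, y_fixed, a, dW))
```

Averaging the noise coefficient between the current point and the predictor is what makes the scheme consistent with the Stratonovich interpretation; with a single fixed vortex, for instance, the distance of a moving vortex from it is preserved to higher order than with a plain Euler step.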

#### **4.2.2 Positions and Intensities of Fixed Vortices**

In order to connect with previous studies [20], we choose the positions of the fixed vortices $Y^j$ and their intensities $a_j^{N,M}$ according to the convergence of the scaling limit (Sect. 4.2.1). More precisely, at each timestep, we generate uniformly distributed point vortices $Y^j$, $j = 1, \dots, M$; their position on the y-axis is selected a priori in the interval $[-\delta_{FX}, \delta_{FX}]$. In this setup, the vortices $Y^j$ are generated in a strip of variable height $2\delta_{FX}$; this strip contains the moving vortices $X_t^i$, and it is taken to be of the same height as our boundary fluid layers, or one order of magnitude greater. This choice emphasizes that our proposed "small-scale" structures should act on all points of the fluid in all directions: the average contribution of the $Y^j$ on the $X_t^i$ along every direction should mimic a Brownian motion. We explored different setups of positions and intensities; we selected meaningful realizations, as reported in Table 1.

**Table 1** Parameters of the discussed realizations


We choose the intensity of the "small-scale" perturbations following heuristic considerations. We consider the mean inter-particle distance between two fixed point vortices, $\langle r \rangle := \sqrt{1/m}$, where $m := M/A$ is the particle density and *A* is the total area occupied by the *M* vortices.

Then, we focus on a single moving vortex, $X_t^i$; we compute the magnitude of its velocity when $X_t^i$ is at a distance $d = \langle r \rangle / 2$ from the nearest fixed vortex, i.e., its position is halfway between two fixed vortices. As a result, we obtain the following estimate for the velocity of $X_t^i$:

$$\sum_{j} a_j^{N,M} \frac{1}{4\pi} \frac{|(X_t^i - Y^j)^\perp|}{|X_t^i - Y^j|^2} \sim \frac{1}{4\pi} \sum_{j} \frac{a_j^{N,M}}{d} \sim \frac{a^N}{4\pi} \sum_{j} \frac{1}{d},$$

where we suppose that the coefficients depend on the configuration $(X_t^i)_i$ and are equal for each $j = 1, \dots, M$. What is left is to estimate the number of fixed vortices whose interaction with $X_t^i$ is not negligible: calling this number *K*, the velocity estimate becomes $\sim \frac{K a^N}{4\pi d}$.

Thus, using the theory from the scaling limit of environmental transport noise (see e.g. [22, 20, 23]) and the construction of Sect. 3.2 for point vortices with transport noise, we assume that

$$\nu \sim \frac{1}{2} \left( \frac{K a^N}{4\pi d} \right)^2.$$

This leads us to the estimate for the intensity of the fixed vortices

$$a^N \sim 2\sqrt{2}\pi \frac{\sqrt{A}}{\sqrt{M}} \frac{\sqrt{\nu}}{K}$$

It remains to estimate *K*, the number of nearest fixed vortices: consider a ball centered at $X_t^i$ with radius *d*, so that its area is $A_{near} = \pi d^2$. Recalling that *m* is the density of the fixed vortices, the number of nearest vortices is

$$m \times A\_{near} = \frac{M}{A} A\_{near} = \frac{\pi}{4}.$$

Taking into account only the nearest vortices, we underestimate the actual contribution of all the vortices. In particular, we should compute such a contribution by considering a radius that depends on the effective range of the Biot-Savart kernel. In fact, since $K \sim \frac{x^{\perp}}{|x|^{2}}$, the contribution of distant particles is negligible. For this reason, we empirically selected a wider radius *αd*, with *α* ∼ 3. In conclusion, we get our estimate for the intensity

$$a^N \sim \frac{8\sqrt{2}}{3}\sqrt{\nu}\frac{\sqrt{A}}{\sqrt{M}}.$$
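The heuristic can be turned into a small calculator. The sketch below implements the intermediate estimate $a^N \sim 2\sqrt{2}\pi \sqrt{A/M} \, \sqrt{\nu} / K$, with *M* the number of fixed vortices; the default `k_near = 3π/4` is simply the value of *K* implied by the final prefactor $8\sqrt{2}/3$. The function name is hypothetical, and reusing $\sqrt{\nu} = 0.095$ from the viscous run for the common-noise runs is an assumption of this example.

```python
import math

def fixed_vortex_intensity(nu, area, m_fixed, k_near=3.0 * math.pi / 4.0):
    """Heuristic intensity of the fixed vortices.

    nu: target viscosity, area: area A occupied by the fixed vortices,
    m_fixed: number M of fixed vortices, k_near: effective number K of fixed
    vortices interacting with a moving one (default implied by 8*sqrt(2)/3).
    """
    mean_spacing = math.sqrt(area / m_fixed)          # <r> = sqrt(1/m), m = M/A
    return 2.0 * math.sqrt(2.0) * math.pi * mean_spacing * math.sqrt(nu) / k_near

# Strip [-1, 1] x [-0.1, 0.1] (area 0.4) with M = 2e5 fixed vortices.
a_first = fixed_vortex_intensity(0.095**2, 0.4, 2e5)
```

Under these assumptions the result is close to the value $5 \cdot 10^{-4}$ quoted for the first simulation of Sect. 4.2.3.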

#### **4.2.3 Effect of Small Scale Common Noise**

As recalled in the heuristics of the previous paragraph, the procedure of the scaling limit is preserved when both *N* and *M* are large and the intensity $a_j^{N,M}$ is small. For this reason, we do not look for exactly the same solution of the Navier-Stokes equation (14), *ν >* 0, with initial condition $u^0$, as in the case of the independent noise. However, since the regime tends, in the limit, to the same solution, we expect a diffusive effect on the strip of point vortices. More precisely, we expect to see a delay in the formation of macroscopic structures and a more dispersed displacement of such small vortex blobs that, on average, should maintain the strip configuration for a longer time.

In the first of our simulations, we generate at each time step $M \sim 2 \cdot 10^5$ fixed vortices, with intensity $a_j^{N,M} \sim 5 \cdot 10^{-4}$, which follows from the heuristics. The fixed vortices are uniformly distributed in a strip $[-1, 1] \times [-0.1, 0.1]$ containing the initial point vortex configuration. This particular setup captures the feedback effect of small scales on large vorticity structures, as the contribution of all the low-intensity perturbations on the dynamics of the point vortices averages in every direction. In the proposed numerical simulation, we show that the transport noise model reproduces the desired instability delay, even if it is slightly less effective than in the independent noise case; we illustrate a snapshot of a configuration for time *t* = 50 and *t* = 100 in Fig. 3a, b.

By comparison with Figs. 1b and 2a, we see that the initial strip configuration is preserved for a longer time than in the deterministic case and rotation of the fluid is milder, but the profile is less stable than in the viscous regime. If we focus on the deterministic case, we see blob-like structures formation already at *t* = 50; in contrast, in the transport noise regime, those structures are less visible and appear more prominently only at the end of our simulation (*t* = 100). This delay of the instability is evident in the realizations in Fig. 3b, compared with Figs. 1c and 2b: we notice a more diffused and homogeneous profile and a delayed formation of rotational structures due to the noise spreading the particles along the y-axis. A difference with the viscous case is that the compression in the x-axis is stronger than in the case of the independent noise, resulting in a more prominent stretch, which could resemble more the deterministic formations, placing the transport noise as a midpoint between the two regimes.

In the second of our simulations, the strip of fixed points $Y^j$ is generated in the same region as the point vortices at each timestep; we selected $M \sim 1.32 \cdot 10^5$, the fixed vortices are uniformly distributed in $[-1, 1] \times [-\delta - 0.03, \delta + 0.03]$, and their intensity, derived from the heuristics, is $a_j^{N,M} \sim 5 \cdot 10^{-4}$. The results are shown in Fig. 4: diffusion along the y-axis is present for short times, and preservation of the strip profile is guaranteed. However, the drawback of such a configuration is that the analysis can be performed only for a short time: boundary effects of the fixed vortex strip can deteriorate the configuration, making the results unrealistic. In future work, we expect to overcome this obstacle by proposing a new method, currently under study, to generate small vortices only in regions activated by the shear flow's movement.


We performed a final simulation in which the density of fixed vortices is smaller than the density of the point vortices. As before, the strip of fixed points $Y^j$ is generated in the same region as the point vortices at each time step; we selected $M \sim 10^3$, the fixed vortices are uniformly distributed in $[-1, 1] \times [-\delta - \varepsilon, \delta + \varepsilon]$, with $\varepsilon = 0.05$, and their intensity is derived from the heuristics, $a_j^{N,M} \sim 5 \cdot 10^{-3}$. As shown in Fig. 5, the diffusive behaviour is lost and the strip is already broken at time $t = 50$, showing rotating structures. The low density and higher intensity of the fixed vortices seem to produce new formations and medium-scale structures, a completely different behaviour from that of the deterministic and viscous counterparts. For this reason, the link between the ratio of vortex densities and the formation of new independent medium-scale structures needs further investigation.

## *4.3 Diagnostics*

In this section, we perform a statistical analysis of the three configurations proposed in this study, to highlight the differences between them and the recovered stability, or the emergence of new structures, in the Kelvin-Helmholtz instability problem.

As a first step, we compute the vorticity $\omega$ using the same mollification as in Majda [4], that is, the vortex blob method applied to each of the point vortices $X^i_t$. The vorticity computed in the deterministic case at $t = 100$ is reported in Fig. 6a. The vorticity concentrates near the fluid's boundary layer, inside the newly developed macroscopic structures. Moreover, the profile is displaced from the initial configuration in which the laminar fluid started its evolution.
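To make the mollification step concrete, the following sketch evaluates a blob-regularised vorticity on a grid. It is an illustration under stated assumptions: the Gaussian kernel, the blob width `delta` and the function name `blob_vorticity` are hypothetical choices, and the mollifier used in [4] and in the simulations may differ.

```python
import numpy as np

def blob_vorticity(points, intensities, grid_x, grid_y, delta):
    """Sum of Gaussian blobs of width delta centred at the point vortices.

    Each blob integrates to the intensity of its vortex, since the kernel
    exp(-|z|^2 / delta^2) / (pi * delta^2) has unit mass in two dimensions.
    """
    gx, gy = np.meshgrid(grid_x, grid_y, indexing="ij")
    omega = np.zeros_like(gx)
    for (px, py), gamma in zip(points, intensities):
        r2 = (gx - px) ** 2 + (gy - py) ** 2
        omega += gamma * np.exp(-r2 / delta**2) / (np.pi * delta**2)
    return omega

# A single unit vortex at the origin, evaluated on a uniform grid.
grid = np.linspace(-2.0, 2.0, 201)
omega = blob_vorticity(np.array([[0.0, 0.0]]), np.array([1.0]), grid, grid, delta=0.2)
total_mass = omega.sum() * (grid[1] - grid[0]) ** 2
```

The loop over vortices keeps the memory footprint small; for many vortices a vectorised or tree-based evaluation would be the natural replacement.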

In contrast, the vortex blob solution with viscosity *ν >* 0 retains its structure for longer times than in the inviscid case. The instability delay is graphically evident both from the configuration reported in Fig. 2a and the vorticity intensity reported in Fig. 6b: the vorticity measure concentration at time *t* = 100 is similar to the one of the initial strip but with a more diffused profile on the horizontal line.

Finally, Fig. 6c shows the vorticity in the environmental noise regime at $t = 100$, computed from the configuration in Fig. 3b. In contrast with the inviscid case, we see no development of macroscopic structures in the profile. However, even though the profile is more diffuse, with lower density overall, the stability of the strip is lost at larger times, in contrast to Figs. 6b and 2b. This instability at larger times suggests that different behaviours, depending on the density of the fixed vortices $Y^j$ and on the selection of the transport noise fields $\sigma_j$, could arise when applying this kind of small-scale approximation. For this reason, we focus our analysis on small-time behaviour.

Concerning the formation of large rotating structures, we see that the particles spread in the horizontal direction when a forcing term, either independent or transport noise, acts on the fluid, in contrast to the solution of the Euler equation with $\nu = 0$. In particular, we focus on the empirical density of the x-positions in the three configurations at $t = 100$, Fig. 7. The deterministic case of Fig. 7a shows a complete formation of separate blobs, with peaks in the exact locations of the macroscopic structures, as in Fig. 1c. On the contrary, in the viscous case of Fig. 7b and the transport noise case of Fig. 7c, the distribution of the vortices is more uniform, consistent with the delayed instability of the fluid layers.

**Fig. 6** Vorticity *ω* at *t* = 100 in the case (**a**) inviscid, (**b**) viscous and (**c**) transport noise, showing macroscopic structures formation or delay of instability

**Fig. 7** Histograms of x-positions and empirical density at *t* = 100 in the case (**a**) inviscid, (**b**) viscous and (**c**) transport noise, showing formation of macroscopic structures

Following theoretical results, we know that with our initial condition $\omega_0$, the velocity $u_t$ solves (21) when $\nu > 0$. This is crucial to understanding our system's short-time behaviour and the preservation of the initial configuration. This result states that the empirical density of the y-positions, when viscosity is present, maintains a Gaussian profile through time. In contrast, when $\nu = 0$ the vorticity follows the classical Euler equation. As such, the y-position profile is far from Gaussian: it behaves like a multi-modal distribution, concentrated near the centers of the large structures. This is supported both by the profile of the particle system and by Figs. 8a and 9a; the qq-plot shows a distinct behaviour for small quantiles. The Kolmogorov-Smirnov test gives a D-statistic of 0.031 with a p-value less than $10^{-9}$, so the Gaussianity hypothesis is rejected.

Conversely, when viscosity is present, i.e. $\nu > 0$, as in the case of independent Brownian motions, the diffusive behaviour of the noise restores the profile in the y-direction: the strip configuration is preserved for a longer time. From the profile of the point vortex system and Figs. 8b and 9b, we see that the empirical density is well approximated by a Gaussian kernel. Moreover, the qq-plot shows a close match with a normal distribution, indicating that the strip is preserved through time at the cost of more spread along the y-axis. Finally, a Kolmogorov-Smirnov test gives a D-statistic of 0.005 and a p-value greater than 0.9, so the normality hypothesis is not rejected. This behaviour is preserved throughout the simulation, degrading only at longer times when a few large formations start to arise, as in Fig. 2b.

**Fig. 8** Histograms of y-positions and empirical density at *t* = 50 in the case (**a**) inviscid, (**b**) viscous and (**c**) transport noise, showing different density profiles

**Fig. 9** qq-plot of the empirical density in the y-positions, at *t* = 50 in the case (**a**) inviscid, (**b**) viscous and (**c**) transport noise

In the transport noise case, we know that applying a particular scaling limit procedure recovers the same viscous Euler equation that is solved in the independent Brownian motion case. Accordingly, we expect the more stable profile shown in Fig. 3a to present the same diffusion in the density of the y-positions as in the independent noise case. This is indeed the case, as reported in Figs. 8c and 9c: at short times, $t = 50$, the profile still approximates a Gaussian kernel. We perform a qq-plot and a Kolmogorov-Smirnov test on the y-positions, obtaining a D-statistic of 0.011 and a p-value of 0.312; these results support the stability of the profile and the validity of our hypothesis.
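The Gaussianity diagnostics used in this section can be sketched with SciPy as below. The text does not state how the reference normal distribution was parametrised, so standardising the sample by its own mean and standard deviation is an assumption (it makes the reported p-value conservative); the function name `gaussianity_check` and the synthetic samples are hypothetical stand-ins for the simulated y-positions.

```python
import numpy as np
from scipy import stats

def gaussianity_check(y):
    """Return the KS D-statistic, its p-value and the qq-plot coordinates of y."""
    y = np.asarray(y, dtype=float)
    z = (y - y.mean()) / y.std(ddof=1)           # assumption: fit mean and variance first
    ks = stats.kstest(z, "norm")                 # one-sample test against N(0, 1)
    (theoretical_q, sample_q), _ = stats.probplot(z, dist="norm")
    return ks.statistic, ks.pvalue, theoretical_q, sample_q

# Synthetic stand-ins: a diffused strip (Gaussian) and two separated blobs (bimodal).
rng = np.random.default_rng(3)
strip_like = rng.normal(0.0, 0.05, size=5000)
blob_like = np.concatenate([rng.normal(-2.0, 0.3, 2500), rng.normal(2.0, 0.3, 2500)])

d_strip, p_strip, q_theory, q_sample = gaussianity_check(strip_like)
d_blob, p_blob, _, _ = gaussianity_check(blob_like)
```

Plotting `q_sample` against `q_theory` gives the qq-plot; points far from the diagonal in the tails correspond to the departures from normality discussed above.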

However, later on ($t = 100$), even though the KS test still does not reject Gaussianity, as in the viscous case, with a D-statistic of 0.008 and a p-value of 0.48, the profile degrades, as shown in Fig. 3b. These results show that the strip configuration is not preserved for longer times, due to the stretching induced by the transport noise acting on the point vortices. This can be seen in the tail of the distribution in Fig. 9c, which already at $t = 50$ behaves differently from the viscous case in Fig. 9b. This suggests that the effect of the environmental noise is not only related to the diffusivity of the strip, but is also responsible for the stretching and formation of different structures.

## **5 Concluding Remarks**

In the present work, we have produced numerical simulations of 2D incompressible fluids, including fluids perturbed by transport noise, using the point vortex method. We focused on the special case of a shear flow that produces a Kelvin-Helmholtz instability, in order to test the dissipative properties of small-space-scale transport noise. We compared the intrinsic instability generated in the deterministic case with the possible recovery of stability through noise injected into the system in the form of transport noise. We showed that, for short times and to a lesser degree than in the viscous case, the stability of the strip can be maintained at the expense of more small-scale irregularity and diffusion of the profile.

**Acknowledgments** We greatly thank anonymous referees for several improvements suggested in their reports.

The research of the first author is funded by the European Union (ERC, NoisyFluid, No. 101053472). Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **On the 3D Navier-Stokes Equations with Stochastic Lie Transport**

**Daniel Goodair and Dan Crisan**

**Abstract** We prove the existence and uniqueness of maximal solutions to the 3D SALT (Stochastic Advection by Lie Transport) Navier-Stokes Equation in velocity and vorticity form, on the torus and the bounded domain respectively. In particular we demonstrate the efficacy of Goodair et al. (Existence and Uniqueness of Maximal Solutions to SPDEs with Applications to Viscous Fluid Equations, 2023. Stochastics and Partial Differential Equations: Analysis and Computations, pp.1-64) in showing the well-posedness for both the velocity and vorticity form of the equation, as well as obtaining the first analytically strong existence result for a fluid equation perturbed by Lie transport noise on a bounded domain.

## **1 Introduction**

The theoretical analysis of Stochastic Navier-Stokes Equations dates back to the work of Bensoussan and Temam [4] in 1973, where the problem of existence of solutions is addressed in the presence of a random forcing term. The well-posedness question for additive and multiplicative noise has since seen significant developments, for example through the works [1, 7, 24, 29, 39, 42, 43] and references therein. The interest in this problem has expanded into analytical properties of these solutions, particularly along the lines of ergodicity, which can be seen in [17, 18, 22, 26, 27, 34]. In the present work our concern is the Navier-Stokes Equations with Stochastic Lie Transport, derived through the principle of Stochastic Advection by Lie Transport (SALT) introduced in [35]. We consider the equation

$$u_t - u_0 + \int_0^t \mathcal{L}_{u_s} u_s \, ds - \nu \int_0^t \Delta u_s \, ds + \int_0^t B u_s \circ d\mathcal{W}_s + \nabla \rho_t = 0 \tag{1}$$

D. Goodair (✉) · D. Crisan

Imperial College London, London, UK; e-mail: djg116@ic.ac.uk

where $u$ represents the fluid velocity, $\rho$ the pressure,<sup>1</sup> $\mathcal{W}$ is a Cylindrical Brownian Motion, $\mathcal{L}$ represents the nonlinear term and $B$ is a first order differential operator (the SALT Operator) formally addressed in Sect. 2.3. Intrinsic to this stochastic methodology is that $B$ is defined relative to a collection of functions $(\xi_i)$ which physically represent spatial correlations. These $(\xi_i)$ can be determined at coarse-grain resolutions from finely resolved numerical simulations, and mathematically are derived as eigenvectors of a velocity-velocity correlation matrix (see [10, 11, 12]). We pose the equation (1) in 3 dimensions and impose the divergence-free constraint on $u$. We shall consider the problem both over the torus $\mathbb{T}^3$ and a smooth bounded domain $\mathscr{O} \subset \mathbb{R}^3$. In the case of the torus we supplement the equation with the zero-average condition (as is classical), whilst for the bounded domain we impose the boundary condition

$$
u \cdot n = 0, \qquad w = 0 \tag{2}
$$

where $n$ represents the outwards unit normal at the boundary, and $w$ the fluid vorticity. These are the so-called Lions boundary conditions, considered in [38] and shown to be a particular case of the Navier boundary conditions in [36] (note that this is done in 2D, whilst a treatment of the Navier boundary conditions in 3D can be found in [28]). The significance of such a boundary condition is well documented in that work by Kelliher, and can be seen in other works such as [28] where the boundary layer is explicitly addressed. The precise mathematical interpretation of these conditions, and of the operators in (1), is explicated in Sect. 2.2. A complete derivation of this equation can be found in [45].

This work continues the theoretical development of fluid models perturbed by a transport type noise, the significance of which was posed as early as 1992 in the paper [6]. The area has garnered substantial attention in more recent years with the seminal works [35, 41], in which the authors establish a new class of stochastic equations driven by transport type noise which serve as fluid dynamics models by adding uncertainty in the transport of fluid parcels to reflect the unresolved scales. This paper partners that of [32], where we showed the existence and uniqueness of maximal solutions to Stochastic Partial Differential Equations satisfying an abstract framework, built to cope with a general transport type noise as we see in (1). The importance of such equations in modelling, numerical schemes and data assimilation is reviewed there, along with the theoretical developments of these equations: let us briefly mention some interesting results [3, 5, 21, 23]. We only draw particular attention here to the Navier-Stokes Equations, and results on a bounded domain. The Navier-Stokes Equations have been studied with transport type noise, for example in the works [14, 19, 25, 43], though typically solutions are analytically weak, and where strong solutions are considered major concessions in the noise are made. In these cases a cancellation property is evident in the noise term, so the resulting energy balance is formally the same as for the deterministic equation without noise. These difficulties have been addressed on the torus, in the likes of the papers [13, 37] and those further addressed in [32], but extending a control of this noise term to a bounded domain remains open. Furthermore, whilst the presence of viscosity does improve the solution theory, it invites additional challenges in controlling the noise. Energy methods require non-standard Sobolev inner products to conduct the required integration by parts for the viscous term in the bounded domain, so we must provide novel estimates on the transport type noise in these inner products. The problem of analytically strong solutions to fluid equations perturbed by a transport type noise in the bounded domain has been considered in [8], though the authors assume that the gradient dependency is of a small enough size to be directly controlled and that the noise terms are traceless under Leray Projection; such assumptions are designed to circumvent the technical difficulties of a first order noise operator on a bounded domain, and are a luxury that the SALT equations do not have.

<sup>1</sup> The pressure term is a semimartingale, and an explicit form for the SALT Euler Equation is given in [45] Subsection 3.3.

The goal of this paper is to apply the abstract framework established in [32], providing a rigorous justification of the results first announced in [31] and extending them to the vorticity form of Eq. (1) on a bounded domain. The purpose of this is twofold:


In the interests of brevity we provide a shorter manuscript here, though greater detail can be found in the former arXiv version [33]. In Sect. 2 we establish the stochastic and functional framework necessary to understand (1), along with fundamental properties of the operators involved. In Sect. 3 we make precise how equation (1) fits into the framework of [32], as a problem posed on the torus $\mathbb{T}^3$. In Sect. 4 we consider the vorticity form of equation (1) as a problem posed on a bounded domain of $\mathbb{R}^3$. We again justify the assumptions in [32] to prove existence and uniqueness of maximal solutions to this equation. Additional details for the proofs are given in Sect. 5, along with the results of the partnering paper [32].

## **2 Preliminaries**

## *2.1 Elementary Notation*

In the following $\mathcal{O}$ can represent both the 3-dimensional torus $\mathbb{T}^3$ and a smooth bounded domain $\mathscr{O} \subset \mathbb{R}^3$. We consider Banach Spaces as measure spaces equipped with the Borel $\sigma$-algebra, and use $\lambda$ to represent the Lebesgue Measure. Let $(\mathcal{X}, \mu)$ denote a general measure space, $(\mathcal{Y}, \|\cdot\|_{\mathcal{Y}})$ and $(\mathcal{Z}, \|\cdot\|_{\mathcal{Z}})$ be Banach Spaces, and $(\mathcal{U}, \langle \cdot, \cdot \rangle_{\mathcal{U}})$, $(\mathcal{H}, \langle \cdot, \cdot \rangle_{\mathcal{H}})$ be general Hilbert spaces. $\mathcal{O}$ is equipped with the Euclidean norm.

• $L^p(\mathcal{X};\mathcal{Y})$ is the class of measurable $p$-integrable functions from $\mathcal{X}$ into $\mathcal{Y}$, $1 \le p < \infty$, which is a Banach space with norm

$$\|\phi\|_{L^p(\mathcal{X};\mathcal{Y})}^p := \int_{\mathcal{X}} \|\phi(x)\|_{\mathcal{Y}}^p \, \mu(dx).$$

In particular $L^2(\mathcal{X};\mathcal{Y})$ is a Hilbert Space when $\mathcal{Y}$ itself is Hilbert, with the standard inner product

$$
\langle \phi, \psi \rangle_{L^2(\mathcal{X};\mathcal{Y})} = \int_{\mathcal{X}} \langle \phi(x), \psi(x) \rangle_{\mathcal{Y}} \, \mu(dx).
$$

In the case $\mathcal{X} = \mathcal{O}$ and $\mathcal{Y} = \mathbb{R}^3$ note that

$$\|\phi\|_{L^2(\mathcal{O};\mathbb{R}^3)}^2 = \sum_{l=1}^{3} \|\phi^l\|_{L^2(\mathcal{O};\mathbb{R})}^2, \qquad \phi = (\phi^1, \phi^2, \phi^3), \quad \phi^l : \mathcal{O} \to \mathbb{R}.$$

We denote $\|\cdot\|_{L^p(\mathcal{O};\mathbb{R}^3)}$ by $\|\cdot\|_{L^p}$ and $\|\cdot\|_{L^2(\mathcal{O};\mathbb{R}^3)}$ by $\|\cdot\|$.

• $L^\infty(\mathcal{X};\mathcal{Y})$ is the class of measurable functions from $\mathcal{X}$ into $\mathcal{Y}$ which are essentially bounded, which is a Banach Space when equipped with the norm

$$\|\phi\|_{L^\infty(\mathcal{X};\mathcal{Y})} := \inf\{C \ge 0 : \|\phi(x)\|_{\mathcal{Y}} \le C \text{ for } \mu\text{-a.e. } x \in \mathcal{X}\}.$$

• $L^\infty(\mathcal{O};\mathbb{R}^3)$ is the class of measurable functions from $\mathcal{O}$ into $\mathbb{R}^3$ such that $\phi^l \in L^\infty(\mathcal{O};\mathbb{R})$ for $l = 1, 2, 3$, which is a Banach Space when equipped with the norm

$$\|\phi\|_{L^\infty} := \sup_{l \le 3} \|\phi^l\|_{L^\infty(\mathcal{O};\mathbb{R})}.$$


• $C^m(\mathcal{O};\mathbb{R})$ is the space of functions $\phi : \mathcal{O} \to \mathbb{R}$ such that for every multi-index $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ with $|\alpha| \le m$, $D^\alpha \phi \in C(\mathcal{O};\mathbb{R})$, where $D^\alpha$ is the corresponding classical derivative operator $\partial_{x_1}^{\alpha_1} \partial_{x_2}^{\alpha_2} \partial_{x_3}^{\alpha_3}$.

• $W^{m,p}(\mathcal{O};\mathbb{R})$ for $1 \le p < \infty$ is the sub-class of $L^p(\mathcal{O};\mathbb{R})$ which has all weak derivatives up to order $m \in \mathbb{N}$ also of class $L^p(\mathcal{O};\mathbb{R})$. This is a Banach space with norm

$$\|\phi\|_{W^{m,p}(\mathcal{O};\mathbb{R})}^p := \sum_{|\alpha| \le m} \|D^\alpha \phi\|_{L^p(\mathcal{O};\mathbb{R})}^p$$

where $D^\alpha$ is the corresponding weak derivative operator. In the case $p = 2$ the space $W^{m,2}(\mathcal{O};\mathbb{R})$ is Hilbert with inner product

$$\langle \phi, \,\psi \rangle\_{W^{m,2}(\mathcal{O}; \mathbb{R})} := \sum\_{|\alpha| \le m} \langle D^{\alpha}\phi, D^{\alpha}\psi \rangle\_{L^2(\mathcal{O}; \mathbb{R})}.$$

• $W^{m,\infty}(\mathcal{O};\mathbb{R})$ for $m \in \mathbb{N}$ is the sub-class of $L^\infty(\mathcal{O};\mathbb{R})$ which has all weak derivatives up to order $m$ also of class $L^\infty(\mathcal{O};\mathbb{R})$. This is a Banach space with norm

$$\|\phi\|_{W^{m,\infty}(\mathcal{O};\mathbb{R})} := \sup_{|\alpha| \le m} \|D^\alpha \phi\|_{L^\infty(\mathcal{O};\mathbb{R})}.$$

• $W^{m,p}(\mathcal{O};\mathbb{R}^3)$ for $1 \le p < \infty$ is the sub-class of $L^p(\mathcal{O};\mathbb{R}^3)$ which has all weak derivatives up to order $m \in \mathbb{N}$ also of class $L^p(\mathcal{O};\mathbb{R}^3)$. This is a Banach space with norm

$$\|\phi\|_{W^{m,p}}^p := \sum_{l=1}^{3} \|\phi^l\|_{W^{m,p}(\mathcal{O};\mathbb{R})}^p.$$

In the case $p = 2$ the space $W^{m,2}(\mathcal{O};\mathbb{R}^3)$ is Hilbertian with inner product

$$\langle \phi, \psi \rangle_{W^{m,2}} := \sum_{l=1}^{3} \langle \phi^l, \psi^l \rangle_{W^{m,2}(\mathcal{O};\mathbb{R})}.$$

• $W^{m,\infty}(\mathcal{O};\mathbb{R}^3)$ is the sub-class of $L^\infty(\mathcal{O};\mathbb{R}^3)$ which has all weak derivatives up to order $m \in \mathbb{N}$ also of class $L^\infty(\mathcal{O};\mathbb{R}^3)$. This is a Banach space with norm

$$\|\phi\|_{W^{m,\infty}(\mathcal{O};\mathbb{R}^3)} := \sup_{l \le 3} \|\phi^l\|_{W^{m,\infty}(\mathcal{O};\mathbb{R})}.$$

• $\dot{L}^2(\mathbb{T}^3;\mathbb{R}^3)$ is the subset of $L^2(\mathbb{T}^3;\mathbb{R}^3)$ of functions $\phi$ such that

$$\int\_{\mathbb{T}^3} \phi \, d\lambda = 0.$$


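• $\mathscr{L}(\mathcal{Y};\mathcal{Z})$ is the space of bounded linear operators from $\mathcal{Y}$ to $\mathcal{Z}$. This is a Banach Space when equipped with the norm

$$\|F\|_{\mathscr{L}(\mathcal{Y};\mathcal{Z})} = \sup_{\|y\|_{\mathcal{Y}} = 1} \|Fy\|_{\mathcal{Z}}$$

and is simply the dual space $\mathcal{Y}^*$ when $\mathcal{Z} = \mathbb{R}$, with operator norm $\|\cdot\|_{\mathcal{Y}^*}$.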

• $\mathscr{L}^2(\mathcal{U};\mathcal{H})$ is the space of Hilbert-Schmidt operators from $\mathcal{U}$ to $\mathcal{H}$, defined as the elements $F \in \mathscr{L}(\mathcal{U};\mathcal{H})$ such that for some orthonormal basis $(e_i)$ of $\mathcal{U}$,

$$\sum_{i=1}^{\infty} \|Fe_i\|_{\mathcal{H}}^2 < \infty.$$

This is a Hilbert space with inner product

$$\langle F, G \rangle_{\mathscr{L}^2(\mathcal{U};\mathcal{H})} = \sum_{i=1}^{\infty} \langle Fe_i, Ge_i \rangle_{\mathcal{H}}$$

which is independent of the choice of basis.
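The basis independence of the Hilbert-Schmidt norm can be checked numerically in finite dimensions, where it reduces to the Frobenius norm of a matrix. The sketch below is only an illustration: the dimensions, the random operator and the helper name `hilbert_schmidt_norm_sq` are assumptions made for the example.

```python
import numpy as np

def hilbert_schmidt_norm_sq(F, basis):
    """Sum of ||F e_i||^2 over the columns e_i of an orthonormal basis matrix."""
    return sum(np.linalg.norm(F @ basis[:, i]) ** 2 for i in range(basis.shape[1]))

rng = np.random.default_rng(0)
F = rng.standard_normal((4, 6))                          # an operator from R^6 to R^4
standard = np.eye(6)                                     # the canonical basis of R^6
rotated, _ = np.linalg.qr(rng.standard_normal((6, 6)))   # another orthonormal basis

hs_standard = hilbert_schmidt_norm_sq(F, standard)
hs_rotated = hilbert_schmidt_norm_sq(F, rotated)
```

Both values agree with the squared Frobenius norm `np.linalg.norm(F, "fro") ** 2` up to rounding.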

We will consider a partial ordering on the 3-dimensional multi-indices by $\alpha \le \beta$ if and only if for all $l = 1, 2, 3$ we have that $\alpha_l \le \beta_l$. We extend this to the notation $<$ by $\alpha < \beta$ if and only if $\alpha \le \beta$ and for some $l = 1, 2, 3$, $\alpha_l < \beta_l$.
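As a small illustration of this ordering, the sketch below encodes multi-indices as tuples of three non-negative integers; the function names `leq` and `lt` are hypothetical. It also shows that the ordering is only partial, since some pairs are not comparable.

```python
def leq(alpha, beta):
    """alpha <= beta: every entry of alpha is at most the matching entry of beta."""
    return all(a <= b for a, b in zip(alpha, beta))

def lt(alpha, beta):
    """alpha < beta: alpha <= beta with strict inequality in at least one entry."""
    return leq(alpha, beta) and any(a < b for a, b in zip(alpha, beta))

comparable = lt((1, 0, 2), (1, 1, 2))                                  # strictly smaller
incomparable = not leq((2, 0, 0), (0, 2, 0)) and not leq((0, 2, 0), (2, 0, 0))
```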

We also now introduce some less familiar spaces in slightly greater detail. We recall that $\mathscr{O}$ represents a smooth bounded domain in $\mathbb{R}^3$ which we now fix, $\mathbb{T}^3$ is the 3-dimensional torus, and $\mathcal{O}$ freely denotes both $\mathbb{T}^3$ and $\mathscr{O}$.

**Definition 2.1** We define $C^\infty_{0,\sigma}(\mathscr{O};\mathbb{R}^3)$ as the subset of $C^\infty_0(\mathscr{O};\mathbb{R}^3)$ of functions which are divergence-free. $L^2_\sigma(\mathscr{O};\mathbb{R}^3)$ is defined as the completion of $C^\infty_{0,\sigma}(\mathscr{O};\mathbb{R}^3)$ in $L^2(\mathscr{O};\mathbb{R}^3)$, whilst we introduce $W^{1,2}_\sigma(\mathscr{O};\mathbb{R}^3)$ as the intersection of $W^{1,2}_0(\mathscr{O};\mathbb{R}^3)$ with $L^2_\sigma(\mathscr{O};\mathbb{R}^3)$ and $W^{2,2}_\sigma(\mathscr{O};\mathbb{R}^3)$ as the intersection of $W^{2,2}(\mathscr{O};\mathbb{R}^3)$ with $W^{1,2}_\sigma(\mathscr{O};\mathbb{R}^3)$.

Recall that any function $f \in L^2(\mathbb{T}^3;\mathbb{R}^3)$ admits the representation


$$f(\mathbf{x}) = \sum\_{k \in \mathbb{Z}^3} f\_k e^{ik \cdot \mathbf{x}} \tag{3}$$

whereby each $f_k \in \mathbb{C}^3$ is such that $f_k = \overline{f_{-k}}$ and the infinite sum is defined as a limit in $L^2(\mathbb{T}^3;\mathbb{R}^3)$, see e.g. [44] Subsection 1.5 for details.
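The conjugate symmetry of the coefficients of a real-valued function has a direct discrete analogue, which the following sketch checks with the fast Fourier transform. It is an illustration only: the grid size, the random sample and the normalisation by $n^3$ are assumptions, and the discrete coefficients only approximate the $f_k$ of (3).

```python
import numpy as np

n = 8
rng = np.random.default_rng(1)
f = rng.standard_normal((n, n, n))       # one real component sampled on an n^3 grid

f_k = np.fft.fftn(f) / n**3              # discrete analogue of the coefficients f_k
# Reversing every axis and shifting by one places f_{-k} at the index of k.
f_minus_k = np.roll(np.flip(f_k), 1, axis=(0, 1, 2))

symmetry_error = np.max(np.abs(f_k - np.conj(f_minus_k)))
reconstruction_error = np.max(np.abs(np.fft.ifftn(f_k * n**3).real - f))
```

Both errors are at the level of floating point rounding, reflecting that a real field is determined by roughly half of its Fourier coefficients.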

**Definition 2.2** We define $L^2_\sigma(\mathbb{T}^3;\mathbb{R}^3)$ as the subset of $\dot{L}^2(\mathbb{T}^3;\mathbb{R}^3)$ of functions $f$ whereby for all $k \in \mathbb{Z}^3$, $k \cdot f_k = 0$ with $f_k$ as in (3). For general $m \in \mathbb{N}$ we introduce $W^{m,2}_\sigma(\mathbb{T}^3;\mathbb{R}^3)$ as the intersection of $W^{m,2}(\mathbb{T}^3;\mathbb{R}^3)$ with $L^2_\sigma(\mathbb{T}^3;\mathbb{R}^3)$.

Note that $W^{1,2}_\sigma(\mathbb{T}^3;\mathbb{R}^3)$ is precisely the subspace of $W^{1,2}(\mathbb{T}^3;\mathbb{R}^3)$ consisting of zero-average divergence-free functions. Similarly $W^{1,2}_\sigma(\mathscr{O};\mathbb{R}^3)$ is precisely the subspace of $W^{1,2}_0(\mathscr{O};\mathbb{R}^3)$ consisting of divergence-free functions. Moreover, $W^{1,2}_\sigma(\mathscr{O};\mathbb{R}^3)$ is the completion of $C^\infty_{0,\sigma}(\mathscr{O};\mathbb{R}^3)$ in $W^{1,2}(\mathscr{O};\mathbb{R}^3)$. The general space $W^{1,2}_\sigma(\mathcal{O};\mathbb{R}^3)$ thus incorporates the divergence-free and zero-average/zero-trace condition.

As for the stochastic set up, let $(\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbb{P})$ be a fixed filtered probability space satisfying the usual conditions of completeness and right continuity. We take $\mathcal{W}$ to be a cylindrical Brownian Motion over some Hilbert Space $\mathcal{U}$ with orthonormal basis $(e_i)$. Recall ([30], Subsection 1.4) that $\mathcal{W}$ admits the representation $\mathcal{W}_t = \sum_{i=1}^{\infty} e_i W^i_t$ as a limit in $L^2(\Omega;\mathcal{U}')$, whereby the $(W^i)$ are a collection of i.i.d. standard real valued Brownian Motions and $\mathcal{U}'$ is an enlargement of the Hilbert Space $\mathcal{U}$ such that the embedding $J : \mathcal{U} \to \mathcal{U}'$ is Hilbert-Schmidt and $\mathcal{W}$ is a $JJ^*$-cylindrical Brownian Motion over $\mathcal{U}'$. Given a process $F : [0, T] \times \Omega \to \mathscr{L}^2(\mathcal{U};\mathcal{H})$ progressively measurable and such that $F \in L^2(\Omega \times [0, T]; \mathscr{L}^2(\mathcal{U};\mathcal{H}))$, for any $0 \le t \le T$ we define the stochastic integral

$$\int\_0^t F\_s d\mathcal{W}\_s := \sum\_{i=1}^\infty \int\_0^t F\_s(e\_i) d\mathcal{W}\_s^i$$

where the infinite sum is taken in $L^2(\Omega;\mathcal{H})$. We can extend this notion to processes $F$ which are such that $F(\omega) \in L^2([0, T]; \mathscr{L}^2(\mathcal{U};\mathcal{H}))$ for $\mathbb{P}$-a.e. $\omega$ via the traditional localisation procedure. In this case the stochastic integral is a local martingale in $\mathcal{H}$.<sup>2</sup>

<sup>2</sup> A complete, direct construction of this integral, a treatment of its properties and the fundamentals of stochastic calculus in infinite dimensions can be found in [30] Section 1.

## *2.2 Functional Framework*

We now recap the classical functional framework for the study of the deterministic Navier-Stokes Equation. Firstly we briefly comment on the pressure term ∇*ρ*, which will not play any role in our analysis. *ρ* does not come with an evolution equation and is simply chosen to ensure the incompressibility (divergence-free) condition; moreover we will ignore this term via a suitable projection (in Sect. 3 we even consider a different form of the equation) and treat the projected equation, with the understanding to append a pressure to it later. This procedure is well discussed in [44] Sect. 5 and [40], and an explicit form for the pressure for the SALT Euler Equation is given in [45] Subsection 3.3.

The mapping $\mathcal{L}$ is defined for sufficiently regular functions $f, g : \mathcal{O} \to \mathbb{R}^3$ by $\mathcal{L}_f g := \sum_{j=1}^{3} f^j \partial_j g$. Here and throughout the text we make no notational distinction between differential operators acting on a vector valued function or a scalar valued one; that is, we understand $\partial_j g$ by its component mappings $(\partial_j g)^l := \partial_j g^l$. We now give some clarification as to 'sufficiently regular', by stating basic properties of this mapping. For any $m \ge 1$, the mapping $\mathcal{L} : W^{m+1,2}(\mathcal{O};\mathbb{R}^3) \to W^{m,2}(\mathcal{O};\mathbb{R}^3)$ defined by $f \mapsto \mathcal{L}_f f$ is continuous. Additionally there exists a constant $c$ such that for any $f, g \in W^{k,2}(\mathcal{O};\mathbb{R}^3)$ for $k \in \mathbb{N}$ as appropriate, we have the bounds:

$$\|\mathcal{L}_f g\| + \|\mathcal{L}_g f\| \le c \|g\|_{W^{1,2}} \|f\|_{W^{2,2}}, \tag{4}$$

$$\|\mathcal{L}_g f\|_{W^{1,2}} \le c \|g\|_{W^{1,2}} \|f\|_{W^{3,2}}, \tag{5}$$

$$\|\mathcal{L}_g f\|_{W^{1,2}} \le c \|g\|_{W^{2,2}} \|f\|_{W^{2,2}}. \tag{6}$$
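To indicate where bounds of this type come from, the following sketch derives (4) by one standard route, which need not be the argument the authors have in mind. It assumes the three-dimensional Sobolev embeddings $W^{2,2} \hookrightarrow L^\infty$ and $W^{1,2} \hookrightarrow L^4$ on $\mathcal{O}$, and the constant $c$ may change from one inequality to the next. For the first term, each coefficient $f^j$ is bounded in $L^\infty$:

$$\|\mathcal{L}_f g\| \le \sum_{j=1}^{3} \|f^j\|_{L^\infty(\mathcal{O};\mathbb{R})} \|\partial_j g\| \le c \|f\|_{W^{2,2}} \|g\|_{W^{1,2}}.$$

For the second term, Hölder's inequality with exponents $(4, 4)$ is applied to each product $g^j \partial_j f$, using that $\partial_j f \in W^{1,2}(\mathcal{O};\mathbb{R}^3)$:

$$\|\mathcal{L}_g f\| \le \sum_{j=1}^{3} \|g^j\|_{L^4(\mathcal{O};\mathbb{R})} \|\partial_j f\|_{L^4} \le c \|g\|_{W^{1,2}} \|f\|_{W^{2,2}}.$$

Adding the two estimates gives (4).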

We introduce the Leray Projector $\mathcal{P}$ as the orthogonal projection in $L^2(\mathcal{O};\mathbb{R}^3)$ onto $L^2_\sigma(\mathcal{O};\mathbb{R}^3)$. It is well known (see e.g. [47] Remark 1.6) that for any $m \in \mathbb{N}$, $\mathcal{P}$ is continuous as a mapping $\mathcal{P} : W^{m,2}(\mathcal{O};\mathbb{R}^3) \to W^{m,2}(\mathcal{O};\mathbb{R}^3)$. In fact, the complement space of $L^2_\sigma(\mathcal{O};\mathbb{R}^3)$ can be characterised (this is the so-called Helmholtz-Weyl decomposition), a result that we state explicitly as we will need to exploit the precise structure in future arguments.

**Lemma 2.3** *Define the space*

$$L^{2,\perp}_\sigma(\mathcal{O};\mathbb{R}^3) := \{\psi \in L^2(\mathcal{O};\mathbb{R}^3) : \psi = \nabla g \text{ for some } g \in W^{1,2}(\mathcal{O};\mathbb{R})\}.$$

*Then indeed $L^{2,\perp}_\sigma(\mathcal{O};\mathbb{R}^3)$ is orthogonal to $L^2_\sigma(\mathcal{O};\mathbb{R}^3)$ in $L^2(\mathcal{O};\mathbb{R}^3)$, i.e. for any $\phi \in L^2_\sigma(\mathcal{O};\mathbb{R}^3)$ and $\psi \in L^{2,\perp}_\sigma(\mathcal{O};\mathbb{R}^3)$ we have that*

$$
\langle \phi, \psi \rangle = 0.
$$

*Moreover, every $f \in L^2(\mathscr{O};\mathbb{R}^3)$ has the unique decomposition*

$$f = \phi + \psi \tag{7}$$

*for some $\phi \in L^2_\sigma(\mathscr{O};\mathbb{R}^3)$, $\psi \in L^{2,\perp}_\sigma(\mathscr{O};\mathbb{R}^3)$, and every $f \in L^2(\mathbb{T}^3;\mathbb{R}^3)$ has the unique decomposition*

$$f = \phi + \psi + c \tag{8}$$

*where $\phi \in L^2_\sigma(\mathbb{T}^3;\mathbb{R}^3)$, $\psi \in L^{2,\perp}_\sigma(\mathbb{T}^3;\mathbb{R}^3)$ and $c$ is a constant function: that is, there exists $k \in \mathbb{R}^3$ such that each component mapping $c^j$ is identically equal to $k^j$, $j = 1, 2, 3$.*

*Proof* See [47] Theorems 1.4, 1.5 and [44] Theorem 2.6.

**Corollary 2.3.1** *Every $f \in L^2(\mathcal{O}; \mathbb{R}^3)$ admits the representation*

$$f = \mathcal{P}f + \nabla g \tag{9}$$

*for some $g \in W^{1,2}(\mathcal{O}; \mathbb{R})$. Similarly every $f \in L^2(\mathbb{T}^3; \mathbb{R}^3)$ admits the representation*

$$f = \mathcal{P}f + \nabla g + c \tag{10}$$

*for some $g \in W^{1,2}(\mathbb{T}^3; \mathbb{R})$ and constant function $c$.*

*Proof* It is an immediate property of the orthogonal projection that $\mathcal{P}f$ is the unique element $\phi$ of $L^2\_\sigma$ appearing in (7) and (8).
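On the torus the decomposition (10) can be computed explicitly in Fourier space, where the gradient part of each nonzero mode is its component along the wavevector. The following is a minimal numerical sketch and not part of the original argument: the function name `helmholtz_split`, the grid size and the test field are illustrative assumptions, and only NumPy's FFT routines are used. The test field is band-limited, so the Nyquist modes play no role.

```python
import numpy as np


def helmholtz_split(f):
    """Split a periodic field f of shape (3, n, n, n) on [0, 2*pi)^3 as in (10).

    Returns (sol, grad, mean): a divergence-free zero-mean part, a gradient
    part, and the constant vector of componentwise means.
    """
    n = f.shape[1]
    k1d = np.fft.fftfreq(n, d=1.0 / n)            # integer wavenumbers
    k = np.stack(np.meshgrid(k1d, k1d, k1d, indexing="ij"))
    k2 = np.sum(k ** 2, axis=0)
    k2[0, 0, 0] = 1.0                             # placeholder; zero mode is removed below

    f_hat = np.fft.fftn(f, axes=(1, 2, 3))
    mean = np.real(f_hat[:, 0, 0, 0]) / n ** 3    # the constant c in (10)
    f_hat[:, 0, 0, 0] = 0.0

    # Gradient part: projection of each Fourier coefficient onto its wavevector.
    grad_hat = k * (np.sum(k * f_hat, axis=0) / k2)
    sol_hat = f_hat - grad_hat

    sol = np.real(np.fft.ifftn(sol_hat, axes=(1, 2, 3)))
    grad = np.real(np.fft.ifftn(grad_hat, axes=(1, 2, 3)))
    return sol, grad, mean


# Hypothetical test field built from a known decomposition f = u + grad(g) + c.
n = 16
x = 2.0 * np.pi * np.arange(n) / n
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
u = np.stack([np.sin(Y), np.sin(Z), np.sin(X)])    # divergence-free, zero mean
grad_g = np.stack([-np.sin(X) * np.cos(Y),         # gradient of g = cos(x) cos(y)
                   -np.cos(X) * np.sin(Y),
                   np.zeros_like(X)])
c = np.array([1.0, -2.0, 0.5])
f = u + grad_g + c[:, None, None, None]

sol, grad, mean = helmholtz_split(f)
print(np.max(np.abs(sol - u)), np.max(np.abs(grad - grad_g)), mean)
```

In this sketch `sol` plays the role of $\mathcal{P}f$, `grad` that of $\nabla g$ and `mean` that of the constant function $c$; applying the split to a field that is already divergence-free with zero mean should return it unchanged up to rounding.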

Through $\mathcal{P}$ we define the Stokes Operator $A : W^{2,2}(\mathcal{O}; \mathbb{R}^3) \to L^2\_\sigma(\mathcal{O}; \mathbb{R}^3)$ by $A := -\mathcal{P}\Delta$. Once more we understand the Laplacian as an operator on vector valued functions through the component mappings, $(\Delta f)^l := \Delta f^l$. From the continuity of $\mathcal{P}$ we have immediately that for $m \in \{0\} \cup \mathbb{N}$, $A : W^{m+2,2}(\mathcal{O}; \mathbb{R}^3) \to W^{m,2}(\mathcal{O}; \mathbb{R}^3)$ is continuous. We remark that $\Delta$ leaves the complement space $L^{2,\perp}\_\sigma(\mathcal{O}; \mathbb{R}^3)$ invariant, so $A\mathcal{P}$ is equal to $A$ on $W^{2,2}(\mathcal{O}; \mathbb{R}^3)$. Moreover (see [44] Theorem 2.24) there exists a collection of functions $(a\_k)$, $a\_k \in W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3) \cap C^\infty(\mathcal{O}; \mathbb{R}^3)$, such that the $(a\_k)$ are eigenfunctions of $A$, are an orthonormal basis in $L^2\_\sigma(\mathcal{O}; \mathbb{R}^3)$ and an orthogonal basis in $W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$. The eigenvalues $(\lambda\_k)$ are strictly positive and approach infinity as $k \to \infty$. Therefore every $f \in L^2\_\sigma(\mathcal{O}; \mathbb{R}^3)$ admits the representation

$$f = \sum\_{k=1}^{\infty} f\_k a\_k \tag{11}$$

where $f\_k = \langle f, a\_k \rangle$, as a limit in $L^2(\mathcal{O}; \mathbb{R}^3)$.

**Definition 2.4** For $m \in \mathbb{N}$ we introduce the spaces $D(A^{m/2})$ as the subspaces of $L^2\_\sigma(\mathcal{O}; \mathbb{R}^3)$ consisting of functions $f$ such that

$$\sum\_{k=1}^{\infty} \lambda\_k^m f\_k^2 < \infty$$

for $f\_k$ as in (11). Then $A^{m/2} : D(A^{m/2}) \to L^2\_\sigma(\mathcal{O}; \mathbb{R}^3)$ is defined by

$$A^{m/2}: f \mapsto \sum\_{k=1}^{\infty} \lambda\_k^{m/2} f\_k a\_k.$$

We present some fundamental properties regarding these spaces, which are justified in [9] Proposition 4.12, [44] Exercises 2.12, 2.13 and the discussion in Subsection 2.3.

1. $D(A^{m/2}) \subset W^{m,2}(\mathcal{O}; \mathbb{R}^3) \cap W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$ and the bilinear form

$$\langle f, g \rangle\_m := \langle A^{m/2} f, A^{m/2} g \rangle$$

is an inner product on $D(A^{m/2})$;

2. For $m$ even the induced norm $\|\cdot\|\_m^2 = \langle \cdot, \cdot \rangle\_m$ is equivalent to the $W^{m,2}(\mathcal{O}; \mathbb{R}^3)$ norm, and for $m$ odd there is a constant $c$ such that

$$\|\cdot\|\_{W^{m,2}} \le c \|\cdot\|\_{m};$$

3. $D(A) = W^{2,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$ and $D(A^{1/2}) = W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$ with the additional property that $\|\cdot\|\_1$ is equivalent to $\|\cdot\|\_{W^{1,2}}$ on this space.

It can be directly shown that for any $p, q \in \mathbb{N}$ with $p \le q$, $p + q = 2m$ and $f \in D(A^{m/2})$, $g \in D(A^{q/2})$ we have that

$$
\langle f, g \rangle\_m = \langle A^{p/2} f, A^{q/2} g \rangle. \tag{12}
$$

From here we can also see that the collection of functions $(a\_k)$ forms an orthogonal basis of $W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$ equipped with the $\langle \cdot, \cdot \rangle\_1$ inner product. In addition to using these spaces defined by powers of the Stokes Operator, we also use the basis to consider finite dimensional approximations of these spaces.
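The identity (12) can be checked formally from the spectral definition of the fractional powers. The computation below is a sketch only, assuming that $f$ and $g$ are regular enough for every sum to converge absolutely and writing $g\_k = \langle g, a\_k \rangle$ in analogy with (11); it uses nothing beyond the orthonormality of $(a\_k)$ in $L^2\_\sigma$.

$$\langle A^{p/2} f, A^{q/2} g \rangle = \left\langle \sum\_{k=1}^{\infty} \lambda\_k^{p/2} f\_k a\_k, \sum\_{j=1}^{\infty} \lambda\_j^{q/2} g\_j a\_j \right\rangle = \sum\_{k=1}^{\infty} \lambda\_k^{(p+q)/2} f\_k g\_k = \sum\_{k=1}^{\infty} \lambda\_k^{m} f\_k g\_k = \langle A^{m/2} f, A^{m/2} g \rangle.$$

Taking $f = a\_j$, $g = a\_k$ and $p = q = m = 1$ gives $\langle a\_j, a\_k \rangle\_1 = \lambda\_k \delta\_{jk}$, which is the orthogonality of the $(a\_k)$ in the $\langle \cdot, \cdot \rangle\_1$ inner product.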

**Definition 2.5** We define $\mathcal{P}\_n$ as the orthogonal projection onto $\mathrm{span}\{a\_1, \dots, a\_n\}$ in $L^2(\mathcal{O}; \mathbb{R}^3)$. That is, $\mathcal{P}\_n$ is given by

$$\mathcal{P}\_n: f \mapsto \sum\_{k=1}^n \langle f, a\_k \rangle a\_k$$

for $f \in L^2(\mathcal{O}; \mathbb{R}^3)$.

The restriction of $\mathcal{P}\_n$ to $D(A^{m/2})$ is self-adjoint for the $\langle \cdot, \cdot \rangle\_m$ inner product, and there exists a constant $c$ independent of $n$ such that for all $f \in D(A^{m/2})$,


$$\|\mathcal{P}\_n f\|\_{W^{m,2}} \le c \|f\|\_{W^{m,2}},\tag{13}$$

see [44] Lemma 4.1 for details. Similar ideas justify that for all $f \in W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$, $g \in W^{2,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$,

$$\left\|(I - \mathcal{P}\_n)f\right\|^2 \le \frac{1}{\lambda\_n} \left\|f\right\|\_1^2, \qquad \left\|(I - \mathcal{P}\_n)g\right\|\_1^2 \le \frac{1}{\lambda\_n} \left\|g\right\|\_2^2$$

where $I$ represents the identity operator in the relevant spaces. To conclude this subsection we briefly discuss bounds related to the nonlinear term, which will be used in our analysis. For every $\phi \in W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$ and $f, g \in W^{1,2}(\mathcal{O}; \mathbb{R}^3)$, we have that

$$
\langle \mathcal{L}\_{\phi}f, g \rangle = -\langle f, \mathcal{L}\_{\phi}g \rangle \tag{14}
$$

$$
\langle \mathcal{L}\_{\phi}f, f \rangle = 0. \tag{15}
$$
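Both the projection bounds and the identities (14), (15) follow from short formal computations, sketched here under assumptions that are not spelled out in the text: the eigenvalues are taken to be ordered so that $\lambda\_1 \le \lambda\_2 \le \cdots$, and $\phi, f, g$ are taken smooth enough to integrate by parts, with the general case recovered by density. For the projection bounds, expanding in the basis $(a\_k)$ as in (11) gives

$$\|(I - \mathcal{P}\_n) f\|^2 = \sum\_{k > n} f\_k^2 \le \frac{1}{\lambda\_{n+1}} \sum\_{k > n} \lambda\_k f\_k^2 \le \frac{1}{\lambda\_n} \|f\|\_1^2, \qquad \|(I - \mathcal{P}\_n) g\|\_1^2 = \sum\_{k > n} \lambda\_k g\_k^2 \le \frac{1}{\lambda\_{n+1}} \sum\_{k > n} \lambda\_k^2 g\_k^2 \le \frac{1}{\lambda\_n} \|g\|\_2^2.$$

For (14), integrating by parts in each coordinate direction gives

$$\langle \mathcal{L}\_\phi f, g \rangle = \sum\_{j=1}^{3} \int\_{\mathcal{O}} \phi^j \, \partial\_j f \cdot g \, dx = -\int\_{\mathcal{O}} \Big( \sum\_{j=1}^{3} \partial\_j \phi^j \Big) f \cdot g \, dx - \sum\_{j=1}^{3} \int\_{\mathcal{O}} \phi^j \, f \cdot \partial\_j g \, dx = -\langle f, \mathcal{L}\_\phi g \rangle,$$

where the boundary contribution vanishes (by periodicity on the torus, or by the boundary condition built into $W^{1,2}\_\sigma$ on a bounded domain) and the first integral on the right vanishes because $\phi$ is divergence-free. Taking $g = f$ in (14) yields $\langle \mathcal{L}\_\phi f, f \rangle = -\langle \mathcal{L}\_\phi f, f \rangle$, which is (15).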

## *2.3 The SALT Operator*

Having established the relevant function spaces and some fundamental properties of the operators involved in the deterministic Navier-Stokes Equation, we now address the operator $B$ appearing in the Stratonovich integral of (1). As in [30] Subsection 2.2, the operator $B$ is defined by its action on the basis vectors $(e\_i)$ of U. We shall show in Sect. 3.3 that $B$ does indeed satisfy Assumption 2.2.2 of [30] for the spaces $V, H, U, X$ to be defined. With the notation of [30], each $B\_i$ is defined relative to the correlations $\xi\_i$ for sufficiently regular $f$ by the mapping

$$B\_i: f \mapsto \mathcal{L}\_{\xi\_i} f + \mathcal{T}\_{\xi\_i} f$$

where $\mathcal{L}$ is as before, and $\mathcal{T}$ is a new operator that we introduce, defined by

$$\mathcal{T}\_{g}f := \sum\_{j=1}^{3} f^{j} \nabla g^{j} .$$

We shall assume throughout that each $\xi\_i$ belongs to the space $W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$. If for some fixed $m \in \mathbb{N}$ we have $\xi\_i \in W^{m+2,\infty}(\mathcal{O}; \mathbb{R}^3)$ then for all $k = 0, \dots, m + 1$,

$$\|\mathcal{T}\_{\xi\_i}f\|\_{W^{k,2}}^2 \le c \|\xi\_i\|\_{W^{k+1,\infty}}^2 \|f\|\_{W^{k,2}}^2 \tag{16}$$

$$\|\mathcal{L}\_{\xi\_i}f\|\_{W^{k,2}}^2 \le c \|\xi\_i\|\_{W^{k,\infty}}^2 \|f\|\_{W^{k+1,2}}^2 \tag{17}$$

$$\|B\_i f\|\_{W^{k,2}}^2 \le c \|\xi\_i\|\_{W^{k+1,\infty}}^2 \|f\|\_{W^{k+1,2}}^2. \tag{18}$$

Moreover $\mathcal{T}\_{\xi\_i}$ is a bounded linear operator on $L^2(\mathcal{O}; \mathbb{R}^3)$ so has adjoint $\mathcal{T}^{\ast}\_{\xi\_i}$ satisfying the same boundedness. In conjunction with property (14), $\mathcal{L}\_{\xi\_i}$ is a densely defined operator in $L^2(\mathcal{O}; \mathbb{R}^3)$ with domain of definition $W^{1,2}(\mathcal{O}; \mathbb{R}^3)$, and has adjoint $\mathcal{L}^{\ast}\_{\xi\_i}$ in this space given by $-\mathcal{L}\_{\xi\_i}$ with the same dense domain of definition. Likewise $B^{\ast}\_i$ is the densely defined adjoint $-\mathcal{L}\_{\xi\_i} + \mathcal{T}^{\ast}\_{\xi\_i}$. Our techniques centre around energy estimates, and the key idea behind preserving these estimates in the case of a transport type noise is the following proposition.

**Proposition 2.6** *There exists a constant $c$ such that for each $i$ and for all $f \in W^{k+2,2}(\mathcal{O}; \mathbb{R}^3)$ with $k = 0, \dots, m$, we have the bounds*

$$\langle B\_i^2 f, f \rangle\_{W^{k,2}} + \|B\_i f\|\_{W^{k,2}}^2 \le c \|\xi\_i\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}}^2,\tag{19}$$

$$\langle B\_i f, f \rangle\_{W^{k,2}}^2 \le c \|\xi\_i\|\_{W^{k+1,\infty}}^2 \|f\|\_{W^{k,2}}^4. \tag{20}$$

*Proof* See Sect. 5.1.

Another valuable result is given now, which will be necessary in showing comparable estimates to Proposition 2.6 in the $\langle \cdot, \cdot \rangle\_k$ inner product for appropriate $k$.

**Lemma 2.7** *We have that*

$$B\_i: L^{2, \perp}\_{\sigma}(\mathcal{O}; \mathbb{R}^3) \cap W^{1, 2}(\mathcal{O}; \mathbb{R}^3) \to L^{2, \perp}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$$

*and moreover that $\mathcal{P}B\_i = \mathcal{P}B\_i\mathcal{P}$ on $W^{1,2}(\mathcal{O}; \mathbb{R}^3)$.*

*Proof* See Sect. 5.1.
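The mechanism behind Lemma 2.7 can be seen from a formal componentwise computation. The following is a sketch only, assuming a smooth scalar function $g$ and $\xi\_i \in W^{1,\infty}$ so that every product below is meaningful; the rigorous argument is the one referred to in Sect. 5.1. For each component $l = 1, 2, 3$,

$$(B\_i \nabla g)^l = \sum\_{j=1}^{3} \xi\_i^j \, \partial\_j \partial\_l g + \sum\_{j=1}^{3} (\partial\_j g) \, \partial\_l \xi\_i^j = \partial\_l \Big( \sum\_{j=1}^{3} \xi\_i^j \, \partial\_j g \Big),$$

so that $B\_i \nabla g = \nabla (\xi\_i \cdot \nabla g)$ is again a gradient. The first sum comes from $\mathcal{L}\_{\xi\_i}$ and the second from $\mathcal{T}\_{\xi\_i}$; only their combination is an exact derivative by the product rule. Writing $f = \mathcal{P}f + \nabla g$ as in (9) and applying $\mathcal{P}$, which annihilates gradients, then gives $\mathcal{P}B\_i f = \mathcal{P}B\_i \mathcal{P} f$. On the torus the constant function $c$ of (10) is handled in the same way, since $B\_i c = \mathcal{T}\_{\xi\_i} c = \nabla (c \cdot \xi\_i)$.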

We note that this result holds true only in the presence of the additional $\mathcal{T}\_{\xi\_i}$ term in the operator, highlighting the significance of considering a noise which is not purely transport. The Leray Projector does pose difficulties in the presence of a boundary though, which we state here.

**Remark 2** The Leray Projector does not preserve the space $W^{1,2}\_0(\mathcal{O}; \mathbb{R}^3)$, and so we cannot say that $\mathcal{P}B\_i : W^{2,2}\_\sigma(\mathcal{O}; \mathbb{R}^3) \to W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$. The issues arising from this operator not satisfying the zero-trace property are fundamentally why we only treat the Torus for the velocity form in Sect. 3.

## **3 The Velocity Equation on the Torus**

In this section we restrict ourselves to the Torus $\mathbb{T}^3$, leaving a treatment of the bounded domain to Sect. 4. We also now fix our assumptions on the $(\xi\_i)$, assuming that each $\xi\_i \in W^{1,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3) \cap W^{3,\infty}(\mathbb{T}^3; \mathbb{R}^3)$ and that they collectively satisfy

$$\sum\_{i=1}^{\infty} \|\xi\_i\|\_{W^{3,\infty}}^2 < \infty. \tag{21}$$

## *3.1 Definitions and Results*

Here we state the key definitions and results of this section. To facilitate our analysis we work with an equation projected by the Leray Projector as discussed at the start of Sect. 2.2. Thus we consider the new equation

$$u\_t - u\_0 + \int\_0^t \mathcal{P} \mathcal{L}\_{u\_s} u\_s \, ds + \nu \int\_0^t A u\_s \, ds + \int\_0^t \mathcal{P} B u\_s \circ d\mathcal{W}\_s = 0 \tag{22}$$

obtained at a heuristic level by projecting all terms of (1). Having not defined solutions of (1) we cannot be too formal here, but the idea is that we require solutions in $L^2\_\sigma(\mathcal{O}; \mathbb{R}^3)$ with initial condition also in this space so they are invariant under $\mathcal{P}$, and $\mathcal{P}$ is a bounded linear operator so can be taken through the integrals (see [30] Corollary 1.6.12.1 for this result in Itô integration, understanding the Stratonovich integral as the sum of an Itô and a Bochner integral).

**Theorem 3.1** *For any given $\mathcal{F}\_0$-measurable $u\_0 : \Omega \to W^{2,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ there exists a pair $(u, \tau)$ such that: $\tau$ is a $\mathbb{P}$-a.s. positive stopping time and $u$ is a process whereby for $\mathbb{P}$-a.e. $\omega$, $u\_\cdot(\omega) \in C([0, T]; W^{2,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3))$ and $u\_\cdot(\omega)\mathbf{1}\_{\cdot \le \tau(\omega)} \in L^2([0, T]; W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3))$ for all $T > 0$ with $u\_\cdot\mathbf{1}\_{\cdot \le \tau}$ progressively measurable in $W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$, and moreover satisfying the identity*

$$u\_t = u\_0 - \int\_0^{t \wedge \tau} \mathcal{P} \mathcal{L}\_{u\_s} u\_s \, ds - \nu \int\_0^{t \wedge \tau} Au\_s \, ds - \int\_0^{t \wedge \tau} \mathcal{P} B u\_s \circ d\mathcal{W}\_s$$

*$\mathbb{P}$-a.s. in $L^2\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ for all $t \ge 0$.*

Theorem 3.1 will be proved as a consequence of the following proposition.

**Proposition 3.2** *Suppose that $(u, \tau)$ are such that: $\tau$ is a $\mathbb{P}$-a.s. positive stopping time and $u$ is a process whereby for $\mathbb{P}$-a.e. $\omega$, $u\_\cdot(\omega) \in C([0, T]; W^{2,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3))$ and $u\_\cdot(\omega)\mathbf{1}\_{\cdot \le \tau(\omega)} \in L^2([0, T]; W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3))$ for all $T > 0$ with $u\_\cdot\mathbf{1}\_{\cdot \le \tau}$ progressively measurable in $W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$, and moreover satisfying the identity*

$$\begin{aligned} u\_t &= u\_0 - \int\_0^{t \wedge \tau} \mathcal{P} \mathcal{L}\_{u\_s} u\_s \, ds - \nu \int\_0^{t \wedge \tau} A u\_s \, ds \\ &+ \frac{1}{2} \int\_0^{t \wedge \tau} \sum\_{i=1}^{\infty} \mathcal{P} B\_i^2 u\_s \, ds - \int\_0^{t \wedge \tau} \mathcal{P} B u\_s \, d\mathcal{W}\_s \end{aligned}$$

*$\mathbb{P}$-a.s. in $W^{1,2}\_\sigma(\mathcal{O}; \mathbb{R}^3)$ for all $t \ge 0$. Then the pair $(u, \tau)$ satisfies the identity*

$$u\_t = u\_0 - \int\_0^{t \wedge \tau} \mathcal{P} \mathcal{L}\_{u\_s} u\_s \, ds - \nu \int\_0^{t \wedge \tau} A u\_s \, ds - \int\_0^{t \wedge \tau} \mathcal{P} B u\_s \circ d\mathcal{W}\_s$$

*$\mathbb{P}$-a.s. in $L^2\_\sigma(\mathcal{O}; \mathbb{R}^3)$ for all $t \ge 0$.*

Proposition 3.2 motivates studying the converted equation

$$u\_t = u\_0 - \int\_0^t \mathcal{P} \mathcal{L}\_{u\_s} u\_s \, ds - \nu \int\_0^t A u\_s \, ds + \frac{1}{2} \int\_0^t \sum\_{i=1}^\infty \mathcal{P} B\_i^2 u\_s \, ds - \int\_0^t \mathcal{P} B u\_s \, d\mathcal{W}\_s. \tag{23}$$

Once we convert to the Itô Form though, starting with an initial condition in $W^{2,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ is not optimal in the sense that, at least according to the deterministic theory, we should be able to construct a solution (satisfying the identity in $L^2\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ as is natural) for only a $W^{1,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ initial condition. To this end we give the following definitions and the main result of this section. Definition 3.3 is stated for an arbitrary $\mathcal{F}\_0$-measurable $u\_0 : \Omega \to W^{1,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$.

**Definition 3.3** A pair $(u, \tau)$ where $\tau$ is a $\mathbb{P}$-a.s. positive stopping time and $u$ is a process such that for $\mathbb{P}$-a.e. $\omega$, $u\_\cdot(\omega) \in C([0, T]; W^{1,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3))$ and $u\_\cdot(\omega)\mathbf{1}\_{\cdot \le \tau(\omega)} \in L^2([0, T]; W^{2,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3))$ for all $T > 0$ with $u\_\cdot\mathbf{1}\_{\cdot \le \tau}$ progressively measurable in $W^{2,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$, is said to be a local strong solution of the equation (23) if the identity

$$\begin{aligned} u\_t &= u\_0 - \int\_0^{t \wedge \tau} \mathcal{P} \mathcal{L}\_{u\_s} u\_s \, ds - \nu \int\_0^{t \wedge \tau} Au\_s \, ds \\ &+ \frac{1}{2} \int\_0^{t \wedge \tau} \sum\_{i=1}^\infty \mathcal{P} B\_i^2 u\_s \, ds - \int\_0^{t \wedge \tau} \mathcal{P} B u\_s \, d\mathcal{W}\_s \end{aligned} \tag{24}$$

holds $\mathbb{P}$-a.s. in $L^2\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ for all $t \ge 0$.

**Remark 3** If $(u, \tau)$ is a local strong solution of the equation (23), then $u\_\cdot = u\_{\cdot \wedge \tau}$. The time integrals in (24) are well defined as Bochner integrals in $L^2\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$, and the Itô integral in $W^{1,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$. These integrals are shown to be well defined in the abstract case in [30] Definition 2.2.3.

**Definition 3.4** A pair $(u, \Theta)$ such that there exists a sequence of stopping times $(\theta\_j)$ which are $\mathbb{P}$-a.s. monotone increasing and convergent to $\Theta$, whereby $(u\_{\cdot \wedge \theta\_j}, \theta\_j)$ is a local strong solution of the equation (23) for each $j$, is said to be a maximal strong solution of the equation (23) if for any other pair $(v, \Gamma)$ with this property, $\Theta \le \Gamma$ $\mathbb{P}$-a.s. implies $\Theta = \Gamma$ $\mathbb{P}$-a.s.

**Definition 3.5** A maximal strong solution $(u, \Theta)$ of the equation (23) is said to be pathwise unique if for any other such solution $(v, \Gamma)$, $\Theta = \Gamma$ $\mathbb{P}$-a.s. and for all $t \in [0, \Theta)$,

$$\mathbb{P}\left(\{\omega \in \Omega \, : \, u\_t(\omega) = v\_t(\omega)\}\right) = 1.$$

**Theorem 3.6** *For any given $\mathcal{F}\_0$-measurable $u\_0 : \Omega \to W^{1,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$, there exists a pathwise unique maximal strong solution $(u, \Theta)$ of the equation (23). Moreover at $\mathbb{P}$-a.e. $\omega$ for which $\Theta(\omega) < \infty$, we have that*

$$\sup\_{r \in [0, \Theta(\omega))} \left\| u\_r(\omega) \right\|\_1^2 + \int\_0^{\Theta(\omega)} \| u\_r(\omega) \|\_2^2 dr = \infty. \tag{25}$$

# *3.2 Operator Bounds*

In this subsection we state some intermediary results regarding control on the operators involved. In the following and throughout this section $c$ will represent a generic constant changing from line to line, $c(\varepsilon)$ will be a generic constant dependent on a fixed $\varepsilon$, $f$ and $g$ will be arbitrary elements of $W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ and $f\_n \in \mathrm{span}\{a\_1, \dots, a\_n\}$. The proofs of these lemmas can be found in Sect. 5.1.

**Lemma 3.7** *For any ε >* 0*, we have that*

$$\left| \langle \mathcal{P}\_n \mathcal{P} \mathcal{L}\_{f\_n} f\_n, f\_n \rangle\_2 \right| \le c(\varepsilon) \| f\_n \|\_2^4 + \varepsilon \| f\_n \|\_3^2; \tag{26}$$

$$
\begin{split} \langle \mathcal{P}\_n \mathcal{P} B\_i^2 f\_n, f\_n \rangle\_1 + \langle \mathcal{P}\_n \mathcal{P} B\_i f\_n, \mathcal{P}\_n \mathcal{P} B\_i f\_n \rangle\_1 &\leq c(\varepsilon) \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_1^2 \\ &+ \varepsilon \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_2^2; \end{split} \tag{27}
$$

$$\begin{split} \langle \mathcal{P}\_{n}\mathcal{P}B\_{i}^{2}f\_{n},f\_{n}\rangle\_{2} + \langle \mathcal{P}\_{n}\mathcal{P}B\_{i}f\_{n},\mathcal{P}\_{n}\mathcal{P}B\_{i}f\_{n}\rangle\_{2} &\leq c(\varepsilon) \|\xi\_{i}\|\_{W^{3,\infty}}^{2} \|f\_{n}\|\_{2}^{2} \\ &+ \varepsilon \|\xi\_{i}\|\_{W^{3,\infty}}^{2} \|f\_{n}\|\_{3}^{2}. \end{split} \tag{28}$$

**Remark 4** The algebra property of $W^{k,2}(\mathbb{T}^3; \mathbb{R}^3)$ when $k = 2$ is fundamental in proving (26); a result of the form

$$\left| \langle \mathcal{P}\_n \mathcal{P} \mathcal{L}\_{f\_n} f\_n, f\_n \rangle\_1 \right| \le c(\varepsilon) \| f\_n \|\_1^4 + \varepsilon \| f\_n \|\_2^2$$

is unavailable to us given that the algebra property does not hold for *k* = 1. This is revisited in Remark 5.
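To illustrate where bounds of the form $c(\varepsilon)\|\cdot\|^4 + \varepsilon\|\cdot\|^2$ come from, the following sketches one possible route to (26) using only facts stated in Sect. 2.2; it is offered as a plausible outline under that assumption and may differ from the proof given in Sect. 5.1. Using the self-adjointness of $\mathcal{P}\_n$, the identity (12) with $p = 1$, $q = 3$, and then (6) with $g = f = f\_n$,

$$\left| \langle \mathcal{P}\_n \mathcal{P} \mathcal{L}\_{f\_n} f\_n, f\_n \rangle\_2 \right| = \left| \langle A^{1/2} \mathcal{P} \mathcal{L}\_{f\_n} f\_n, A^{3/2} f\_n \rangle \right| \le \|\mathcal{P} \mathcal{L}\_{f\_n} f\_n\|\_1 \|f\_n\|\_3 \le c \|\mathcal{L}\_{f\_n} f\_n\|\_{W^{1,2}} \|f\_n\|\_3 \le c \|f\_n\|\_2^2 \|f\_n\|\_3.$$

Young's inequality in the form $ab \le \frac{1}{4\varepsilon} a^2 + \varepsilon b^2$, applied with $a = c\|f\_n\|\_2^2$ and $b = \|f\_n\|\_3$, then gives

$$c \|f\_n\|\_2^2 \|f\_n\|\_3 \le \frac{c^2}{4\varepsilon} \|f\_n\|\_2^4 + \varepsilon \|f\_n\|\_3^2,$$

which has the shape of (26) with $c(\varepsilon) = c^2/(4\varepsilon)$. The step relying on (6) is where the $k = 2$ algebra-type bound enters.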

**Lemma 3.8** *For any ε >* 0*, we have that*

$$\left| \langle \mathcal{P} \mathcal{L}\_f f - \mathcal{P} \mathcal{L}\_g g, f - g \rangle\_1 \right| \le c(\varepsilon) \left( \|g\|\_1^4 + \|f\|\_2^2 \right) \|f - g\|\_1^2 + \varepsilon \|f - g\|\_2^2.$$

**Lemma 3.9** *For any $\varepsilon > 0$, we have that*

$$\left| \langle \mathcal{P} \mathcal{L}\_f f - \mathcal{P} \mathcal{L}\_g g, f - g \rangle \right| \le c(\varepsilon) \|f\|\_2^2 \|f - g\|^2 + \varepsilon \|f - g\|\_1^2.$$

## *3.3 Proof of Proposition 3.2*

We prove this result through the abstract procedure used in [30] Subsections 2.2 and 2.3, the result of which is stated in Sect. 5.2. Towards this goal we define the quartet of spaces

$$\begin{aligned} V &:= W^{3,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3), & H &:= W^{2,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3), \\ U &:= W^{1,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3), & X &:= L^2\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3). \end{aligned}$$

We equip $L^2\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ with the usual $\langle \cdot, \cdot \rangle$ inner product, but then equip $W^{1,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ and $W^{2,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ with the $\langle \cdot, \cdot \rangle\_1$ and $\langle \cdot, \cdot \rangle\_2$ inner products respectively. In fact we also have that $D(A^{3/2}) = W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ and that the $\langle \cdot, \cdot \rangle\_3$ inner product is equivalent to the usual $\langle \cdot, \cdot \rangle\_{W^{3,2}}$ one on this space (see [44] Theorem 2.27), so we endow $W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ with $\langle \cdot, \cdot \rangle\_3$. Our SPDE (22) takes the form of (68) for the operators

$$\mathcal{Q} := -\left(\mathcal{P}\mathcal{L} + \nu A\right), \qquad \mathcal{G} := -\mathcal{P}B$$

where we rewrite $(-\mathcal{P}B\_i)^2$ as $\mathcal{P}B\_i^2$, firstly using the linearity of $\mathcal{P}B\_i$ to deal with the minus sign and secondly using the property that $\mathcal{P}B\_i = \mathcal{P}B\_i\mathcal{P}$. It is worth appreciating here that we chose to project the equation and then convert it into Itô Form, but we may equally have chosen to convert the unprojected Stratonovich Form and then project the resulting Itô Equation. Without addressing the conversion of the unprojected equation in complete detail, we would directly arrive at (23) taking this approach. Indeed before projection our correction term would be of the form $\sum\_{i=1}^{\infty} B\_i^2$ plus a function in $L^{2,\perp}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ as defined in Lemma 2.3. Under projection this is $\mathcal{P}\sum\_{i=1}^{\infty} B\_i^2$, which is just $\sum\_{i=1}^{\infty} \mathcal{P}B\_i^2$ from the linearity and continuity of $\mathcal{P}$. Therefore the property $\mathcal{P}B\_i = \mathcal{P}B\_i\mathcal{P}$ from Lemma 2.7 establishes the consistency between these approaches.
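At a purely formal level, the passage from (22) to (23) can be summarised as follows. This is a heuristic sketch of the standard Stratonovich-to-Itô correction for a linear noise operator, with $\mathcal{G}\_i := -\mathcal{P}B\_i$ denoting the action of $\mathcal{G}$ on the $i$-th basis vector (notation introduced here for illustration); the rigorous statement is the abstract result of Sect. 5.2.

$$\int\_0^t \mathcal{G} u\_s \circ d\mathcal{W}\_s = \int\_0^t \mathcal{G} u\_s \, d\mathcal{W}\_s + \frac{1}{2} \int\_0^t \sum\_{i=1}^{\infty} \mathcal{G}\_i^2 u\_s \, ds, \qquad \mathcal{G}\_i^2 = (-\mathcal{P}B\_i)(-\mathcal{P}B\_i) = \mathcal{P}B\_i\mathcal{P}B\_i = \mathcal{P}B\_i^2.$$

Substituting this into (22) written as $u\_t = u\_0 + \int\_0^t \mathcal{Q} u\_s \, ds + \int\_0^t \mathcal{G} u\_s \circ d\mathcal{W}\_s$ reproduces (23) term by term: the Stratonovich integral becomes $-\int\_0^t \mathcal{P}B u\_s \, d\mathcal{W}\_s$ together with the correction $\frac{1}{2}\int\_0^t \sum\_{i=1}^{\infty} \mathcal{P}B\_i^2 u\_s \, ds$.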

To prove the result we check Assumptions 5.3 and 5.4. Starting with 5.3, we first of all have that $\nu A$ is continuous from $W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ into $W^{1,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$, therefore it is measurable and, being linear, also satisfies the boundedness. As for $\mathcal{P}\mathcal{L}$, measurability is satisfied in the same way and for the boundedness we have that

$$\|\mathcal{PL}\_f f\|\_1 \le c \|\mathcal{PL}\_f f\|\_{W^{1,2}} \le c \|\mathcal{L}\_f f\|\_{W^{1,2}} \le c \|f\|\_{W^{1,2}} \|f\|\_{W^{3,2}} \le c \|f\|\_1 \|f\|\_3$$

for any $f \in W^{3,2}\_\sigma(\mathbb{T}^3; \mathbb{R}^3)$ where $c$ is a generic constant, critically applying (5). This verifies Assumption 5.3 so we move on to Assumption 5.4, which is immediate from (18) and the linearity of $\mathcal{P}B$ to show continuity in all relevant spaces.

## *3.4 Proofs of Theorems 3.1 and 3.6.*

We use the abstract results of [32] stated in Sects. 5.3 and 5.4. Definitions 3.3, 3.4, 3.5 and Theorem 3.6 are precisely Definitions 5.19, 5.20, 5.21 and Theorem 5.22 for the equation (23) with respect to the spaces $V, H, U, X$ as defined in Sect. 3.3. Indeed we would also prove Theorem 3.1 through Proposition 3.2 by showing the existence of a local solution with the regularity of Definition 5.12. Therefore we prove both Theorems 3.1 and 3.6 by showing that the assumptions of Sects. 5.3 and 5.4 are satisfied. We work with the operators

$$\mathcal{A} := -\left(\mathcal{P}\mathcal{L} + \nu A\right) + \frac{1}{2} \sum\_{i=1}^{\infty} \mathcal{P}B\_i^2, \qquad \mathcal{G} := -\mathcal{P}B$$

which were shown in Sect. 3.3 to be measurable mappings into the required spaces. We now prove Theorem 3.1 by justifying the assumptions of Sect. 5.3.

*Proof of Theorem 3.1* First note that the density of the spaces is immediately inherited from the density of the usual Sobolev Spaces and the equivalence of the norms. The bilinear form satisfying (70) is chosen to be

$$\langle f, g \rangle\_{U \times V} := \langle A^{1/2} f, A^{3/2} g \rangle$$

which reduces to the $\langle \cdot, \cdot \rangle\_2$ inner product from (12). The remainder of the proof is in treating the numbered assumptions.

Assumption 5.6: We use the system $(a\_k)$ of eigenfunctions of the Stokes Operator.

Assumption 5.7: Once more (74) follows from the discussion in Sect. 3.3. For (75) we treat the different operators in $\mathcal{A}$ individually, starting from the nonlinear term:

$$\begin{split} \|\mathcal{PL}\_f f - \mathcal{PL}\_g g\|\_1 &= \|\mathcal{PL}\_f (f - g) + \mathcal{PL}\_{f - g} g\|\_1 \\ &\le \|\mathcal{PL}\_f (f - g)\|\_1 + \|\mathcal{PL}\_{f - g} g\|\_1 \\ &\le c \|\mathcal{L}\_f (f - g)\|\_{W^{1,2}} + c \|\mathcal{L}\_{f - g} g\|\_{W^{1,2}} \\ &\le c \|f\|\_{W^{1,2}} \|f - g\|\_{W^{3,2}} + c \|f - g\|\_{W^{1,2}} \|g\|\_{W^{3,2}} \\ &\le c \left(\|f\|\_{W^{1,2}} + \|g\|\_{W^{3,2}}\right) \|f - g\|\_{W^{3,2}} \\ &\le c \left( \|f\|\_{1} + \|g\|\_{3} \right) \|f - g\|\_{3} \end{split}$$

having applied (5). From the linearity of $\nu A$ and $\frac{1}{2}\sum\_{i=1}^{\infty} \mathcal{P}B\_i^2$, the corresponding result follows immediately from (74), and this subsequently justifies (75). Additionally (76) follows immediately from the already justified Assumption 5.4.

Assumption 5.8: (26) and (28) will be our basis of showing (77). The task is to control

$$2\left\langle \mathcal{P}\_n \left( -\mathcal{P}\mathcal{L} - \nu A + \frac{1}{2} \sum\_{i=1}^{\infty} \mathcal{P}B\_i^2 \right) f\_n, \ f\_n \right\rangle\_2 + \sum\_{i=1}^{\infty} \|\mathcal{P}\_n \mathcal{P}B\_i f\_n\|\_2^2$$

which we rewrite as

$$\begin{aligned} &-2\langle \mathcal{P}\_n \mathcal{P} \mathcal{L}\_{f\_n} f\_n, f\_n \rangle\_2 - 2\nu \langle \mathcal{P}\_n A f\_n, f\_n \rangle\_2 \\ &+ \sum\_{i=1}^{\infty} \left( \langle \mathcal{P}\_n \mathcal{P} B\_i^2 f\_n, f\_n \rangle\_2 + \|\mathcal{P}\_n \mathcal{P} B\_i f\_n\|\_2^2 \right). \end{aligned} \tag{29}$$

Recalling the assumption (21) and (26), (28), we have that for any *ε >* 0,

$$\begin{split}(29) \leq & -2\nu\langle \mathcal{P}\_{n}Af\_{n},f\_{n}\rangle\_{2} + c(\varepsilon)\|f\_{n}\|\_{2}^{4} + \varepsilon\|f\_{n}\|\_{3}^{2} \\ & + \sum\_{i=1}^{\infty} \left(c(\varepsilon)\|\xi\_{i}\|\_{W^{3,\infty}}^{2}\|f\_{n}\|\_{2}^{2} + \varepsilon\|\xi\_{i}\|\_{W^{3,\infty}}^{2}\|f\_{n}\|\_{3}^{2}\right) \\ = & -2\nu\langle Af\_{n},f\_{n}\rangle\_{2} + \left[c(\varepsilon)\|f\_{n}\|\_{2}^{4} + c(\varepsilon)\sum\_{i=1}^{\infty}\|\xi\_{i}\|\_{W^{3,\infty}}^{2}\|f\_{n}\|\_{2}^{2}\right] \\ & + \varepsilon\left[1 + \sum\_{i=1}^{\infty}\|\xi\_{i}\|\_{W^{3,\infty}}^{2}\right]\|f\_{n}\|\_{3}^{2} \\ \leq & -2\nu\langle A^{1/2}Af\_{n},A^{3/2}f\_{n}\rangle + c(\varepsilon)\left[1 + \|f\_{n}\|\_{2}^{2}\right]\|f\_{n}\|\_{2}^{2} \\ & + \varepsilon\left[1 + \sum\_{i=1}^{\infty}\|\xi\_{i}\|\_{W^{3,\infty}}^{2}\right]\|f\_{n}\|\_{3}^{2} \\ = & -2\nu\|f\_{n}\|\_{3}^{2} + c(\varepsilon)\left[1 + \|f\_{n}\|\_{2}^{2}\right]\|f\_{n}\|\_{2}^{2} + \varepsilon\left[1 + \sum\_{i=1}^{\infty}\|\xi\_{i}\|\_{W^{3,\infty}}^{2}\right]\|f\_{n}\|\_{3}^{2}\end{split}$$

where we have embedded the $\sum\_{i=1}^{\infty} \|\xi\_i\|\_{W^{3,\infty}}^2$ into the constant $c(\varepsilon)$. Therefore by choosing


$$\varepsilon := \frac{\nu}{1 + \sum\_{i=1}^{\infty} \|\xi\_i\|\_{W^{3,\infty}}^2}$$

then (77) is satisfied for *κ* := *ν*. Moving on to (78), we are interested in the term

$$\sum\_{i=1}^{\infty} \langle \mathcal{P}\_n \mathcal{P} B\_i f\_n, f\_n \rangle\_2^2.$$

Observe that

$$\begin{aligned} \langle \mathcal{P}\_{n}\mathcal{P}B\_{i}f\_{n},f\_{n}\rangle\_{2}^{2} &= \langle \mathcal{P}B\_{i}f\_{n},f\_{n}\rangle\_{2}^{2} \\ &= \langle \mathcal{P}[\Delta,B\_{i}]f\_{n} + \mathcal{P}B\_{i}\Delta f\_{n},Af\_{n}\rangle^{2} \\ &\leq 2\langle \mathcal{P}[\Delta,B\_{i}]f\_{n},Af\_{n}\rangle^{2} + 2\langle \mathcal{P}B\_{i}\Delta f\_{n},Af\_{n}\rangle^{2} \\ &= 2\langle [\Delta,B\_{i}]f\_{n},Af\_{n}\rangle^{2} + 2\langle B\_{i}Af\_{n},Af\_{n}\rangle^{2}. \end{aligned}$$

The first of these terms is dealt with through a simple application of Cauchy-Schwarz, as

$$\left| \langle [\Delta, B\_i] f\_n, Af\_n \rangle \right|^2 \le \| [\Delta, B\_i] f\_n \|^2 \| Af\_n \|^2 \le c \| \xi\_i \|\_{W^{3,\infty}}^2 \| f\_n \|\_{2}^4$$

using Proposition 5.2, and the second comes directly from (20) as

$$\left| \left\langle B\_i A f\_n, A f\_n \right\rangle \right|^2 \le c \left\| \xi\_i \right\|\_{W^{1,\infty}}^2 \left\| A f\_n \right\|^4 \le c \left\| \xi\_i \right\|\_{W^{3,\infty}}^2 \left\| f\_n \right\|\_2^4.$$

Summing up the two terms and over all *i* gives that

$$\sum\_{i=1}^{\infty} \langle \mathcal{P}\_n \mathcal{P} B\_i f\_n, f\_n \rangle\_2^2 \le \left( c \sum\_{i=1}^{\infty} \|\xi\_i\|\_{W^{3,\infty}}^2 \right) \|f\_n\|\_2^4$$

which justifies (78) and Assumption 5.8.

Assumption 5.9: For (79) we must bound the term

$$\begin{aligned} &2\left\langle \left( -\mathcal{P}\mathcal{L} - \nu A + \frac{1}{2} \sum\_{i=1}^{\infty} \mathcal{P}B\_i^2 \right) f - \left( -\mathcal{P}\mathcal{L} - \nu A + \frac{1}{2} \sum\_{i=1}^{\infty} \mathcal{P}B\_i^2 \right) g, f - g \right\rangle\_1 \\ &+ \sum\_{i=1}^{\infty} \left\| \mathcal{P}B\_i f - \mathcal{P}B\_i g \right\|\_1^2 \end{aligned}$$

which we simply rewrite as

$$\begin{aligned} &-2\langle \mathcal{P}\mathcal{L}\_f f - \mathcal{P}\mathcal{L}\_g g, f - g \rangle\_1 - 2\nu \langle A(f - g), f - g \rangle\_1 \\ &+ \sum\_{i=1}^{\infty} \left( \langle \mathcal{P}B\_i^2(f-g), f-g \rangle\_1 + \|\mathcal{P}B\_i(f-g)\|\_1^2 \right) \end{aligned}$$

and inspect the distinct items individually. Firstly from Lemma 3.8 we have that for any *ε >* 0,

$$-2\langle \mathcal{P}\mathcal{L}\_f f - \mathcal{P}\mathcal{L}\_g g, f - g \rangle\_1 \le c(\varepsilon) \left( \|g\|\_1^4 + \|f\|\_2^2 \right) \|f - g\|\_1^2 + \varepsilon \|f - g\|\_2^2. \tag{30}$$

Similarly to the justification of Assumption 5.8 we also see that

$$-2\nu \langle A(f-g), f-g \rangle\_{1} = -2\nu \| f-g \|\_{2}^{2}.\tag{31}$$

Shifting focus to the final term, note that in (27) we in fact showed that

$$\langle \mathcal{P}B\_i^2 f\_n, f\_n \rangle\_1 + \langle \mathcal{P}B\_i f\_n, \mathcal{P}B\_i f\_n \rangle\_1 \le c(\varepsilon) \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_1^2 + \varepsilon \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_2^2$$

and scanning the proof we see that all arguments hold for arbitrary $f\_n \in W^{3,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$ so we can deduce directly the bound

$$\begin{aligned} \langle \mathcal{P}B\_i^2(f-g), f-g\rangle\_{1} + \|\mathcal{P}B\_i(f-g)\|\_{1}^2 &\leq c(\varepsilon) \|\xi\_i\|\_{W^{3,\infty}}^2 \|f-g\|\_{1}^2 \\ &+ \varepsilon \|\xi\_i\|\_{W^{3,\infty}}^2 \|f-g\|\_{2}^2. \end{aligned} \tag{32}$$

Summing over (30), (31) and all *i* in (32), we deduce a bound by

$$\begin{aligned} &-2\nu \|f - g\|\_2^2 + c(\varepsilon) \left[ \|g\|\_1^4 + \|f\|\_2^2 + \sum\_{i=1}^\infty \|\xi\_i\|\_{W^{3,\infty}}^2 \right] \|f - g\|\_1^2 \\ &+ \varepsilon \left[ 1 + \sum\_{i=1}^\infty \|\xi\_i\|\_{W^{3,\infty}}^2 \right] \|f - g\|\_2^2 \end{aligned}$$

so again a choice of

$$\varepsilon := \frac{\nu}{1 + \sum\_{l=1}^{\infty} \|\xi\_l\|\_{W^{3,\infty}}^2} \tag{33}$$

ensures (79) is satisfied for *κ* := *ν*. Moving on to (80), we are interested in the term

$$\sum\_{l=1}^{\infty} \langle \mathcal{P} \mathcal{B}\_l(f - \mathbf{g}), f - \mathbf{g} \rangle\_1^2,\tag{34}$$

noting that

$$\begin{aligned} \langle \mathcal{P}B\_i(f-g), f-g \rangle\_1^2 &= \langle A\mathcal{P}B\_i(f-g), f-g \rangle^2 \\ &\leq N \sum\_{k=1}^3 \langle \partial\_k B\_i(f-g), \partial\_k(f-g) \rangle^2. \end{aligned}$$

We have

$$
\partial\_k B\_i(f - g) = B\_{\partial\_k \xi\_i}(f - g) + B\_i \partial\_k (f - g)
$$

so

$$\begin{aligned} \left\langle \partial\_{k} B\_{i}(f - g), \partial\_{k} (f - g) \right\rangle^{2} &\leq 2 \left\langle B\_{\partial\_{k} \xi\_{i}}(f - g), \partial\_{k} (f - g) \right\rangle^{2} \\ &+ 2 \left\langle B\_{i} \partial\_{k} (f - g), \partial\_{k} (f - g) \right\rangle^{2}. \end{aligned}$$

Now from (18),

$$\begin{aligned} \langle B\_{\partial\_k \xi\_i}(f - g), \partial\_k(f - g) \rangle^2 &\leq \left\| B\_{\partial\_k \xi\_i}(f - g) \right\|^2 \left\| \partial\_k(f - g) \right\|^2 \\ &\leq c \left\| \xi\_i \right\|\_{W^{3,\infty}}^2 \left\| f - g \right\|\_1^4 \end{aligned}$$

and from (15),

$$\begin{aligned} \left\langle B\_i \partial\_k (f - g), \partial\_k (f - g) \right\rangle^2 &= \left\langle \mathcal{T}\_{\xi\_i} \partial\_k (f - g), \partial\_k (f - g) \right\rangle^2 \\ &\le c \|\xi\_i\|\_{W^{3,\infty}}^2 \|f - g\|\_1^4. \end{aligned}$$

By summing both terms, over all $k = 1, \dots, N$ and $i \in \mathbb{N}$, we have shown that

$$\sum\_{l=1}^{\infty} \langle \mathcal{P}B\_l(f-\mathbf{g}), f-\mathbf{g} \rangle\_1^2 \le \left( c \sum\_{l=1}^{\infty} \|\xi\_l\|\_{W^{3,\infty}}^2 \right) \|f-\mathbf{g}\|\_1^4 \tag{35}$$

demonstrating (80) and hence Assumption 5.9. Assumption 5.10: For (81) we must bound the term

$$2\left\langle \left(-\mathcal{P}\mathcal{L} - \nu A + \frac{1}{2} \sum\_{i=1}^{\infty} \mathcal{P}B\_i^2 \right) f, f\right\rangle\_1 + \sum\_{i=1}^{\infty} \|\mathcal{P}B\_i f\|\_1^2$$

which we simply rewrite as

$$-2\langle \mathcal{P} \mathcal{L}\_f f, f \rangle\_1 - 2\nu \langle Af, f \rangle\_1 + \sum\_{l=1}^{\infty} \left( \langle \mathcal{P} \mathcal{B}\_l^2 f, f \rangle\_1 + \| \mathcal{P} \mathcal{B}\_l f \|\_1^2 \right). \tag{36}$$

The nonlinear term can be controlled precisely as seen in Lemma 3.8 to deduce that for any *ε >* 0,

$$\left| \langle \mathcal{P} \mathcal{L}\_f f, f \rangle\_1 \right| \le c(\varepsilon) \| f \|\_1^6 + \varepsilon \| f \|\_2^2.$$

Meanwhile across (31) and (32) we have that

$$\begin{aligned} -2\nu \langle Af, f \rangle\_1 &+ \sum\_{i=1}^{\infty} \left( \langle \mathcal{P}B\_i^2 f, f \rangle\_1 + \| \mathcal{P}B\_i f \|\_1^2 \right) \\ &\leq -2\nu \| f \|\_{2}^{2} + c(\varepsilon) \left[ \sum\_{i=1}^{\infty} \| \xi\_i \|\_{W^{3,\infty}}^2 \right] \| f \|\_{1}^{2} + \varepsilon \left[ \sum\_{i=1}^{\infty} \| \xi\_i \|\_{W^{3,\infty}}^2 \right] \| f \|\_{2}^{2} \end{aligned}$$

so with the familiar choice of *ε* (33) we see that

$$(36) \le c\left(1 + \|f\|\_1^6\right) - \nu \|f\|\_2^2$$

which is more than sufficient to show (81). (82) follows immediately from (35), concluding the justification.
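The arithmetic behind the bound on (36) can be sketched as follows. This is an illustration under two assumptions not spelled out in the text: the nonlinear estimate is applied with $\varepsilon/2$ in place of $\varepsilon$, and $S := \sum\_{i=1}^{\infty} \|\xi\_i\|\_{W^{3,\infty}}^2$ is shorthand introduced here.

$$\begin{aligned} (36) &\le 2c(\varepsilon/2)\|f\|\_1^6 + \varepsilon \|f\|\_2^2 - 2\nu\|f\|\_2^2 + c(\varepsilon) S \|f\|\_1^2 + \varepsilon S \|f\|\_2^2 \\ &= 2c(\varepsilon/2)\|f\|\_1^6 + c(\varepsilon) S \|f\|\_1^2 + \left[ \varepsilon (1 + S) - 2\nu \right] \|f\|\_2^2. \end{aligned}$$

With $\varepsilon$ as in (33) we have $\varepsilon(1+S) = \nu$, so the bracket equals $-\nu$, and $\|f\|\_1^2 \le 1 + \|f\|\_1^6$ absorbs the remaining terms into $c\left(1 + \|f\|\_1^6\right)$.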

Assumption 5.11: For any $\eta \in W^{2,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$ we must bound the term

$$\left\langle \left( -\mathcal{P}\mathcal{L} - \nu A + \frac{1}{2} \sum\_{l=1}^{\infty} \mathcal{P}B\_{l}^{2} \right) f - \left( -\mathcal{P}\mathcal{L} - \nu A + \frac{1}{2} \sum\_{l=1}^{\infty} \mathcal{P}B\_{l}^{2} \right) g, \eta \right\rangle\_{1}$$

which we simply rewrite as

$$-2\langle \mathcal{P}\mathcal{L}\_f f - \mathcal{P}\mathcal{L}\_\mathbf{g}\mathbf{g}, \eta \rangle\_1 - 2\nu \langle A(f-\mathbf{g}), \eta \rangle\_1 + \sum\_{l=1}^\infty \langle \mathcal{P}B\_l^2(f-\mathbf{g}), \eta \rangle\_1$$

and further by

$$-2\langle \mathcal{P}\mathcal{L}\_f f - \mathcal{P}\mathcal{L}\_g g, A\eta \rangle - 2\nu \langle A(f - g), A\eta \rangle + \sum\_{i=1}^\infty \langle \mathcal{P}B\_i^2(f - g), A\eta \rangle.$$

Through Cauchy-Schwarz this is controlled by

$$\|\eta\|\_{2}\left(2\|\mathcal{P}\mathcal{L}\_{f}f - \mathcal{P}\mathcal{L}\_{g}g\| + 2\nu\|A(f-g)\| + \sum\_{i=1}^{\infty} \|\mathcal{P}B\_{i}^{2}(f-g)\|\right)$$

so our problem is reduced to bounding the bracketed terms. The linear terms are trivial when recalling (18), and for the nonlinear term we return to (4) to see that

$$\|\mathcal{P}\mathcal{L}\_f f - \mathcal{P}\mathcal{L}\_g g\| \le \|\mathcal{L}\_{f-g}f\| + \|\mathcal{L}\_g(f-g)\| \le c \left(\|f\|\_2 + \|g\|\_1\right) \|f-g\|\_2$$

comfortably justifying the assumption.
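The first inequality in the nonlinear estimate rests on a splitting of the difference. As a short sketch, assuming only that $\mathcal{L}\_f g$ is linear in each of $f$ and $g$ and that the Leray Projector $\mathcal{P}$ is an orthogonal projection in $L^2$ (so it does not increase the norm):

$$\mathcal{L}\_f f - \mathcal{L}\_g g = \left( \mathcal{L}\_f f - \mathcal{L}\_g f \right) + \left( \mathcal{L}\_g f - \mathcal{L}\_g g \right) = \mathcal{L}\_{f-g} f + \mathcal{L}\_g (f - g).$$

Applying $\mathcal{P}$, taking norms and using the triangle inequality then gives $\|\mathcal{P}\mathcal{L}\_f f - \mathcal{P}\mathcal{L}\_g g\| \le \|\mathcal{L}\_{f-g}f\| + \|\mathcal{L}\_g(f-g)\|$.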

*Proof of Theorem 3.1* The new space $X$ will again be $L^2\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$ as laid out in Sect. 3.3. We choose the bilinear form $\langle \cdot, \cdot \rangle\_{X \times H}$ to be given by

$$
\langle f, g \rangle\_{X \times H} := \langle f, Ag \rangle \tag{37}
$$

noting that the property (86) follows from (12). Noting also that the system $(a\_k)$ is an orthogonal basis of $W^{1,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$, and that the operators were shown to be measurable into the relevant spaces in Sect. 3.3, we are in the setting of Sect. 5.4. We now proceed to justify the assumptions required to apply Theorem 5.22.

Assumption 5.16: This follows identically to Assumption 5.7, referring again to Sect. 3.3 and (4).

Assumption 5.17: Continuing to consider the distinct terms, we have that

$$-2\nu\langle A(f-g), f-g\rangle = -2\nu\|f-g\|\_{1}^{2}$$

and

$$\begin{aligned} & \left\langle \mathcal{P} B\_i^2(f - g), f - g \right\rangle + \left\| \mathcal{P} B\_i(f - g) \right\|^2 \\ & \qquad \le \left\langle B\_i^2(f - g), f - g \right\rangle + \left\| B\_i(f - g) \right\|^2 \le c \left\| \xi\_i \right\|\_{W^{2,\infty}}^2 \left\| f - g \right\|^2 \end{aligned}$$

from (19). With these components in place, the proof of (89) then follows identically to that of (79). (90) is a direct consequence of (20), concluding the justification.

Assumption 5.18: This stronger assumption was in fact already verified when addressing Assumption 5.10.

## **4 The Vorticity Equation on a Bounded Domain**

In order to address the well-posedness problem of the SALT Navier-Stokes Equations on bounded domains, we now pose it in vorticity form. The analysis conducted in Sect. 3.4 was done with reference to the properties derived across Sects. 2.2 and 2.3, applicable to the bounded domain as well as the torus. The issue in studying the velocity form is that our operators do not map into the correct spaces in order to use these properties: in particular, the Leray Projector does not preserve the zero trace property and so the operators do not map into the necessary $W^{k,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$ spaces (see Remark 2). The motivation behind the vorticity form is to circumvent the necessity of Leray Projection.

Our attention shall be firmly on the bounded domain $\mathcal{O}$, though the results for the vorticity form carry over seamlessly to the torus $\mathbb{T}^3$. For this section we impose new constraints on the $\xi\_i$: for each $i \in \mathbb{N}$, $\xi\_i \in W^{1,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3) \cap W^{2,2}\_{0}(\mathcal{O}; \mathbb{R}^3) \cap W^{3,\infty}(\mathcal{O}; \mathbb{R}^3)$, and they collectively satisfy

$$\sum\_{l=1}^{\infty} \left\| \xi\_l \right\|\_{W^{3,\infty}}^2 < \infty. \tag{38}$$

## *4.1 Deriving the Equation*

The vorticity form of the equation is derived through taking the curl of the velocity form, where the curl operator is defined for $f \in W^{1,2}(\mathcal{O}; \mathbb{R}^3)$ by

$$\operatorname{curl} f := \begin{pmatrix} \partial\_2 f^3 - \partial\_3 f^2 \\ \partial\_3 f^1 - \partial\_1 f^3 \\ \partial\_1 f^2 - \partial\_2 f^1 \end{pmatrix}.$$

We introduce the Lie Bracket operator $\mathscr{L}$ defined on sufficiently regular functions $f, g : \mathcal{O} \to \mathbb{R}^3$ by

$$\mathscr{L}\_f g := \mathcal{L}\_f g - \mathcal{L}\_g f. \tag{39}$$
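For orientation, the definition (39) can be written out in components. This is a sketch under the assumption, consistent with the component computations in the proof of Lemma 4.1, that the operator $\mathcal{L}$ acts as $[\mathcal{L}\_f g]^m = \sum\_{j=1}^{3} f^j \partial\_j g^m$:

$$[\mathscr{L}\_f g]^m = \sum\_{j=1}^{3} \left( f^j \partial\_j g^m - g^j \partial\_j f^m \right), \qquad m = 1, 2, 3.$$

In particular the operator is antisymmetric, $\mathscr{L}\_f g = -\mathscr{L}\_g f$, and so $\mathscr{L}\_f f = 0$.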

In [44] Subsection 12.1 it is shown that, with notation $\phi := \operatorname{curl} f$,

$$\operatorname{curl} \left( \mathcal{L}\_f f \right) = \mathscr{L}\_f \phi$$

where it is also observed that the curl of elements of $W^{1,2}(\mathcal{O}; \mathbb{R}^3) \cap L^{2,\perp}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$ is null. It is clear that the Laplacian commutes with the curl operation, and we now consider how the curl operation interacts with the stochastic term.

**Lemma 4.1** *We have that*

$$\operatorname{curl}(B\_i f) = \mathscr{L}\_{\xi\_i} \phi \tag{40}$$

*where once more* $\phi := \operatorname{curl} f$.

*Proof* We shall show only that the identity (40) holds in its first component, with the others following similarly. To this end we calculate the first component of the left hand side of (40):

$$\begin{aligned} [\operatorname{curl}(B\_i f)]^1 &= \partial\_2 [B\_i f]^3 - \partial\_3 [B\_i f]^2 \\ &= \partial\_2 \sum\_{j=1}^{3} \left( \xi\_i^j \partial\_j f^3 + f^j \partial\_3 \xi\_i^j \right) - \partial\_3 \sum\_{j=1}^{3} \left( \xi\_i^j \partial\_j f^2 + f^j \partial\_2 \xi\_i^j \right) \\ &= \sum\_{j=1}^{3} \left( \partial\_2 \xi\_i^j \partial\_j f^3 + \xi\_i^j \partial\_2 \partial\_j f^3 + \partial\_2 f^j \partial\_3 \xi\_i^j + f^j \partial\_2 \partial\_3 \xi\_i^j \right) \\ &\quad - \sum\_{j=1}^{3} \left( \partial\_3 \xi\_i^j \partial\_j f^2 + \xi\_i^j \partial\_3 \partial\_j f^2 + \partial\_3 f^j \partial\_2 \xi\_i^j + f^j \partial\_3 \partial\_2 \xi\_i^j \right) \\ &= \sum\_{j=1}^{3} \left( \partial\_2 \xi\_i^j \partial\_j f^3 + \xi\_i^j \partial\_2 \partial\_j f^3 + \partial\_2 f^j \partial\_3 \xi\_i^j - \partial\_3 \xi\_i^j \partial\_j f^2 - \xi\_i^j \partial\_3 \partial\_j f^2 - \partial\_3 f^j \partial\_2 \xi\_i^j \right) \\ &= \sum\_{j=1}^{3} \xi\_i^j \partial\_j \left( \partial\_2 f^3 - \partial\_3 f^2 \right) + \sum\_{j=1}^{3} \left( \partial\_2 \xi\_i^j \partial\_j f^3 + \partial\_2 f^j \partial\_3 \xi\_i^j - \partial\_3 \xi\_i^j \partial\_j f^2 - \partial\_3 f^j \partial\_2 \xi\_i^j \right) \\ &= [\mathcal{L}\_{\xi\_i} \phi]^1 + \sum\_{j=1}^{3} \left( \partial\_2 \xi\_i^j \partial\_j f^3 + \partial\_2 f^j \partial\_3 \xi\_i^j - \partial\_3 \xi\_i^j \partial\_j f^2 - \partial\_3 f^j \partial\_2 \xi\_i^j \right). \end{aligned}$$

Therefore it remains to show that

$$\sum\_{j=1}^{3} \left( \partial\_2 \xi\_i^j \partial\_j f^3 + \partial\_2 f^j \partial\_3 \xi\_i^j - \partial\_3 \xi\_i^j \partial\_j f^2 - \partial\_3 f^j \partial\_2 \xi\_i^j \right) = - \left[ \mathcal{L}\_{\phi} \xi\_i \right]^{1}.\tag{41}$$

We expand the sum in (41) to

$$\begin{aligned} \left( \partial\_2 \xi^1\_i \partial\_1 f^3 + \partial\_2 f^1 \partial\_3 \xi^1\_i - \partial\_3 \xi^1\_i \partial\_1 f^2 - \partial\_3 f^1 \partial\_2 \xi^1\_i \right) \\ + \left( \partial\_2 \xi^2\_i \partial\_2 f^3 + \partial\_2 f^2 \partial\_3 \xi^2\_i - \partial\_3 \xi^2\_i \partial\_2 f^2 - \partial\_3 f^2 \partial\_2 \xi^2\_i \right) \\ + \left( \partial\_2 \xi^3\_i \partial\_3 f^3 + \partial\_2 f^3 \partial\_3 \xi^3\_i - \partial\_3 \xi^3\_i \partial\_3 f^2 - \partial\_3 f^3 \partial\_2 \xi^3\_i \right) \end{aligned}$$

achieving some immediate cancellation in the last two brackets to

$$\begin{aligned} & \left( \partial\_2 \xi\_i^1 \partial\_1 f^3 + \partial\_2 f^1 \partial\_3 \xi\_i^1 - \partial\_3 \xi\_i^1 \partial\_1 f^2 - \partial\_3 f^1 \partial\_2 \xi\_i^1 \right) \\ & \qquad + \left( \partial\_2 \xi\_i^2 \partial\_2 f^3 - \partial\_3 f^2 \partial\_2 \xi\_i^2 \right) + \left( \partial\_2 f^3 \partial\_3 \xi\_i^3 - \partial\_3 \xi\_i^3 \partial\_3 f^2 \right) . \end{aligned}$$

We now simply rewrite the above by combining like terms, into

$$\partial\_2 \xi\_i^1 (\partial\_1 f^3 - \partial\_3 f^1) + \partial\_3 \xi\_i^1 (\partial\_2 f^1 - \partial\_1 f^2) + (\partial\_2 \xi\_i^2 + \partial\_3 \xi\_i^3)(\partial\_2 f^3 - \partial\_3 f^2)$$

or more succinctly as

$$-\partial\_2 \xi\_i^1 \phi^2 - \partial\_3 \xi\_i^1 \phi^3 + (\partial\_2 \xi\_i^2 + \partial\_3 \xi\_i^3) \phi^1$$

to which we add and subtract $\partial\_1 \xi\_i^1 \phi^1$ to arrive at

$$-\sum\_{j=1}^{3} \phi^j \partial\_j \xi\_i^1 + \sum\_{j=1}^{3} \left(\partial\_j \xi\_i^j\right) \phi^1.$$

The first term is precisely $-[\mathcal{L}\_{\phi}\xi\_i]^1$ as we wished to show, appreciating that the second term is zero given the divergence free condition on $\xi\_i$.
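As a hedged cross-check of (40), not part of the original argument, the identity can also be recovered from two standard vector-calculus identities, assuming enough regularity for them to hold pointwise and writing $[B\_i f]^m = \sum\_{j=1}^{3} \left( \xi\_i^j \partial\_j f^m + f^j \partial\_m \xi\_i^j \right)$ as in the computation above:

$$B\_i f = \nabla(\xi\_i \cdot f) - \xi\_i \times \operatorname{curl} f, \qquad \operatorname{curl}(\phi \times \xi\_i) = \phi \, (\operatorname{div} \xi\_i) - \xi\_i \, (\operatorname{div} \phi) + \mathcal{L}\_{\xi\_i} \phi - \mathcal{L}\_{\phi} \xi\_i.$$

Since the curl of a gradient vanishes, $\operatorname{div} \xi\_i = 0$ and $\operatorname{div} \phi = \operatorname{div} \operatorname{curl} f = 0$, this gives $\operatorname{curl}(B\_i f) = \operatorname{curl}(\phi \times \xi\_i) = \mathcal{L}\_{\xi\_i}\phi - \mathcal{L}\_{\phi}\xi\_i = \mathscr{L}\_{\xi\_i}\phi$.

A concrete instance, chosen purely for illustration and ignoring boundary conditions: take $\xi\_i = (x\_2, 0, 0)$, which is divergence free, and $f = (0, 0, x\_1)$. Then $B\_i f = (0, 0, x\_2)$ and $\operatorname{curl}(B\_i f) = (1, 0, 0)$, while $\phi = \operatorname{curl} f = (0, -1, 0)$ gives $\mathcal{L}\_{\xi\_i}\phi = 0$ and $\mathcal{L}\_{\phi}\xi\_i = -\partial\_2 \xi\_i = (-1, 0, 0)$, so that $\mathscr{L}\_{\xi\_i}\phi = (1, 0, 0)$ as well.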

From this point forwards we adopt the notation $\mathscr{L}\_i := \mathscr{L}\_{\xi\_i}$. Writing the Stratonovich integral of (1) in its component form over the basis vectors of $\mathfrak{U}$, and introducing the notation $w := \operatorname{curl} u$, at a heuristic level we can take the curl of (1) to obtain

$$w\_t - w\_0 + \int\_0^t \mathscr{L}\_{u\_s} w\_s \, ds - \nu \int\_0^t \Delta w\_s \, ds + \sum\_{i=1}^\infty \int\_0^t \mathscr{L}\_i w\_s \circ dW\_s^i = 0. \tag{42}$$

Having already rigorously demonstrated the Itô conversion of the velocity form (22) we shall make the conversion without explicit reference to the conditions of Sect. 5.2, though this can be precisely shown in the same way. Again at least at the heuristic level, the Itô form is

$$w\_t = w\_0 - \int\_0^t \mathscr{L}\_{u\_s} w\_s \, ds + \nu \int\_0^t \Delta w\_s \, ds + \frac{1}{2} \int\_0^t \sum\_{i=1}^\infty \mathscr{L}\_i^2 w\_s \, ds - \sum\_{i=1}^\infty \int\_0^t \mathscr{L}\_i w\_s \, dW\_s^i$$

which can again be projected<sup>3</sup> to the equation

$$\begin{split} w\_{t} &= w\_{0} - \int\_{0}^{t} \mathcal{P} \mathscr{L}\_{u\_{s}} w\_{s} \, ds - \nu \int\_{0}^{t} A w\_{s} \, ds \\ &+ \frac{1}{2} \int\_{0}^{t} \sum\_{i=1}^{\infty} \mathcal{P} \mathscr{L}\_{i}^{2} w\_{s} \, ds - \sum\_{i=1}^{\infty} \int\_{0}^{t} \mathcal{P} \mathscr{L}\_{i} w\_{s} \, dW\_{s}^{i}. \end{split} \tag{43}$$

<sup>3</sup> Although the pressure term is eliminated, we still consider the Leray Projection to ensure that all terms belong to $L^2\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$.

Having motivated this section by an avoidance of the Leray Projection this may seem counterintuitive; however, we shall shortly show that the projection is not felt in the noise (where it becomes problematic in velocity form): that is to say that for sufficiently regular $w$, $\mathcal{P}\mathscr{L}\_i w = \mathscr{L}\_i w$. The goal is to deduce the existence of a unique maximal solution of (43), a task which requires some clarification. Having reached (43) from the velocity form, we now look to solve the equation for vorticity, which demands a representation of the velocity $u$ in terms of the vorticity $w$. For this we quote Theorem 1 of [20] (or in fact, a slightly relaxed version).

**Theorem 4.2** *There exists a mapping* $K : \mathcal{O} \times \mathcal{O} \to \mathbb{R}^3$ *such that for every* $\phi \in W^{1,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3) \cap W^{k,p}(\mathcal{O}; \mathbb{R}^3)$ *where* $k \in \mathbb{N} \cup \lbrace 0 \rbrace$, $1 < p < \infty$, *the function* $BS\_K\phi : \mathcal{O} \to \mathbb{R}^3$ *defined* $\lambda$*-a.e. by*

$$(BS\_K\phi)(\mathbf{x}) = \int\_{\mathcal{O}} K(\mathbf{x}, \mathbf{y})\phi(\mathbf{y})d\mathbf{y} \tag{44}$$

*is such that:*


$$\|BS\_K\phi\|\_{W^{k+1,p}} \le C \|\phi\|\_{W^{k,p}}.$$

It should immediately be noted that such a *K* is not claimed to be unique, and in [20] it is explicitly shown to be non-unique; it does, however, allow us to identify *a* velocity from a given vorticity satisfying the divergence-free and boundary conditions (2). From this point forwards we fix a specific *K* from the class of admissible integral kernels postulated in Theorem 4.2. We thus understand the nonlinear term as a mapping

$$
\phi \mapsto \mathcal{L}\_{BS\_K\phi}\phi.
$$

This mapping shall at times be simply denoted by $\mathscr{L}\_{BS\_K}$. Equation (43) is thus closed in $w$.
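To make the closure explicit, the following sketch substitutes the velocity $u\_s = BS\_K w\_s$ recovered from the fixed kernel $K$ into the projected equation. It is a restatement under that substitution rather than a new result:

$$\begin{aligned} w\_{t} &= w\_{0} - \int\_{0}^{t} \mathcal{P} \mathscr{L}\_{BS\_K w\_{s}} w\_{s} \, ds - \nu \int\_{0}^{t} A w\_{s} \, ds \\ &+ \frac{1}{2} \int\_{0}^{t} \sum\_{i=1}^{\infty} \mathcal{P} \mathscr{L}\_{i}^{2} w\_{s} \, ds - \sum\_{i=1}^{\infty} \int\_{0}^{t} \mathcal{P} \mathscr{L}\_{i} w\_{s} \, dW\_{s}^{i}. \end{aligned}$$

Every term on the right-hand side now depends on the unknown only through $w$, which is the sense in which the equation is closed.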

# *4.2 Definitions and Results*

We now state and prove the existence, uniqueness and maximality results for (43). We recall that to solve the velocity form (23) we used the extended criterion of Theorem 5.22, requiring the space $V := W^{3,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$ to prove Theorem 3.6. This arose naturally in first showing Theorem 3.1, where we considered solutions explicitly in terms of the original Stratonovich form. For (43), however, solutions can be obtained for the natural choice of $w\_0 \in W^{1,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$ with an application only of Theorem 5.15 in Sect. 5.3 (see Remark 5). We note that such a choice is natural as the identity is satisfied in $L^2(\mathcal{O}; \mathbb{R}^3)$. Thus we do not take the detour of considering a fourth Hilbert Space to rigorously define solutions of the Stratonovich form (42), although this can be done similarly. Notions of local strong solution, maximal strong solution and pathwise uniqueness can then be defined identically to Definitions 3.3, 3.4 and 3.5 for the new identity (43). The result is then the following.

**Theorem 4.3** *For any given* $\mathcal{F}\_0$*-measurable* $w\_0 : \Omega \to W^{1,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$, *there exists a unique maximal strong solution* $(w, \Theta)$ *of the equation (43). Moreover at* $\mathbb{P}$*-a.e.* $\omega$ *for which* $\Theta(\omega) < \infty$, *we have that*

$$\sup\_{r \in [0, \Theta(\omega))} \|w\_r(\omega)\|\_1^2 + \int\_0^{\Theta(\omega)} \|w\_r(\omega)\|\_2^2 dr = \infty.$$

# *4.3 Operator Bounds*

Just as in Sect. 3.2, we state some intermediary results regarding the operators involved. In the following and throughout this section $c$ will represent a generic constant changing from line to line, $c(\varepsilon)$ will be a generic constant dependent on a fixed $\varepsilon$, $\phi$ and $\psi$ will be arbitrary elements of $W^{2,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$ and $\phi\_n \in \operatorname{span}\lbrace a\_1, \dots, a\_n \rbrace$. The mapping $\mathscr{L}\_i$ satisfies the same boundedness as (18), and also the following.

**Lemma 4.4** *There exists a constant* $c$ *such that for each* $i$ *and for all* $f \in W^{k+2,2}(\mathcal{O}; \mathbb{R}^3)$ *with* $k = 0, 1$, *we have the bounds*

$$\langle \mathscr{L}\_i^{2} f, f \rangle\_{W^{k,2}} + \|\mathscr{L}\_i f\|\_{W^{k,2}}^2 \le c \|\xi\_i\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}}^2,\tag{45}$$

$$\langle \mathscr{L}\_i f, f \rangle\_{W^{k,2}}^2 \le c \left\| \xi\_i \right\|\_{W^{k+1,\infty}}^2 \left\| f \right\|\_{W^{k,2}}^4,\tag{46}$$

$$\left\|\left[\Delta,\mathscr{L}\_{i}\right]f\right\|^{2}\leq c\left\|\xi\_{i}\right\|\_{W^{3,\infty}}^{2}\left\|f\right\|\_{W^{2,2}}^{2}\tag{47}$$

*where* $[\Delta, \mathscr{L}\_i]$ *is the commutator*

$$[\Delta, \mathscr{L}\_i] := \Delta \mathscr{L}\_i - \mathscr{L}\_i \Delta.$$

*Proof* See Sect. 5.1.

**Lemma 4.5** *For any ε >* 0 *we have the bound*

$$\begin{split} \langle \mathcal{P}\_n \mathcal{P} \mathscr{L}\_i^2 \phi\_n, \phi\_n \rangle\_1 &+ \langle \mathcal{P}\_n \mathcal{P} \mathscr{L}\_i \phi\_n, \mathcal{P}\_n \mathcal{P} \mathscr{L}\_i \phi\_n \rangle\_1 \\ &\leq c(\varepsilon) \|\xi\_i\|\_{W^{3,\infty}}^2 \|\phi\_n\|\_1^2 + \varepsilon \|\xi\_i\|\_{W^{3,\infty}}^2 \|\phi\_n\|\_2^2. \end{split}$$

*Proof* This now follows precisely as for (27).

**Lemma 4.6** *For any ε >* 0 *we have the bound*

$$\left| \langle \mathcal{P}\_n \mathcal{P} \mathscr{L}\_{BS\_K \phi\_n} \phi\_n, \phi\_n \rangle\_1 \right| \leq c(\varepsilon) \| \phi\_n \|\_1^4 + \varepsilon \| \phi\_n \|\_2^2.$$

*Proof* See Sect. 5.1.

**Remark 5** Recalling Remark 4, the nonlinear term in velocity form fails this estimate. It is satisfied in the vorticity form due to the additional regularity that $f\_n$ has compared to $\phi\_n$. This difference is what enables us to apply Theorem 5.15 directly in the case $H := W^{1,2}\_{\sigma}$ for the vorticity form, whereas for the velocity form the appropriate estimate is only satisfied for $H := W^{2,2}\_{\sigma}$ (see Assumption 5.8), hence the need for Theorem 5.22 in velocity form.

In the following *g* is defined by

$$g(\mathbf{x}) = \int\_{\mathcal{O}} K(\mathbf{x}, \mathbf{y}) \psi(\mathbf{y}) d\mathbf{y}$$

as in Theorem 4.2. The subsequent lemma is in analogy with Lemma 3.9.

**Lemma 4.7** *For any ε >* 0*, we have that*

$$\left| \langle \mathcal{P} \mathscr{L}\_{BS\_K} \phi - \mathcal{P} \mathscr{L}\_{BS\_K} \psi, \phi - \psi \rangle \right| \leq c(\varepsilon) \left[ \|\phi\|\_{1}^{2} + \|\psi\|\_{1}^{2} \right] \|\phi - \psi\|^{2} + \varepsilon \|\phi - \psi\|\_{1}^{2}.$$

*Proof* See Sect. 5.1.

## *4.4 Proof of Theorem 4.3*

As discussed we apply Theorem 5.15, which we do for the spaces

$$V := W^{2,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3), \qquad H := W^{1,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3), \qquad U := L^2\_{\sigma}(\mathcal{O}; \mathbb{R}^3).$$

*Proof of Theorem 4.3* The density relations are clear as $C^{\infty}\_{0,\sigma}(\mathcal{O}; \mathbb{R}^3) \subset W^{2,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$ is dense in both $W^{1,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$ and $L^2\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$. The bilinear form (70) is simply again (37). Now we shift attention to checking that the operators are measurable into the correct spaces. We note that $\mathscr{L}\_{BS\_K}$ has improved regularity properties over $\mathcal{L}$ given item 3 of Theorem 4.2, so retains the continuity with measurability following. There is no change to the Stokes Operator from Sect. 3. As for $\mathcal{P}\mathscr{L}\_i$, we in fact first show that $\mathscr{L}\_i \in C\left(W^{2,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3); W^{1,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)\right)$ (and hence is invariant under $\mathcal{P}$<sup>4</sup>). This consists of three parts: showing that it is continuous as a mapping into $W^{1,2}(\mathcal{O}; \mathbb{R}^3)$, showing the divergence free property and then the zero trace property. In fact with the appropriate regularity, it follows identically to (18) that we again have

$$\|\mathscr{L}\_i \phi\|\_{W^{k,2}}^2 \le c \|\xi\_i\|\_{W^{k+1,\infty}}^2 \|\phi\|\_{W^{k+1,2}}^2 \tag{48}$$

which addresses the continuity. The fact that $\mathscr{L}\_i\phi$ is divergence free comes immediately from the relation $\mathscr{L}\_i\phi = \operatorname{curl}(B\_i[BS\_K\phi])$ and the well established fact that the divergence of a curl is zero. For the zero trace property it is sufficient to show the existence of a sequence of compactly supported $\eta\_n \in W^{1,2}(\mathcal{O}; \mathbb{R}^3)$ which converge to $\mathscr{L}\_i\phi$ in $W^{1,2}(\mathcal{O}; \mathbb{R}^3)$. By definition of the property that $\xi\_i \in W^{2,2}\_{0}(\mathcal{O}; \mathbb{R}^3)$ there is a sequence $(\gamma\_n)$, $\gamma\_n \in C^{\infty}\_{0}(\mathcal{O}; \mathbb{R}^3)$, such that $\gamma\_n \to \xi\_i$ in $W^{2,2}(\mathcal{O}; \mathbb{R}^3)$. Evidently $\eta\_n := \mathscr{L}\_{\gamma\_n}\phi$ is again compactly supported, and observe that

$$\|\mathscr{L}\_{\gamma\_n}\phi - \mathscr{L}\_{i}\phi\|\_{W^{1,2}} = \|\mathscr{L}\_{\gamma\_n-\xi\_i}\phi\|\_{W^{1,2}} \le c \|\gamma\_{n} - \xi\_i\|\_{W^{2,2}} \|\phi\|\_{W^{2,2}}$$

from (6), which converges to zero as required to justify the zero trace property. The fact that $\mathcal{P}\mathscr{L}\_i^2 \in C\left(W^{2,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3); L^2\_{\sigma}(\mathcal{O}; \mathbb{R}^3)\right)$ again follows from the linearity, (48) and continuity of $\mathcal{P}$. We now proceed to justify the numbered assumptions of Sect. 5.3.

<sup>4</sup> As $\mathcal{P}\mathscr{L}\_i$ is equal to $\mathscr{L}\_i$ on $W^{2,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$, then $(\mathcal{P}\mathscr{L}\_i)^2$ is equal to $\mathcal{P}\mathscr{L}\_i^2$ on this space too, justifying the consistency between taking the Leray Projector and then converting from Stratonovich to Itô and vice versa as seen for (23).

$$\begin{aligned} \left| \langle \mathcal{P} \mathscr{L}\_f \phi, \phi \rangle \right| &\leq \left| \langle \mathcal{L}\_f \phi, \phi \rangle + \langle \mathcal{L}\_\phi f, \phi \rangle \right| \\ &\leq c \left[ \|f\|\_2 \|\phi\|\_1 + \|\phi\|\_1 \|f\|\_2 \right] \|\phi\|\_1 \\ &\leq c \|\phi\| \|\phi\|\_1^2 \end{aligned}$$

where the rest simply follows as in Assumption 5.9.

Assumption 5.11: We consider the different operators in turn, starting with the nonlinear term and using that

$$\begin{aligned} \left| \langle \mathcal{P} \mathscr{L}\_f \phi - \mathcal{P} \mathscr{L}\_g \psi, \eta \rangle \right| &= \left| \langle \mathcal{L}\_f \phi - \mathcal{L}\_\phi f - \mathcal{L}\_g \psi + \mathcal{L}\_\psi g, \eta \rangle \right| \\ &= \left| \langle \mathcal{L}\_{f-g} \phi + \mathcal{L}\_g (\phi - \psi) - \mathcal{L}\_{\phi - \psi} f - \mathcal{L}\_\psi (f - g), \eta \rangle \right| \end{aligned}$$

where exactly as in Lemma 4.7 we have that

$$\|\mathcal{L}\_{f-g}\phi + \mathcal{L}\_{g}(\phi - \psi) - \mathcal{L}\_{\phi - \psi}f - \mathcal{L}\_{\psi}(f - g)\| \le c \left[ \|\phi\|\_{1} + \|\psi\|\_{1} \right] \|\phi - \psi\|\_{1}$$

so in particular

$$\left| \left< \mathcal{P} \mathscr{L}\_f \phi - \mathcal{P} \mathscr{L}\_g \psi , \eta \right> \right| \le \| \eta \| \left( c \left[ \| \phi \| \_1 + \| \psi \| \_1 \right] \| \phi - \psi \| \_1 \right) . \tag{49}$$

For the Stokes Operator we simply apply Proposition (12) to see that

$$|\langle A\phi - A\psi, \eta \rangle| = |\langle A(\phi - \psi), \eta \rangle| = |\langle \phi - \psi, \eta \rangle\_{1}| \le \|\phi - \psi\|\_{1} \|\eta\|\_{1} \tag{50}$$

and for the $\mathscr{L}\_i^2$ term we use the adjoint characterisation to observe that

$$\begin{aligned} |\langle \mathscr{L}\_i^2 \phi - \mathscr{L}\_i^2 \psi, \eta \rangle| = |\langle \mathscr{L}\_i^2 (\phi - \psi), \eta \rangle| &= |\langle \mathscr{L}\_i (\phi - \psi), B\_i^{\*} \eta \rangle| \\ &\leq c \|\xi\_i\|\_{W^{1,\infty}}^2 \|\phi - \psi\|\_1 \|\eta\|\_1. \end{aligned} \tag{51}$$

Combining (49), (50) and (51) gives the result.
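One way the combination might be written out is sketched below. It assumes, beyond what is stated in the text, that $\|\eta\| \le c\|\eta\|\_1$, that the projection in the $\mathscr{L}\_i^2$ term can be moved onto the divergence-free $\eta$, and that the sum over $i$ of $\|\xi\_i\|\_{W^{1,\infty}}^2$ is finite (which follows from (38)). Here $F$ is shorthand introduced for this sketch, $F(\phi) := -\mathcal{P}\mathscr{L}\_{BS\_K\phi}\phi - \nu A\phi + \frac{1}{2}\sum\_{i=1}^{\infty}\mathcal{P}\mathscr{L}\_i^2\phi$:

$$\begin{aligned} \left| \langle F(\phi) - F(\psi), \eta \rangle \right| &\le c\left[ \|\phi\|\_1 + \|\psi\|\_1 \right] \|\phi - \psi\|\_1 \|\eta\| + \nu \|\phi - \psi\|\_1 \|\eta\|\_1 + \frac{c}{2} \sum\_{i=1}^{\infty} \|\xi\_i\|\_{W^{1,\infty}}^2 \|\phi - \psi\|\_1 \|\eta\|\_1 \\ &\le c \left( 1 + \|\phi\|\_1 + \|\psi\|\_1 \right) \|\phi - \psi\|\_1 \|\eta\|\_1, \end{aligned}$$

where the final constant $c$ depends on $\nu$ and on the sum in (38).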

## **5 Appendices**

## *5.1 Proofs from Sects. 2.3, 3.2, and 4.3*

We begin with the proofs from Sect. 2.3.

*Proof of Proposition 2.6* We begin with showing (19), tasked to control

$$\langle B\_i^2 f, f \rangle\_{W^{k,2}} + \|B\_i f\|\_{W^{k,2}}^2 \tag{52}$$

and do so with each derivative in the sum for the inner product: that is, we are looking at

$$
\langle D^{\alpha} B\_{i}^{2} f, D^{\alpha} f \rangle + \langle D^{\alpha} B\_{i} f, D^{\alpha} B\_{i} f \rangle \tag{53}
$$

where we simplify the derivatives by writing

$$\begin{aligned} D^{\alpha}B\_{\xi\_{i}}f &= \sum\_{\alpha' \le \alpha} B\_{D^{\alpha-\alpha'}\xi\_{i}}D^{\alpha'}f \\ &= \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_{i}}D^{\alpha'}f + B\_{\xi\_{i}}D^{\alpha}f. \end{aligned} \tag{54}$$

Plugging this result in, we also see that

$$\begin{aligned} D^{\alpha} B^{2}\_{\xi\_{l}} f &= D^{\alpha} B\_{\xi\_{l}} \left( B\_{\xi\_{l}} f \right) \\ &= \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_{l}} D^{\alpha'} B\_{\xi\_{l}} f + B\_{\xi\_{l}} D^{\alpha} B\_{\xi\_{l}} f \end{aligned}$$

which will use in our analysis of (53), reducing the expression to

$$\left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_{l}} D^{\alpha'} B\_{\xi\_{l}} f + B\_{\xi\_{l}} D^{\alpha} B\_{\xi\_{l}} f, D^{\alpha} f \right\rangle + \langle D^{\alpha} B\_{\xi\_{l}} f, D^{\alpha} B\_{\xi\_{l}} f \rangle$$

which we further break up in terms of the adjoint *B*∗ *ξi* :

$$\left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_{l}} D^{\alpha'} B\_{\xi\_{l}} f, D^{\alpha} f \right\rangle + \left\langle D^{\alpha} B\_{\xi\_{l}} f, B\_{\xi\_{l}}^{\*} D^{\alpha} f \right\rangle + \left\langle D^{\alpha} B\_{\xi\_{l}} f, D^{\alpha} B\_{\xi\_{l}} f \right\rangle \qquad (55)$$

requiring that $D^{\alpha}f \in W^{1,2}(\mathcal{O}; \mathbb{R}^3)$, which is satisfied by the assumption $f \in W^{k+2,2}(\mathcal{O}; \mathbb{R}^3)$. By summing the second and third inner products and using (54), this becomes

$$\left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} B\_{\xi\_i} f, D^{\alpha} f \right\rangle + \left\langle D^{\alpha} B\_{\xi\_i} f, B\_{\xi\_i}^\* D^{\alpha} f + \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} f + B\_{\xi\_i} D^{\alpha} f \right\rangle$$

which we look to simplify by combining $B\_{\xi\_i}^\*$ and $B\_{\xi\_i}$, noting that

$$B\_l^\* + B\_l = \mathcal{L}\_{\xi\_l}^\* + \mathcal{T}\_l^\* + \mathcal{L}\_{\xi\_l} + \mathcal{T}\_l = \mathcal{T}\_l^\* + \mathcal{T}\_l.$$
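The cancellation $\mathcal{L}\_{\xi\_i}^\* + \mathcal{L}\_{\xi\_i} = 0$ used here can be sketched as follows. We assume, without restating the hypotheses from the main text, that $\xi\_i$ is divergence-free and that boundary terms vanish, either by periodicity on $\mathbb{T}^3$ or because $\xi\_i$ has zero normal component on $\partial \mathcal{O}$. For sufficiently regular $f, g$, integration by parts then gives

$$\begin{aligned} \langle \mathcal{L}\_{\xi\_i} f, g \rangle &= \sum\_{j=1}^3 \int\_{\mathcal{O}} \xi\_i^j \, \partial\_j f \cdot g \, dx \\ &= -\sum\_{j=1}^3 \int\_{\mathcal{O}} (\partial\_j \xi\_i^j) \, f \cdot g \, dx - \sum\_{j=1}^3 \int\_{\mathcal{O}} \xi\_i^j \, f \cdot \partial\_j g \, dx \\ &= -\langle f, \mathcal{L}\_{\xi\_i} g \rangle, \end{aligned}$$

so that $\mathcal{L}\_{\xi\_i}^\* = -\mathcal{L}\_{\xi\_i}$ and only the zeroth order operators $\mathcal{T}\_{\xi\_i}$ and $\mathcal{T}\_{\xi\_i}^\*$ survive in the sum.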

Indeed we arrive at the expression

$$
\left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_{i}} D^{\alpha'} B\_{\xi\_{i}} f, D^{\alpha} f \right\rangle + \left\langle D^{\alpha} B\_{\xi\_{i}} f, \left( \mathcal{T}\_{\xi\_{i}} + \mathcal{T}\_{\xi\_{i}}^{\*} \right) D^{\alpha} f + \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_{i}} D^{\alpha'} f \right\rangle.$$

As we are looking to achieve control with respect to the $W^{k,2}(\mathcal{O}; \mathbb{R}^3)$ norm of $f$, it is the terms with differential operators of order greater than $k$ that concern us. Of course this was the motivating factor behind combining $B\_{\xi\_i}$ and its adjoint, nullifying the additional derivative coming from $\mathcal{L}\_{\xi\_i}$. There are more higher order terms to go though, and the strategy will be to write these in terms of commutators with a differential operator of controllable order. This will involve considering different aspects of our sum in tandem, which will be helped by using (54) to reduce our expression again to

$$
\begin{split}
&\left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} B\_{\xi\_i} f, D^{\alpha} f \right\rangle \\ &\quad + \left\langle \sum\_{\beta<\alpha} B\_{D^{\alpha-\beta}\xi\_i} D^{\beta} f + B\_{\xi\_i} D^{\alpha} f, \left( \mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^\* \right) D^{\alpha} f + \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} f \right\rangle.
\end{split}
$$

Ultimately the terms in the summand are split up into

$$\left\langle B\_{\xi\_i} D^{\alpha} f, \left( \mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^\* \right) D^{\alpha} f \right\rangle \tag{56}$$

$$+\left\langle \sum\_{\beta < \alpha} B\_{D^{\alpha-\beta}\xi\_i} D^{\beta}f, \left(\mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^\*\right) D^{\alpha}f + \sum\_{\alpha' < \alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'}f \right\rangle \tag{57}$$

$$+\sum\_{\alpha'<\alpha} \left( \langle B\_{D^{\alpha-\alpha'}\xi\_{i}} D^{\alpha'} B\_{\xi\_{i}} f, D^{\alpha} f \rangle + \langle B\_{\xi\_{i}} D^{\alpha} f, B\_{D^{\alpha-\alpha'}\xi\_{i}} D^{\alpha'} f \rangle \right) \tag{58}$$

with the intention of controlling each one individually. Firstly for a treatment of (56),

$$\begin{split}(56) &= \langle (\mathcal{L}\_{\xi\_{i}} + \mathcal{T}\_{\xi\_{i}}) D^{\alpha} f, (\mathcal{T}\_{\xi\_{i}}^{\*} + \mathcal{T}\_{\xi\_{i}}) D^{\alpha} f \rangle \\ &= \langle \mathcal{L}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}}^{\*} D^{\alpha} f \rangle + \langle \mathcal{L}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}} D^{\alpha} f \rangle \\ &\quad + \langle \mathcal{T}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}}^{\*} D^{\alpha} f \rangle + \langle \mathcal{T}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}} D^{\alpha} f \rangle \\ &= \left( \langle \mathcal{T}\_{\xi\_{i}} \mathcal{L}\_{\xi\_{i}} D^{\alpha} f, D^{\alpha} f \rangle + \langle \mathcal{L}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}} D^{\alpha} f \rangle \right) \\ &\quad + \left( \langle \mathcal{T}\_{\xi\_{i}}^{2} D^{\alpha} f, D^{\alpha} f \rangle + \langle \mathcal{T}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}} D^{\alpha} f \rangle \right). \end{split}$$

We now bound the brackets in terms of $\|D^{\alpha}f\|^2$ separately, starting with the latter one as

$$\begin{aligned} \left\langle \mathcal{T}\_{\xi\_l}^2 D^\alpha f, D^\alpha f \right\rangle &\leq \left\| \mathcal{T}\_{\xi\_l}^2 D^\alpha f \right\| \left\| D^\alpha f \right\| \leq c \left\| \xi\_l \right\|\_{W^{1,\infty}} \left\| \mathcal{T}\_{\xi\_l} D^\alpha f \right\| \left\| D^\alpha f \right\| \\ &\leq c \left\| \xi\_l \right\|\_{W^{1,\infty}}^2 \left\| D^\alpha f \right\|^2 \end{aligned}$$

and similarly

$$|\langle \mathcal{T}\_{\xi\_i} D^\alpha f, \mathcal{T}\_{\xi\_i} D^\alpha f \rangle| \le \|\mathcal{T}\_{\xi\_i} D^\alpha f \| \| \mathcal{T}\_{\xi\_i} D^\alpha f \| \le c \|\xi\_i\|\_{W^{1,\infty}}^2 \|D^\alpha f\|^2.$$

Now for the first bracket, we add and subtract a term to have an expression through the commutator of the operators:

$$
\begin{split} & \langle \mathcal{T}\_{\xi\_{i}} \mathcal{L}\_{\xi\_{i}} D^{\alpha} f, D^{\alpha} f \rangle + \langle \mathcal{L}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}} D^{\alpha} f \rangle \\ &= \langle (\mathcal{T}\_{\xi\_{i}} \mathcal{L}\_{\xi\_{i}} - \mathcal{L}\_{\xi\_{i}} \mathcal{T}\_{\xi\_{i}}) D^{\alpha} f, D^{\alpha} f \rangle + \langle \mathcal{L}\_{\xi\_{i}} \mathcal{T}\_{\xi\_{i}} D^{\alpha} f, D^{\alpha} f \rangle + \langle \mathcal{L}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}} D^{\alpha} f \rangle \\ &= \langle (\mathcal{T}\_{\xi\_{i}} \mathcal{L}\_{\xi\_{i}} - \mathcal{L}\_{\xi\_{i}} \mathcal{T}\_{\xi\_{i}}) D^{\alpha} f, D^{\alpha} f \rangle + \langle \mathcal{T}\_{\xi\_{i}} D^{\alpha} f, \mathcal{L}\_{\xi\_{i}}^{\*} D^{\alpha} f \rangle + \langle \mathcal{L}\_{\xi\_{i}} D^{\alpha} f, \mathcal{T}\_{\xi\_{i}} D^{\alpha} f \rangle \\ &= \langle (\mathcal{T}\_{\xi\_{i}} \mathcal{L}\_{\xi\_{i}} - \mathcal{L}\_{\xi\_{i}} \mathcal{T}\_{\xi\_{i}}) D^{\alpha} f, D^{\alpha} f \rangle. \end{split}
$$

The commutator term is given explicitly through

$$\begin{aligned} \mathcal{T}\_{\xi\_i} \mathcal{L}\_{\xi\_i} D^\alpha f &= \mathcal{T}\_{\xi\_i} \left( \sum\_{j=1}^3 \xi\_i^j \partial\_j D^\alpha f \right) \\ &= \sum\_{k=1}^3 \left( \sum\_{j=1}^3 \xi\_i^j \partial\_j D^\alpha f \right)^k \nabla \xi\_i^k \\ &= \sum\_{k=1}^3 \sum\_{j=1}^3 \xi\_i^j \partial\_j D^\alpha f^k \nabla \xi\_i^k \end{aligned}$$

and

$$\begin{split} \mathcal{L}\_{\xi\_i} \mathcal{T}\_{\xi\_i} \boldsymbol{D}^\alpha f &= \mathcal{L}\_{\xi\_i} \Big( \sum\_{k=1}^3 \boldsymbol{D}^\alpha f^k \nabla \xi\_i^k \Big) \\ &= \sum\_{j=1}^3 \xi\_i^j \partial\_j \Big( \sum\_{k=1}^3 \boldsymbol{D}^\alpha f^k \nabla \xi\_i^k \Big) \\ &= \sum\_{j=1}^3 \sum\_{k=1}^3 \xi\_i^j \partial\_j \left( \boldsymbol{D}^\alpha f^k \nabla \xi\_i^k \right) \\ &= \sum\_{j=1}^3 \sum\_{k=1}^3 \left( \xi\_i^j \partial\_j \boldsymbol{D}^\alpha f^k \nabla \xi\_i^k + \xi\_i^j \boldsymbol{D}^\alpha f^k \partial\_j \nabla \xi\_i^k \right) \end{split}$$

such that

$$(\mathcal{T}\_{\xi\_l}\mathcal{L}\_{\xi\_l} - \mathcal{L}\_{\xi\_l}\mathcal{T}\_{\xi\_l})D^\alpha f = -\sum\_{j=1}^3 \sum\_{k=1}^3 \xi\_l^j D^\alpha f^k \partial\_j \nabla \xi\_l^k.$$

Therefore

$$\begin{split} \|(\mathcal{T}\_{\xi\_i}\mathcal{L}\_{\xi\_i}-\mathcal{L}\_{\xi\_i}\mathcal{T}\_{\xi\_i})D^{\alpha}f\| &\leq c\sum\_{j=1}^{3}\sum\_{k=1}^{3}\|\xi\_{i}^{j}D^{\alpha}f^{k}\partial\_{j}\nabla\xi\_{i}^{k}\| \\ &\leq c\sum\_{j=1}^{3}\sum\_{k=1}^{3}\sum\_{l=1}^{3}\|\xi\_{i}^{j}D^{\alpha}f^{k}\partial\_{j}\partial\_{l}\xi\_{i}^{k}\|\_{L^{2}(\mathcal{O};\mathbb{R})} \\ &\leq c\sum\_{j=1}^{3}\sum\_{k=1}^{3}\sum\_{l=1}^{3}\|\xi\_{i}^{j}\partial\_{j}\partial\_{l}\xi\_{i}^{k}\|\_{L^{\infty}(\mathcal{O};\mathbb{R})}\|D^{\alpha}f^{k}\|\_{L^{2}(\mathcal{O};\mathbb{R})} \\ &\leq c\|\xi\_{i}\|\_{W^{2,\infty}}^{2}\sum\_{j=1}^{3}\sum\_{k=1}^{3}\sum\_{l=1}^{3}\|D^{\alpha}f^{k}\|\_{L^{2}(\mathcal{O};\mathbb{R})} \\ &\leq c\|\xi\_{i}\|\_{W^{2,\infty}}^{2}\|D^{\alpha}f\| \end{split}$$

and

$$\begin{aligned} |\langle (\mathcal{T}\_{\xi\_i} \mathcal{L}\_{\xi\_i} - \mathcal{L}\_{\xi\_i} \mathcal{T}\_{\xi\_i}) D^{\alpha} f, D^{\alpha} f \rangle| &\leq \| (\mathcal{T}\_{\xi\_i} \mathcal{L}\_{\xi\_i} - \mathcal{L}\_{\xi\_i} \mathcal{T}\_{\xi\_i}) D^{\alpha} f \| \| D^{\alpha} f \| \\ &\leq c \| \xi\_{i} \|\_{W^{2,\infty}}^{2} \| D^{\alpha} f \|^{2}. \end{aligned}$$

Combining these inequalities we determine the bound

$$(56) \leq c \|\xi\_{i}\|\_{W^{2,\infty}}^{2} \|D^{\alpha}f\|^{2}.$$

As for (57) we look to use Cauchy-Schwarz and bound each item in the inner product. Indeed straight from (18) in the $L^2(\mathcal{O}; \mathbb{R}^3)$ setting, by simply replacing $\xi\_i$ with $D^{\alpha-\beta}\xi\_i$, we have that

$$\|B\_{D^{\alpha-\beta}\xi\_i}D^{\beta}f\|^2 \le c\|D^{\alpha-\beta}\xi\_i\|\_{W^{1,\infty}}^2\|D^{\beta}f\|\_{W^{1,2}}^2 \le c\|\xi\_i\|\_{W^{k+1,\infty}}^2\|f\|\_{W^{k,2}}^2.$$

Moreover,

$$\left\| \sum\_{\beta < \alpha} B\_{D^{\alpha-\beta}\xi\_i} D^{\beta} f \right\|^2 \le c \sum\_{\beta < \alpha} \|B\_{D^{\alpha-\beta}\xi\_i} D^{\beta} f\|^2 \le c \|\xi\_i\|\_{W^{k+1,\infty}}^2 \|f\|\_{W^{k,2}}^2.$$

In addition to this,

$$\| (\mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^\*) D^\alpha f \| \le \|\mathcal{T}\_{\xi\_i} D^\alpha f \| + \|\mathcal{T}\_{\xi\_i}^\* D^\alpha f \| \le c \|\xi\_i\|\_{W^{k+1,\infty}} \|D^\alpha f\|$$

using the equivalence in operator norm of the adjoint. Together this amounts to

$$\begin{split}(57) &\leq \left\| \sum\_{\beta < \alpha} B\_{D^{\alpha-\beta}\xi\_i} D^{\beta} f \right\| \left\| \left( \mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^{\*} \right) D^{\alpha} f + \sum\_{\alpha' < \alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} f \right\| \\ &\leq \left\| \sum\_{\beta < \alpha} B\_{D^{\alpha-\beta}\xi\_i} D^{\beta} f \right\| \left( \left\| \left( \mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^{\*} \right) D^{\alpha} f \right\| + \left\| \sum\_{\alpha' < \alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} f \right\| \right) \\ &\leq c \left\| \xi\_{i} \right\|\_{W^{k+1,\infty}} \left\| f \right\|\_{W^{k,2}} \left( c \left\| \xi\_{i} \right\|\_{W^{k+1,\infty}} \left\| D^{\alpha} f \right\| + c \left\| \xi\_{i} \right\|\_{W^{k+1,\infty}} \left\| f \right\|\_{W^{k,2}} \right) \\ &\leq c \left\| \xi\_{i} \right\|\_{W^{k+1,\infty}}^2 \left\| f \right\|\_{W^{k,2}}^2. \end{split}$$

Let's now turn our attention to (58), which for each $\alpha'$ in the sum we rewrite as

$$
\langle D^{\alpha}f, B\_{D^{\alpha-\alpha'}\xi\_{i}}D^{\alpha'}B\_{\xi\_{i}}f + B\_{\xi\_{i}}^{\*}B\_{D^{\alpha-\alpha'}\xi\_{i}}D^{\alpha'}f \rangle \tag{59}
$$

and employing (54) again we see this becomes

$$\begin{split} & \left\langle D^{\alpha}f, B\_{D^{\alpha-\alpha'}\xi\_{i}} \left( \sum\_{\beta < a'} B\_{D^{\alpha'-\beta}\xi\_{i}} D^{\beta}f + B\_{\xi\_{i}} D^{\alpha'}f \right) + B\_{\xi\_{i}}^{\*} B\_{D^{\alpha-\alpha'}\xi\_{i}} D^{\alpha'}f \right\rangle \\ & = \left\langle D^{\alpha}f, \sum\_{\beta < a'} B\_{D^{\alpha-\alpha'}\xi\_{i}} B\_{D^{\alpha'-\beta}\xi\_{i}} D^{\beta}f \right\rangle \\ & \quad + \langle D^{\alpha}f, B\_{D^{\alpha-\alpha'}\xi\_{i}} B\_{\xi\_{i}} D^{\alpha'}f + B\_{\xi\_{i}}^{\*} B\_{D^{\alpha-\alpha'}\xi\_{i}} D^{\alpha'}f \rangle. \end{split}$$

We have split up these terms to make our approach clearer, as the two right hand sides of the inner products will be considered separately. For the first inner product, two applications of (18) give that

$$\begin{split} \|\mathcal{B}\_{D^{\alpha-\alpha'}\xi\_{l}}\mathcal{B}\_{D^{\alpha'-\beta}\xi\_{l}}D^{\beta}f\|^{2} &\leq c\|D^{\alpha-\alpha'}\xi\_{l}\|\_{W^{1,\infty}}^{2}\|\mathcal{B}\_{D^{\alpha'-\beta}\xi\_{l}}D^{\beta}f\|\_{W^{1,2}}^{2} \\ &\leq c\|D^{\alpha-\alpha'}\xi\_{l}\|\_{W^{1,\infty}}^{2}\Big(c\|D^{\alpha'-\beta}\xi\_{l}\|\_{W^{2,\infty}}^{2}\|D^{\beta}f\|\_{W^{2,2}}^{2}\Big) \\ &\leq c\|\xi\_{l}\|\_{W^{k+1,\infty}}^{4}\|f\|\_{W^{k,2}}^{2} \end{split}$$

Moreover,

$$\begin{aligned} \left\| \sum\_{\beta < \alpha'} B\_{D^{\alpha-\alpha'}\xi\_i} B\_{D^{\alpha'-\beta}\xi\_i} D^{\beta} f \right\|^2 &\leq c \sum\_{\beta < \alpha'} \| B\_{D^{\alpha-\alpha'}\xi\_i} B\_{D^{\alpha'-\beta}\xi\_i} D^{\beta} f \|^2 \\ &\leq c \| \xi\_i \|\_{W^{k+1,\infty}}^4 \| f \|\_{W^{k,2}}^2. \end{aligned}$$

As for the second inner product, we rewrite the right side as

$$B\_{D^{\alpha-\alpha'}\xi\_{l}}((\mathcal{L}\_{\xi\_{l}}+\mathcal{T}\_{\xi\_{l}})D^{\alpha'}f) + (\mathcal{L}\_{\xi\_{l}}^{\*}+\mathcal{T}\_{\xi\_{l}}^{\*})B\_{D^{\alpha-\alpha'}\xi\_{l}}D^{\alpha'}f$$

and further

$$(B\_{D^{\alpha-\alpha'}\xi\_{i}}\mathcal{L}\_{\xi\_{i}} - \mathcal{L}\_{\xi\_{i}}B\_{D^{\alpha-\alpha'}\xi\_{i}})D^{\alpha'}f + B\_{D^{\alpha-\alpha'}\xi\_{i}}\mathcal{T}\_{\xi\_{i}}D^{\alpha'}f + \mathcal{T}\_{\xi\_{i}}^{\*}B\_{D^{\alpha-\alpha'}\xi\_{i}}D^{\alpha'}f. \tag{60}$$

Starting with the latter two terms,

$$\begin{aligned} \|B\_{D^{\alpha-\alpha'}\xi\_{i}}\mathcal{T}\_{\xi\_{i}}D^{\alpha'}f\|^{2} &\leq c\|D^{\alpha-\alpha'}\xi\_{i}\|\_{W^{1,\infty}}^{2}\|\mathcal{T}\_{\xi\_{i}}D^{\alpha'}f\|\_{W^{1,2}}^{2} \\ &\leq c\|D^{\alpha-\alpha'}\xi\_{i}\|\_{W^{1,\infty}}^{2}\left(c\|\xi\_{i}\|\_{W^{2,\infty}}^{2}\|D^{\alpha'}f\|\_{W^{1,2}}^{2}\right) \\ &\leq c\|\xi\_{i}\|\_{W^{k+1,\infty}}^{4}\|f\|\_{W^{k,2}}^{2} \end{aligned}$$

and likewise

$$\begin{split} \|\mathcal{T}\_{\xi\_{l}}^{\*}\mathcal{B}\_{D^{\alpha-\alpha'}\xi\_{l}}D^{\alpha'}f\|^{2} &\leq c\|\xi\_{l}\|\_{W^{1,\infty}}^{2}\|\mathcal{B}\_{D^{\alpha-\alpha'}\xi\_{l}}D^{\alpha'}f\|^{2} \\ &\leq c\|\xi\_{l}\|\_{W^{1,\infty}}^{2}\Big(c\|D^{\alpha-\alpha'}\xi\_{l}\|\_{W^{1,\infty}}^{2}\|D^{\alpha'}f\|\_{W^{1,2}}^{2}\Big) \\ &\leq c\|\xi\_{l}\|\_{W^{k+1,\infty}}^{4}\|f\|\_{W^{k,2}}^{2} .\end{split}$$

Now we show explicitly that the commutator in

$$(B\_{D^{\alpha-\alpha'}\xi\_{l}}\mathcal{L}\_{\xi\_{l}} - \mathcal{L}\_{\xi\_{l}}B\_{D^{\alpha-\alpha'}\xi\_{l}})D^{\alpha'}f\tag{61}$$

from (60) is of first order (so of $k$th order when composed with $D^{\alpha'}$), through the expressions

$$\begin{split} B\_{D^{\alpha-\alpha'}\xi\_{i}}\mathcal{L}\_{\xi\_{i}}D^{\alpha'}f &= \sum\_{j=1}^{3} \Big( D^{\alpha-\alpha'}\xi\_{i}^{j}\,\partial\_{j} \Big( \sum\_{k=1}^{3} \xi\_{i}^{k}\,\partial\_{k}D^{\alpha'}f \Big) + \Big( \sum\_{k=1}^{3} \xi\_{i}^{k}\,\partial\_{k}D^{\alpha'}f \Big)^{j} \nabla D^{\alpha-\alpha'}\xi\_{i}^{j} \Big) \\ &= \sum\_{j=1}^{3} \sum\_{k=1}^{3} \Big( D^{\alpha-\alpha'}\xi\_{i}^{j}\,\partial\_{j}\xi\_{i}^{k}\,\partial\_{k}D^{\alpha'}f + D^{\alpha-\alpha'}\xi\_{i}^{j}\,\xi\_{i}^{k}\,\partial\_{j}\partial\_{k}D^{\alpha'}f \\ &\qquad\qquad + \xi\_{i}^{k}\,\partial\_{k}D^{\alpha'}f^{j}\,\nabla D^{\alpha-\alpha'}\xi\_{i}^{j} \Big) \end{split}$$

and

$$\begin{split} \mathcal{L}\_{\xi\_i} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} f &= \sum\_{k=1}^3 \xi\_i^k \partial\_k \Big( \sum\_{j=1}^3 \big( D^{\alpha-\alpha'} \xi\_i^j \, \partial\_j D^{\alpha'} f + D^{\alpha'} f^j \nabla D^{\alpha-\alpha'} \xi\_i^j \big) \Big) \\ &= \sum\_{j=1}^{3} \sum\_{k=1}^{3} \Big( \xi\_{i}^{k} \partial\_{k} D^{\alpha - \alpha'} \xi\_{i}^{j} \, \partial\_{j} D^{\alpha'} f + \xi\_{i}^{k} D^{\alpha - \alpha'} \xi\_{i}^{j} \, \partial\_{k} \partial\_{j} D^{\alpha'} f \\ &\qquad\qquad + \xi\_{i}^{k} \partial\_{k} D^{\alpha'} f^{j} \, \nabla D^{\alpha - \alpha'} \xi\_{i}^{j} + \xi\_{i}^{k} D^{\alpha'} f^{j} \, \partial\_{k} \nabla D^{\alpha - \alpha'} \xi\_{i}^{j} \Big) \end{split}$$

such that

$$\begin{split}(61) &= \sum\_{j=1}^{3} \sum\_{k=1}^{3} \Big( D^{\alpha-\alpha'} \xi\_{i}^{j} \, \partial\_{j} \xi\_{i}^{k} \, \partial\_{k} D^{\alpha'} f - \xi\_{i}^{k} \partial\_{k} D^{\alpha-\alpha'} \xi\_{i}^{j} \, \partial\_{j} D^{\alpha'} f \\ &\qquad\qquad - \xi\_{i}^{k} D^{\alpha'} f^{j} \, \partial\_{k} \nabla D^{\alpha-\alpha'} \xi\_{i}^{j} \Big) \end{split}$$

and in particular

$$\begin{split} \|(61)\|^2 &\leq c \sum\_{j=1}^{3} \sum\_{k=1}^{3} \Big( \|D^{\alpha-\alpha'}\xi\_i^j \, \partial\_j \xi\_i^k \, \partial\_k D^{\alpha'} f\|^2 + \|\xi\_i^k \partial\_k D^{\alpha-\alpha'}\xi\_i^j \, \partial\_j D^{\alpha'} f\|^2 + \|\xi\_i^k D^{\alpha'} f^j \, \partial\_k \nabla D^{\alpha-\alpha'}\xi\_i^j\|^2 \Big) \\ &= c \sum\_{j=1}^{3} \sum\_{k=1}^{3} \sum\_{l=1}^{3} \Big( \|D^{\alpha-\alpha'}\xi\_i^j \, \partial\_j \xi\_i^k \, \partial\_k D^{\alpha'} f^l\|\_{L^2(\mathcal{O};\mathbb{R})}^2 + \|\xi\_i^k \partial\_k D^{\alpha-\alpha'}\xi\_i^j \, \partial\_j D^{\alpha'} f^l\|\_{L^2(\mathcal{O};\mathbb{R})}^2 \\ &\qquad\qquad\qquad + \|\xi\_i^k D^{\alpha'} f^j \, \partial\_k \partial\_l D^{\alpha-\alpha'}\xi\_i^j\|\_{L^2(\mathcal{O};\mathbb{R})}^2 \Big) \\ &\leq c \|\xi\_i\|\_{W^{k+2,\infty}}^4 \sum\_{j=1}^{3} \sum\_{k=1}^{3} \sum\_{l=1}^{3} \Big( \|\partial\_k D^{\alpha'} f^l\|\_{L^2(\mathcal{O};\mathbb{R})}^2 + \|\partial\_j D^{\alpha'} f^l\|\_{L^2(\mathcal{O};\mathbb{R})}^2 + \|D^{\alpha'} f^j\|\_{L^2(\mathcal{O};\mathbb{R})}^2 \Big) \\ &\leq c \|\xi\_i\|\_{W^{k+2,\infty}}^4 \Big( \sum\_{j=1}^{3} \|f\|\_{W^{k,2}}^2 + \sum\_{k=1}^{3} \|f\|\_{W^{k,2}}^2 + \sum\_{j=1}^{3} \sum\_{k=1}^{3} \|D^{\alpha'} f\|^2 \Big) \\ &\leq c \|\xi\_i\|\_{W^{k+2,\infty}}^4 \|f\|\_{W^{k,2}}^2. \end{split}$$

Finally now we can piece these four inequalities together to produce a bound on (59):


$$\begin{split} |(59)| &\le \|D^{\alpha}f\| \left\| \sum\_{\beta<\alpha'} B\_{D^{\alpha-\alpha'}\xi\_i} B\_{D^{\alpha'-\beta}\xi\_i} D^{\beta}f + (60) \right\| \\ &\le \|D^{\alpha}f\| \Big( \Big\| \sum\_{\beta<\alpha'} B\_{D^{\alpha-\alpha'}\xi\_i} B\_{D^{\alpha'-\beta}\xi\_i} D^{\beta}f \Big\| + \|B\_{D^{\alpha-\alpha'}\xi\_i} \mathcal{T}\_{\xi\_i} D^{\alpha'}f \| \\ &\qquad\qquad + \|\mathcal{T}\_{\xi\_i}^{\*} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'}f \| + \|(61)\| \Big) \\ &\le c \|D^{\alpha}f\| \Big( \|\xi\_i\|\_{W^{k+1,\infty}}^2 \|f\|\_{W^{k,2}} + \|\xi\_i\|\_{W^{k+1,\infty}}^2 \|f\|\_{W^{k,2}} \\ &\qquad\qquad + \|\xi\_i\|\_{W^{k+1,\infty}}^2 \|f\|\_{W^{k,2}} + \|\xi\_i\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}} \Big) \\ &\le c \|D^{\alpha}f\| \|\xi\_i\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}} \\ &\le c \|\xi\_i\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}}^2 \end{split}$$

and subsequently of (58):

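$$(58) = \sum\_{\alpha'<\alpha} (59) \leq c \sum\_{\alpha'<\alpha} \|\xi\_i\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}}^2 \leq c \|\xi\_i\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}}^2.$$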

We can now conclude the proof of (19) by observing that

$$\begin{split} (52) &= \sum\_{|\alpha| \le k} (53) \\ &= \sum\_{|\alpha| \le k} \left( (56) + (57) + (58) \right) \\ &\le c \sum\_{|\alpha| \le k} \left( \|\xi\_{i}\|\_{W^{2,\infty}}^2 \|D^{\alpha}f\|^2 + \|\xi\_{i}\|\_{W^{k+1,\infty}}^2 \|f\|\_{W^{k,2}}^2 + \|\xi\_{i}\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}}^2 \right) \\ &\le c \sum\_{|\alpha| \le k} \|\xi\_{i}\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}}^2 \\ &= c \left\|\xi\_{i}\right\|\_{W^{k+2,\infty}}^2 \|f\|\_{W^{k,2}}^2. \end{split}$$

As for (20), using (54) once more, we see that for each *α* in the sum for the inner product,

$$\begin{aligned} |\langle D^{\alpha}B\_{i}f, D^{\alpha}f\rangle| &= \left| \left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_{i}}D^{\alpha'}f + B\_{\xi\_{i}}D^{\alpha}f, D^{\alpha}f \right\rangle \right| \\ &= \left| \left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} f, D^{\alpha} f \right\rangle + \langle B\_{\xi\_i} D^{\alpha} f, D^{\alpha} f \rangle \right| \\ &\leq \left| \left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_i} D^{\alpha'} f, D^{\alpha} f \right\rangle \right| + |\langle \mathcal{T}\_{\xi\_i} D^{\alpha} f, D^{\alpha} f \rangle| \end{aligned}$$

using the cancellation from (15) to dispose of the order *k* + 1 term. In our treatment of (57) in (19), we showed the bound

$$\left\| \sum\_{\beta < \alpha} B\_{D^{\alpha-\beta}\xi\_i} D^{\beta} f \right\| \leq c \| \xi\_i \|\_{W^{k+1,\infty}} \| f \|\_{W^{k,2}}$$

and therefore

$$\begin{aligned} \left| \left\langle \sum\_{\alpha'<\alpha} B\_{D^{\alpha-\alpha'}\xi\_l} D^{\alpha'} f, D^{\alpha} f \right\rangle \right| &\leq c \|\xi\_l\|\_{W^{k+1,\infty}} \|f\|\_{W^{k,2}} \|D^{\alpha} f\|\_{L^2} \\ &\leq c \|\xi\_l\|\_{W^{k+1,\infty}} \|f\|\_{W^{k,2}}^2 \end{aligned}$$

whilst a simple bound on the second term yields the result.
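For completeness, the simple bound on the second term can be sketched as follows, assuming only the zeroth order estimate $\|\mathcal{T}\_{\xi\_i} h\| \le c \|\xi\_i\|\_{W^{1,\infty}} \|h\|$ already used in the treatment of (56):

$$|\langle \mathcal{T}\_{\xi\_i} D^{\alpha} f, D^{\alpha} f \rangle| \le \|\mathcal{T}\_{\xi\_i} D^{\alpha} f\| \|D^{\alpha} f\| \le c \|\xi\_i\|\_{W^{1,\infty}} \|D^{\alpha} f\|^2 \le c \|\xi\_i\|\_{W^{1,\infty}} \|f\|\_{W^{k,2}}^2.$$

Summing both contributions over $|\alpha| \le k$ then controls $|\langle B\_i f, f \rangle\_{W^{k,2}}|$ by $c \|\xi\_i\|\_{W^{k+1,\infty}} \|f\|\_{W^{k,2}}^2$, which we take to be the content of (20); that statement is not restated in this appendix.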

*Proof of Lemma 2.7* For $\nabla g \in L^{2,\perp}\_{\sigma}(\mathcal{O}; \mathbb{R}^3) \cap W^{1,2}(\mathcal{O}; \mathbb{R}^3)$,

$$\begin{aligned} B\_i(\nabla g) &= \mathcal{L}\_{\xi\_i}(\nabla g) + \mathcal{T}\_{\xi\_i}(\nabla g) \\ &= \sum\_{j=1}^3 \xi\_i^j \partial\_j(\nabla g) + \sum\_{j=1}^3 \partial\_j g \nabla \xi\_i^j \\ &= \sum\_{j=1}^3 \xi\_i^j (\nabla \partial\_j g) + \sum\_{j=1}^3 (\nabla \xi\_i^j) \partial\_j g \\ &= \nabla \sum\_{j=1}^3 \xi\_i^j \partial\_j g \\ &\in L\_\sigma^{2, \perp}(\mathcal{O}; \mathbb{R}^3) \end{aligned}$$

using in the last line the assumption that $\nabla g \in W^{1,2}(\mathcal{O}; \mathbb{R}^3)$ and that $\xi\_i \in W^{1,\infty}(\mathcal{O}; \mathbb{R}^3)$. We now make a distinction between the settings of $\mathbb{T}^3$ and $\mathcal{O}$. For the bounded domain $\mathcal{O}$, we take any $f \in W^{1,2}(\mathcal{O}; \mathbb{R}^3)$ and use the representation (9) to see that

$$\mathcal{P}B\_i f = \mathcal{P}B\_i \left(\mathcal{P}f + \nabla g\right) = \mathcal{P}B\_i \mathcal{P}f + \mathcal{P}(B\_i \nabla g) = \mathcal{P}B\_i \mathcal{P}f$$

as required, using again the $W^{1,2}(\mathcal{O}; \mathbb{R}^3)$ regularity of both components of the decomposition (9). In the case of the torus we must address the constant term in the decomposition (10), appreciating that

$$B\_l c = \mathcal{T}\_{\xi\_l} c = \sum\_{j=1}^3 c^j \nabla \xi\_l^j = \nabla \sum\_{j=1}^3 c^j \xi\_l^j \in L^{2,\perp}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$$

so the result follows in the same manner.
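Spelled out, and assuming that (10) decomposes $f \in W^{1,2}(\mathbb{T}^3; \mathbb{R}^3)$ as $f = \mathcal{P}f + \nabla g + c$ for a constant vector $c$ (this is our reading of the sentence above; (10) itself is not restated here), the computation on the torus would read

$$\mathcal{P}B\_i f = \mathcal{P}B\_i \mathcal{P}f + \mathcal{P}(B\_i \nabla g) + \mathcal{P}(B\_i c) = \mathcal{P}B\_i \mathcal{P}f,$$

where the last two terms vanish because $B\_i \nabla g$ and $B\_i c$ are both gradients, hence belong to $L^{2,\perp}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$ and are annihilated by $\mathcal{P}$.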

We now state two results used in proving the Lemmas of Sects. 3.2 and 4.3. The first is derived from the Gagliardo-Nirenberg Inequality, whilst the second is proved below.

**Proposition 5.1** *There exists a constant* $c$ *such that for any* $f \in W^{1,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$ *and* $g \in W^{2,2}\_{\sigma}(\mathbb{T}^3; \mathbb{R}^3)$*,*

$$\|\mathcal{L}\_f g\| \le c \|f\|\_1 \|g\|\_1^{1/2} \|g\|\_2^{1/2}.\tag{62}$$

*Meanwhile for* $f \in W^{1,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$ *and* $g \in W^{2,2}\_{\sigma}(\mathcal{O}; \mathbb{R}^3)$ *we have that*

$$\|\mathcal{L}\_f g\| \le c \|f\|\_1 \left( \|g\|\_1^{1/2} \|g\|\_2^{1/2} + \|g\|\_1 \right). \tag{63}$$

**Proposition 5.2** *There exists a constant* $c$ *such that for every* $f \in W^{3,2}(\mathcal{O}; \mathbb{R}^3)$*,*

$$\left\|[\Delta, B\_{i}]f\right\|^{2} \leq c\left\|\xi\_{i}\right\|\_{W^{3,\infty}}^{2}\left\|f\right\|\_{W^{2,2}}^{2}$$

*where* $[\Delta, B\_i]$ *is the commutator*

$$[\Delta, B\_i] := \Delta B\_i - B\_i \Delta.$$

*Proof* We fix any such *f* and first show that

$$[\Delta, B\_{i}]f = \sum\_{k=1}^{3} \sum\_{j=1}^{3} \left(\partial\_{k}^{2} \xi\_{i}^{j} \, \partial\_{j} f + 2 \partial\_{k} \xi\_{i}^{j} \, \partial\_{k} \partial\_{j} f + 2 \partial\_{k} f^{j} \, \partial\_{k} \nabla \xi\_{i}^{j} + f^{j} \partial\_{k}^{2} \nabla \xi\_{i}^{j}\right). \tag{64}$$

Indeed

$$\begin{aligned} \Delta B\_i f &= \sum\_{k=1}^3 \partial\_k^2 \Big( \sum\_{j=1}^3 \big( \xi\_i^j \partial\_j f + f^j \nabla \xi\_i^j \big) \Big) \\ &= \sum\_{k=1}^3 \partial\_k \Big( \sum\_{j=1}^3 \big( \partial\_k \xi\_i^j \, \partial\_j f + \xi\_i^j \, \partial\_k \partial\_j f + \partial\_k f^j \, \nabla \xi\_i^j + f^j \partial\_k \nabla \xi\_i^j \big) \Big) \\ &= \sum\_{k=1}^3 \sum\_{j=1}^3 \Big( \partial\_k^2 \xi\_i^j \, \partial\_j f + 2 \partial\_k \xi\_i^j \, \partial\_k \partial\_j f + \xi\_i^j \, \partial\_k^2 \partial\_j f + \partial\_k^2 f^j \, \nabla \xi\_i^j \\ &\qquad\qquad + 2 \partial\_k f^j \, \partial\_k \nabla \xi\_i^j + f^j \partial\_k^2 \nabla \xi\_i^j \Big) \end{aligned}$$

and

$$\begin{aligned} B\_i \Delta f &= \sum\_{j=1}^3 \left( \xi\_i^j \partial\_j \left( \sum\_{k=1}^3 \partial\_k^2 f \right) + \left( \sum\_{k=1}^3 \partial\_k^2 f \right)^j \nabla \xi\_i^j \right) \\ &= \sum\_{k=1}^3 \sum\_{j=1}^3 \left( \xi\_i^j \, \partial\_k^2 \partial\_j f + \partial\_k^2 f^j \, \nabla \xi\_i^j \right) \end{aligned}$$

therefore

$$\begin{aligned} [\Delta, B\_i]f &= \Delta B\_i f - B\_i \Delta f \\ &= \sum\_{k=1}^3 \sum\_{j=1}^3 \left( \partial\_k^2 \xi\_i^j \, \partial\_j f + 2 \partial\_k \xi\_i^j \, \partial\_k \partial\_j f + 2 \partial\_k f^j \, \partial\_k \nabla \xi\_i^j + f^j \partial\_k^2 \nabla \xi\_i^j \right) \end{aligned}$$

justifying (64). The result then follows with direct calculation:

$$\begin{split} \|[\Delta, B\_i]f\|^2 &= \Big\| \sum\_{k=1}^{3} \sum\_{j=1}^{3} \big( \partial\_k^2 \xi\_i^j \, \partial\_j f + 2 \partial\_k \xi\_i^j \, \partial\_k \partial\_j f + 2 \partial\_k f^j \, \partial\_k \nabla \xi\_i^j + f^j \partial\_k^2 \nabla \xi\_i^j \big) \Big\|^2 \\ &\leq c \sum\_{k=1}^{3} \sum\_{j=1}^{3} \big\| \partial\_k^2 \xi\_i^j \, \partial\_j f + 2 \partial\_k \xi\_i^j \, \partial\_k \partial\_j f + 2 \partial\_k f^j \, \partial\_k \nabla \xi\_i^j + f^j \partial\_k^2 \nabla \xi\_i^j \big\|^2 \\ &= c \sum\_{k=1}^{3} \sum\_{j=1}^{3} \sum\_{l=1}^{3} \big\| \partial\_k^2 \xi\_i^j \, \partial\_j f^l + 2 \partial\_k \xi\_i^j \, \partial\_k \partial\_j f^l + 2 \partial\_k f^j \, \partial\_k \partial\_l \xi\_i^j + f^j \partial\_k^2 \partial\_l \xi\_i^j \big\|\_{L^2(\mathcal{O};\mathbb{R})}^2 \\ &\leq c \sum\_{k=1}^{3} \sum\_{j=1}^{3} \sum\_{l=1}^{3} \Big( \| \partial\_k^2 \xi\_i^j \, \partial\_j f^l \|\_{L^2(\mathcal{O};\mathbb{R})}^2 + \| 2 \partial\_k \xi\_i^j \, \partial\_k \partial\_j f^l \|\_{L^2(\mathcal{O};\mathbb{R})}^2 \\ &\qquad\qquad\qquad + \| 2 \partial\_k f^j \, \partial\_k \partial\_l \xi\_i^j \|\_{L^2(\mathcal{O};\mathbb{R})}^2 + \| f^j \partial\_k^2 \partial\_l \xi\_i^j \|\_{L^2(\mathcal{O};\mathbb{R})}^2 \Big) \\ &\leq c \| \xi\_{i} \|\_{W^{3,\infty}}^{2} \sum\_{k=1}^{3} \sum\_{j=1}^{3} \sum\_{l=1}^{3} \Big( \| \partial\_{j} f^{l} \|\_{L^{2}(\mathcal{O};\mathbb{R})}^{2} + \| \partial\_{k} \partial\_{j} f^{l} \|\_{L^{2}(\mathcal{O};\mathbb{R})}^{2} \\ &\qquad\qquad\qquad + \| \partial\_{k} f^{j} \|\_{L^{2}(\mathcal{O};\mathbb{R})}^{2} + \| f^{j} \|\_{L^{2}(\mathcal{O};\mathbb{R})}^{2} \Big) \\ &\leq c \| \xi\_{i} \|\_{W^{3,\infty}}^{2} \| f \|\_{W^{2,2}}^{2} . \end{split}$$

We now prove Lemma 3.7 through the properties (26), (27) and (28) independently.

*Proof of (26)* We use the established result that for $k = 2$ the Sobolev space $W^{k,2}(\mathcal{O}; \mathbb{R})$ is an algebra (a result first presented in [46]), to deduce that

$$\|\mathcal{L}\_{f\_n} f\_n\|\_{W^{2,2}} = \left\| \sum\_{j=1}^3 f\_n^j \partial\_j f\_n \right\|\_{W^{2,2}} \le c \|f\_n\|\_2 \|f\_n\|\_3.$$

From here we simply use that $\mathcal{P}\_n$ is self-adjoint and Young's Inequality to see that

$$\left| \langle \mathcal{P}\_n \mathcal{P} \mathcal{L}\_{f\_n} f\_n, f\_n \rangle\_2 \right| = \left| \langle \mathcal{P} \mathcal{L}\_{f\_n} f\_n, f\_n \rangle\_2 \right| \le c(\varepsilon) \|f\_n\|\_2^4 + \varepsilon \|f\_n\|\_3^2.$$
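The final inequality can be unpacked as follows. This is only a sketch, assuming that $\|\cdot\|\_2$ and $\|\cdot\|\_3$ are equivalent to the $W^{2,2}$ and $W^{3,2}$ norms on the relevant spaces and that $\mathcal{P}$ is bounded with respect to $\|\cdot\|\_2$; neither fact is restated in this appendix. By Cauchy-Schwarz and the algebra estimate above,

$$\left| \langle \mathcal{P} \mathcal{L}\_{f\_n} f\_n, f\_n \rangle\_2 \right| \le \|\mathcal{P} \mathcal{L}\_{f\_n} f\_n\|\_2 \|f\_n\|\_2 \le c \|f\_n\|\_2^2 \|f\_n\|\_3,$$

and Young's Inequality in the form $ab \le \frac{1}{4\varepsilon} a^2 + \varepsilon b^2$, applied with $a = c \|f\_n\|\_2^2$ and $b = \|f\_n\|\_3$, gives

$$c \|f\_n\|\_2^2 \|f\_n\|\_3 \le \frac{c^2}{4\varepsilon} \|f\_n\|\_2^4 + \varepsilon \|f\_n\|\_3^2,$$

so that one admissible choice is $c(\varepsilon) = c^2 / (4\varepsilon)$.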

*Proof of (27)* As the $\mathcal{P}\_n$ are self-adjoint we can readily justify the inequality

$$\langle \mathcal{P}\_n \mathcal{P} B\_i^2 f\_n, f\_n \rangle\_1 + \langle \mathcal{P}\_n \mathcal{P} B\_i f\_n, \mathcal{P}\_n \mathcal{P} B\_i f\_n \rangle\_1 \leq \langle \mathcal{P} B\_i^2 f\_n, f\_n \rangle\_1 + \langle \mathcal{P} B\_i f\_n, \mathcal{P} B\_i f\_n \rangle\_1$$

and moreover from (12) that this is just

$$
\langle \mathcal{P}B\_i^2 f\_n, Af\_n \rangle + \langle \mathcal{P}B\_i f\_n, A\mathcal{P}B\_i f\_n \rangle.
$$

We rewrite this as

$$
\langle \left(\mathcal{P}\mathcal{B}\_{l}\right)^{2} f\_{n}, A f\_{n} \rangle + \langle \mathcal{P}\mathcal{B}\_{l} f\_{n}, A B\_{l} f\_{n} \rangle
$$

and further as

$$
\langle \mathcal{P}B\_l f\_n, B\_l^\* A f\_n \rangle - \langle \mathcal{P}B\_l f\_n, \Delta B\_l f\_n \rangle \tag{65}
$$

for the adjoint $B\_i^\* = \mathcal{L}\_{\xi\_i}^\* + \mathcal{T}\_{\xi\_i}^\*$. We look to commute the Laplacian and $B\_i$, using Proposition 5.2 and subsequently the cancellation of the derivative in $B\_i$ when considered with the adjoint. Indeed,

$$\begin{split}-\langle\mathcal{P}B\_{l}f\_{n},\Delta B\_{l}f\_{n}\rangle &= -\langle\mathcal{P}B\_{l}f\_{n},([\Delta,\,\mathcal{B}\_{l}]+\mathcal{B}\_{l}\Delta)f\_{n}\rangle\\ &= -\langle\mathcal{P}B\_{l}f\_{n},[\Delta,\,\mathcal{B}\_{l}]f\_{n}\rangle - \langle\mathcal{P}B\_{l}f\_{n},\mathcal{P}B\_{l}\Delta f\_{n}\rangle\\ &= -\langle\mathcal{P}B\_{l}f\_{n},[\Delta,\,\mathcal{B}\_{l}]f\_{n}\rangle + \langle\mathcal{P}B\_{l}f\_{n},\mathcal{P}B\_{l}Af\_{n}\rangle\\ &= -\langle\mathcal{P}B\_{l}f\_{n},[\Delta,\,\mathcal{B}\_{l}]f\_{n}\rangle + \langle\mathcal{P}B\_{l}f\_{n},\mathcal{B}\_{l}Af\_{n}\rangle. \end{split}$$

Thus (65) becomes

$$
\langle \mathcal{P}B\_l f\_n, B\_l^\* A f\_n \rangle - \langle \mathcal{P}B\_l f\_n, [\Delta, B\_l] f\_n \rangle + \langle \mathcal{P}B\_l f\_n, B\_l A f\_n \rangle
$$

or simply

$$\langle \mathcal{P}B\_l f\_n, (\mathcal{T}\_{\xi\_l} + \mathcal{T}\_{\xi\_l}^\*)Af\_n - [\Delta, B\_l]f\_n \rangle \tag{66}$$

which we look to bound through Cauchy-Schwarz and the results of (18) and Proposition 5.2 to see that

$$\begin{split}(66) &\leq \|\mathcal{P}B\_{i}f\_{n}\|\left(\|(\mathcal{T}\_{\xi\_{i}}+\mathcal{T}\_{\xi\_{i}}^{\*})Af\_{n}\|+\|[\Delta, B\_{i}]f\_{n}\|\right) \\ &\leq c\|\xi\_{i}\|\_{W^{1,\infty}}\|f\_{n}\|\_{W^{1,2}}\left(\|\xi\_{i}\|\_{W^{1,\infty}}\|Af\_{n}\|+\|\xi\_{i}\|\_{W^{3,\infty}}\|f\_{n}\|\_{W^{2,2}}\right) \\ &\leq c\|\xi\_{i}\|\_{W^{1,\infty}}\|f\_{n}\|\_{1}\left(\|\xi\_{i}\|\_{W^{1,\infty}}\|f\_{n}\|\_{2}+\|\xi\_{i}\|\_{W^{3,\infty}}\|f\_{n}\|\_{2}\right) \\ &\leq c(\varepsilon)\|\xi\_{i}\|\_{W^{3,\infty}}^{2}\|f\_{n}\|\_{1}^{2}+\varepsilon\|\xi\_{i}\|\_{W^{3,\infty}}^{2}\|f\_{n}\|\_{2}^{2} \end{split}$$

as required.

*Proof of (28)* As with (27) we can immediately say that

$$
\begin{split}
\langle \mathcal{P}\_n \mathcal{P} \mathcal{B}\_l^2 f\_n, f\_n \rangle\_2 &+ \langle \mathcal{P}\_n \mathcal{P} \mathcal{B}\_l f\_n, \mathcal{P}\_n \mathcal{P} \mathcal{B}\_l f\_n \rangle\_2 \\ &\leq \langle \mathcal{P} \mathcal{B}\_l^2 f\_n, f\_n \rangle\_2 + \langle \mathcal{P} \mathcal{B}\_l f\_n, \mathcal{P} \mathcal{B}\_l f\_n \rangle\_2
\end{split} \tag{67}$$

which we again manipulate to give

$$
\begin{split}
(67) &= \langle A\mathcal{P}B\_{i}^{2}f\_{n}, Af\_{n}\rangle + \langle A\mathcal{P}B\_{i}f\_{n}, A\mathcal{P}B\_{i}f\_{n}\rangle \\ &= -\langle \mathcal{P}\Delta B\_{i}^{2}f\_{n}, Af\_{n}\rangle - \langle AB\_{i}f\_{n}, \mathcal{P}\Delta B\_{i}f\_{n}\rangle \\ &= -\langle \mathcal{P}[\Delta, B\_{i}]B\_{i}f\_{n} + \mathcal{P}B\_{i}\Delta B\_{i}f\_{n}, Af\_{n}\rangle - \langle AB\_{i}f\_{n}, \mathcal{P}[\Delta, B\_{i}]f\_{n} + \mathcal{P}B\_{i}\Delta f\_{n}\rangle \\ &= -\langle \mathcal{P}[\Delta, B\_{i}]B\_{i}f\_{n}, Af\_{n}\rangle + \langle \mathcal{P}B\_{i}AB\_{i}f\_{n}, Af\_{n}\rangle - \langle AB\_{i}f\_{n}, \mathcal{P}[\Delta, B\_{i}]f\_{n}\rangle \\ &\quad + \langle AB\_{i}f\_{n}, \mathcal{P}B\_{i}Af\_{n}\rangle \\ &= \langle B\_{i}AB\_{i}f\_{n}, Af\_{n}\rangle + \langle AB\_{i}f\_{n}, B\_{i}Af\_{n}\rangle - \langle \mathcal{P}[\Delta, B\_{i}]B\_{i}f\_{n}, Af\_{n}\rangle \\ &\quad - \langle AB\_{i}f\_{n}, \mathcal{P}[\Delta, B\_{i}]f\_{n}\rangle \\ &= \langle AB\_i f\_n, (B\_i + B\_i^\*) A f\_n \rangle - \langle \mathcal{P}[\Delta, B\_i] B\_i f\_n, A f\_n \rangle - \langle AB\_i f\_n, \mathcal{P}[\Delta, B\_i] f\_n \rangle \\ &= \langle AB\_i f\_n, (\mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^\*) A f\_n \rangle - \langle \mathcal{P}[\Delta, B\_i] B\_i f\_n, A f\_n \rangle - \langle AB\_i f\_n, \mathcal{P}[\Delta, B\_i] f\_n \rangle.
\end{split}
$$

We shall treat each term individually using Cauchy-Schwarz, Young's Inequality and Proposition 5.2 in the same manner as the proof of (27):

$$\begin{aligned} \langle AB\_i f\_n, (\mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^\*) Af\_n \rangle &\le \| AB\_i f\_n \| \| (\mathcal{T}\_{\xi\_i} + \mathcal{T}\_{\xi\_i}^\*) Af\_n \| \\ &\le c(\varepsilon) \| \xi\_i \|\_{W^{3,\infty}}^2 \| f\_n \|\_2^2 + \frac{\varepsilon}{3} \| \xi\_i \|\_{W^{3,\infty}}^2 \| f\_n \|\_3^2 \end{aligned}$$

as well as

$$\begin{aligned} -\langle \mathcal{P}[\Delta, B\_i]B\_i f\_n, Af\_n \rangle &\leq \|\mathcal{P}[\Delta, B\_i]B\_i f\_n\| \|Af\_n\| \\ &\leq c \|[\Delta, B\_i]B\_i f\_n\| \|f\_n\|\_2 \\ &\leq c \|\xi\_i\|\_{W^{3,\infty}} \|B\_i f\_n\|\_{W^{2,2}} \|f\_n\|\_2 \\ &\leq c \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_{W^{3,2}} \|f\_n\|\_2 \\ &\leq c(\varepsilon) \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_2^2 + \frac{\varepsilon}{3} \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_3^2 \end{aligned}$$

and finally

$$\begin{aligned} -\langle AB\_i f\_n, \mathcal{P}[\Delta, B\_i]f\_n \rangle &\leq \|AB\_i f\_n\| \|\mathcal{P}[\Delta, B\_i]f\_n\| \\ &\leq c \|B\_i f\_n\|\_{W^{2,2}} \|[\Delta, B\_i]f\_n\| \\ &\leq c \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_{W^{3,2}} \|f\_n\|\_{W^{2,2}} \\ &\leq c(\varepsilon) \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_2^2 + \frac{\varepsilon}{3} \|\xi\_i\|\_{W^{3,\infty}}^2 \|f\_n\|\_3^2. \end{aligned}$$

Summing these up completes the proof.

*Proof of Lemma 3.8* Observe that

$$\begin{aligned} \langle \mathcal{P} \mathcal{L}\_f f - \mathcal{P} \mathcal{L}\_g g, f - g \rangle\_1 &= \langle \mathcal{P} \mathcal{L}\_f f - \mathcal{P} \mathcal{L}\_g g, A(f - g) \rangle \\ &= \langle \mathcal{L}\_{f - g} f + \mathcal{L}\_g (f - g), A(f - g) \rangle \end{aligned}$$

and so it is sufficient to control the terms

$$\begin{aligned} \left| \langle \mathcal{L}\_{f-g} f, A(f-g) \rangle \right| &\leq \| \mathcal{L}\_{f-g} f \| \| A(f-g) \| \\ &\leq c(\varepsilon) \| f \|\_2^2 \| f - g \|\_1^2 + \frac{\varepsilon}{2} \| f - g \|\_2^2 \end{aligned}$$

and

$$\begin{aligned} \left| \langle \mathcal{L}\_{g}(f - g), A(f - g) \rangle \right| &\leq \| \mathcal{L}\_{g}(f - g) \| \| A(f - g) \| \\ &\leq c \| g \|\_1 \| f - g \|\_1^{1/2} \| f - g \|\_2^{1/2} \| f - g \|\_2 \\ &\leq c(\varepsilon) \| g \|\_1^4 \| f - g \|\_1^2 + \frac{\varepsilon}{2} \| f - g \|\_2^2 \end{aligned}$$

using (62) and Young's Inequality with conjugate exponents 4 and 4/3.
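The final inequality can be unpacked as follows. This is only a sketch: the auxiliary weight $\delta > 0$ and the shorthand $a, b$ are introduced here for illustration and do not appear in the original argument. Young's Inequality with conjugate exponents 4 and 4/3 reads

$$ab = \frac{a}{\delta}\,(\delta b) \leq \frac{1}{4\delta^{4}}\, a^{4} + \frac{3\delta^{4/3}}{4}\, b^{4/3}.$$

Taking $a = c\|g\|\_1 \|f - g\|\_1^{1/2}$ and $b = \|f - g\|\_2^{3/2}$ gives

$$c\|g\|\_1 \|f - g\|\_1^{1/2} \|f - g\|\_2^{3/2} \leq \frac{c^{4}}{4\delta^{4}} \|g\|\_1^{4} \|f - g\|\_1^{2} + \frac{3\delta^{4/3}}{4} \|f - g\|\_2^{2},$$

and choosing $\delta$ such that $3\delta^{4/3}/4 = \varepsilon/2$ recovers a bound of the stated form, with $c(\varepsilon) = c^{4}/(4\delta^{4})$.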

*Proof of Lemma 3.9* As in Lemma 3.8, we use the inequality

$$\left| \langle \mathcal{P} \mathcal{L}\_f f - \mathcal{P} \mathcal{L}\_g g, f - g \rangle \right| \leq \left| \langle \mathcal{L}\_{f-g} f, f - g \rangle \right| + \left| \langle \mathcal{L}\_g (f - g), f - g \rangle \right|.$$

For the first term, appealing to (4), observe that

$$\left| \langle \mathcal{L}\_{f-g}f, f-g \rangle \right| \leq \| \mathcal{L}\_{f-g}f \| \| f-g \| \leq c(\varepsilon) \| f \|\_{2}^{2} \| f-g \|^{2} + \varepsilon \| f-g \|\_{1}^{2}.$$

The second term is null due to (15), which concludes the proof.

*Proof of Lemma 4.4* We consider the proofs individually.


We note that property 3 can be shown for arbitrary $\alpha$ if we assume sufficient regularity for $\xi\_i$, as we did in Proposition 2.6. Property 1 is clear from the same structure $\mathcal{L}\_i^\* = -\mathcal{L}\_{\xi\_i} + \mathcal{Q}\_i^\*$ where $\mathcal{Q}\_i, \mathcal{Q}\_i^\*$ are of zeroth order. We calculate the commutator in property 2 explicitly, acting on $f \in W^{1,2}(\mathcal{O}; \mathbb{R}^3)$:

$$\begin{aligned} \mathcal{Q}\_i \mathcal{L}\_{\xi\_i} f &= \sum\_{k=1}^3 \left( \mathcal{L}\_{\xi\_i} f \right)^k \partial\_k \xi\_i = \sum\_{k=1}^3 \left( \sum\_{j=1}^3 \xi\_i^j \partial\_j f \right)^k \partial\_k \xi\_i \\ &= \sum\_{k=1}^3 \sum\_{j=1}^3 \xi\_i^j \partial\_j f^k \partial\_k \xi\_i \end{aligned}$$

and

$$\begin{aligned} \mathcal{L}\_{\xi\_i} \mathcal{Q}\_i f &= \sum\_{j=1}^3 \xi\_i^j \partial\_j \left( \mathcal{Q}\_i f \right) = \sum\_{j=1}^3 \xi\_i^j \partial\_j \left( \sum\_{k=1}^3 f^k \partial\_k \xi\_i \right) \\ &= \sum\_{j=1}^3 \sum\_{k=1}^3 \xi\_i^j \left( \partial\_j f^k \partial\_k \xi\_i + f^k \partial\_j \partial\_k \xi\_i \right) \end{aligned}$$

hence

$$[\mathcal{Q}\_i, \mathcal{L}\_{\xi\_i}]f = -\sum\_{j=1}^3 \sum\_{k=1}^3 \xi\_i^j f^k \partial\_j \partial\_k \xi\_i$$

which is of zeroth order. As for property 3, the term which needs to be addressed is $[\mathcal{L}\_{D^\alpha \xi\_i}, \mathcal{L}\_{\xi\_i}]$, which was already attended to in the original proof, so the result is concluded here.

Assumption (46): Comparing to the proof of (20), this is a consequence of the property (15) once more and the same boundedness of (18).

Assumption (47): Once more the critical term is $[\Delta, \mathcal{L}\_{\xi\_i}]$, which was addressed in the proof of Proposition 5.2.

*Proof of Lemma 4.6* Note that

$$\left| \langle \mathcal{P}\_n \mathcal{P} \mathcal{L}\_{BS\_K \phi\_n} \phi\_n, \phi\_n \rangle\_1 \right| = \left| \langle \mathcal{P}\_n \mathcal{P} \mathcal{L}\_{BS\_K \phi\_n} \phi\_n, A\phi\_n \rangle \right| \leq c(\varepsilon) \left\| \mathcal{L}\_{BS\_K \phi\_n} \phi\_n \right\|^2 + \varepsilon \left\| \phi\_n \right\|\_2^2$$

and

$$\left\| \mathcal{L}\_{BS\_K\phi\_n} \phi\_n \right\|^2 \le 2 \left( \left\| \mathcal{L}\_{BS\_K\phi\_n} \phi\_n \right\|^2 + \left\| \mathcal{L}\_{\phi\_n} B S\_K \phi\_n \right\|^2 \right)$$

so we look to control these two terms. Indeed,

$$\begin{split} \left\| \mathcal{L}\_{BS\_{K}\phi\_{n}}\phi\_{n} \right\|^{2} &\leq c \sum\_{j=1}^{3} \sum\_{k=1}^{3} \left\| BS\_{K}\phi\_{n}^{j} \right\|\_{L^{\infty}(\mathcal{O};\mathbb{R})}^{2} \left\| \partial\_{j}\phi\_{n}^{k} \right\|\_{L^{2}(\mathcal{O};\mathbb{R})}^{2} \\ &\leq c \sum\_{j=1}^{3} \sum\_{k=1}^{3} \left\| BS\_{K}\phi\_{n}^{j} \right\|\_{W^{2,2}(\mathcal{O};\mathbb{R})}^{2} \left\| \partial\_{j}\phi\_{n}^{k} \right\|\_{L^{2}(\mathcal{O};\mathbb{R})}^{2} \\ &\leq c \left\| BS\_{K}\phi\_{n} \right\|\_{W^{2,2}}^{2} \left\| \phi\_{n} \right\|\_{W^{1,2}}^{2} \\ &\leq c \left\| \phi\_{n} \right\|\_{W^{1,2}}^{4} \\ &\leq c \left\| \phi\_{n} \right\|\_{1}^{4} \end{split}$$


using the Sobolev Embedding $W^{2,2}(\mathcal{O}; \mathbb{R}) \hookrightarrow L^{\infty}(\mathcal{O}; \mathbb{R})$ and item (3) of Theorem 4.2. Likewise observe that

$$\begin{split} \|\mathcal{L}\_{\phi\_{n}} B S\_{K} \phi\_{n} \|^{2} &\leq c \sum\_{j=1}^{3} \sum\_{k=1}^{3} \|\phi\_{n}^{j}\|\_{L^{4}(\mathcal{O};\mathbb{R})}^{2} \|\partial\_{j} B S\_{K} \phi\_{n}^{k} \|\_{L^{4}(\mathcal{O};\mathbb{R})}^{2} \\ &\leq c \sum\_{j=1}^{3} \sum\_{k=1}^{3} \|\phi\_{n}^{j}\|\_{W^{1,2}(\mathcal{O};\mathbb{R})}^{2} \|\partial\_{j} B S\_{K} \phi\_{n}^{k} \|\_{W^{1,2}(\mathcal{O};\mathbb{R})}^{2} \\ &\leq c \|\phi\_{n}\|\_{W^{1,2}}^{2} \|B S\_{K} \phi\_{n} \|\_{W^{2,2}}^{2} \\ &\leq c \|\phi\_{n}\|\_{W^{1,2}}^{4} \\ &\leq c \|\phi\_{n}\|\_{1}^{4} .\end{split}$$

Summing these terms completes the proof.

*Proof of Lemma 4.7* We write out the left hand side in full:

$$\begin{aligned} &\left| \langle \mathcal{P} \mathcal{L}\_{BS\_K \phi} \phi - \mathcal{P} \mathcal{L}\_{BS\_K \psi} \psi, \phi - \psi \rangle \right| \\ &\quad = \left| \langle \mathcal{L}\_{BS\_K \phi} \phi - \mathcal{L}\_{\phi} BS\_K \phi - \mathcal{L}\_{BS\_K \psi} \psi + \mathcal{L}\_{\psi} BS\_K \psi, \phi - \psi \rangle \right| \\ &\quad = \left| \langle \mathcal{L}\_{BS\_K \phi - BS\_K \psi} \phi + \mathcal{L}\_{BS\_K \psi} (\phi - \psi) - \mathcal{L}\_{\phi - \psi} BS\_K \phi - \mathcal{L}\_{\psi} (BS\_K \phi - BS\_K \psi), \phi - \psi \rangle \right| \end{aligned}$$

from which we shall split up the terms and control them individually. Firstly,

$$\begin{aligned} \left| \langle \mathcal{L}\_{BS\_K\phi - BS\_K\psi}\phi, \phi - \psi \rangle \right| &\leq \| \mathcal{L}\_{BS\_K\phi - BS\_K\psi}\phi \| \| \phi - \psi \| \\ &\leq c \| BS\_K(\phi - \psi) \| \_2 \| \phi \| \_1 \| \phi - \psi \| \\ &\leq c \| \phi - \psi \| \_1 \| \phi \| \_1 \| \phi - \psi \| \\ &\leq c(\varepsilon) \| \phi \| \_1^2 \| \phi - \psi \| ^2 + \frac{\varepsilon}{3} \| \phi - \psi \| \_1^2 \end{aligned}$$

using (4) and that $[BS\_K\phi - BS\_K\psi](x) = \int\_{\mathcal{O}} K(x, y)[\phi - \psi](y)\,dy$ is the solution specified by $BS\_K(\phi - \psi)$ in Theorem 4.2 for $\phi - \psi$. Even more directly we have that

$$\langle \mathcal{L}\_{BS\_K\psi}(\phi - \psi), \phi - \psi \rangle = 0$$

owing to (15), and for the final two terms the bounds

$$\begin{aligned} \left| \langle \mathcal{L}\_{\phi - \psi} B S\_K \phi, \phi - \psi \rangle \right| &\leq c \| \phi - \psi \|\_1 \| B S\_K \phi \|\_2 \| \phi - \psi \| \\ &\leq c(\varepsilon) \|\phi\|\_1^2 \|\phi - \psi\|^2 + \frac{\varepsilon}{3} \|\phi - \psi\|\_1^2 \end{aligned}$$

and

$$\begin{aligned} \left| \langle \mathcal{L}\_{\psi} (BS\_K \phi - BS\_K \psi), \phi - \psi \rangle \right| &\leq c \| \psi \|\_1 \| BS\_K \phi - BS\_K \psi \|\_2 \| \phi - \psi \| \\ &\leq c (\varepsilon) \| \psi \|\_1^2 \| \phi - \psi \|^2 + \frac{\varepsilon}{3} \| \phi - \psi \|\_1^2 . \end{aligned}$$

Summing these terms concludes the proof.

## *5.2 A Conversion from Stratonovich to Itô*

This theory is taken from [30], Subsections 2.2 and 2.3, and is reproduced here for ease of application in Sect. 3.3. We work with a quartet of embedded Hilbert Spaces

$$V \hookrightarrow H \hookrightarrow U \hookrightarrow X$$

where each embedding is assumed to be a continuous linear injection. We start from an SPDE

$$
\Psi\_t = \Psi\_0 + \int\_0^t \mathcal{Q}\Psi\_s ds + \int\_0^t \mathcal{G}\Psi\_s \circ d\mathcal{W}\_s \tag{68}
$$

where the mappings $\mathcal{Q}$, $\mathcal{G}$ satisfy the following conditions, with the general operator $\tilde{K} : H \to \mathbb{R}$ defined by

$$\tilde{K}(\phi) := c \left( 1 + \|\phi\|\_{U}^p + \|\phi\|\_{H}^q \right)$$

for any constants $c, p, q$ independent of $\phi$.

**Assumption 5.3** Q : *V* → *U is measurable and for any φ* ∈ *V ,*

$$\|\mathcal{Q}\phi\|\_{U} \leq \tilde{K}(\phi)[1 + \|\phi\|\_{V}^{2}].$$

**Assumption 5.4** G *is understood as a measurable mapping*

$$\mathcal{G} : V \to \mathscr{L}^2(\mathfrak{U}; H), \quad \mathcal{G} : H \to \mathscr{L}^2(\mathfrak{U}; U), \quad \mathcal{G} : U \to \mathscr{L}^2(\mathfrak{U}; X)$$

*defined over $\mathfrak{U}$ by its action on the basis vectors*

$$
\mathcal{G}(\cdot, e\_i) := \mathcal{G}\_i(\cdot) \,.
$$

*In addition each $\mathcal{G}\_i$ is linear and there exist constants $c\_i$ such that for all $\phi \in V$, $\psi \in H$, $\eta \in U$:*

$$\|\mathcal{G}\_i\phi\|\_H \le c\_i \|\phi\|\_V, \quad \|\mathcal{G}\_i\psi\|\_U \le c\_i \|\psi\|\_H, \quad \|\mathcal{G}\_i\eta\|\_X \le c\_i \|\eta\|\_U, \quad \sum\_{i=1}^\infty c\_i^2 < \infty.$$

In this setting, we have the following result ([30] Theorem 2.3.1 and Corollary 2.3.1.1).

**Theorem 5.5** *Suppose that $(\Psi, \tau)$ are such that: $\tau$ is a $\mathbb{P}$-a.s. positive stopping time and $\Psi$ is a process whereby for $\mathbb{P}$-a.e. $\omega$, $\Psi\_\cdot(\omega) \in C([0, T]; H)$ and $\Psi\_\cdot(\omega)\mathbf{1}\_{\cdot \leq \tau(\omega)} \in L^2([0, T]; V)$ for all $T > 0$ with $\Psi\_\cdot \mathbf{1}\_{\cdot \leq \tau}$ progressively measurable in $V$, and moreover satisfy the identity*

$$\Psi\_t = \Psi\_0 + \int\_0^{t \wedge \tau} \left( \mathcal{Q} + \frac{1}{2} \sum\_{i=1}^{\infty} \mathcal{G}\_i^2 \right) \Psi\_s ds + \int\_0^{t \wedge \tau} \mathcal{G} \Psi\_s d\mathcal{W}\_s$$

*$\mathbb{P}$-a.s. in $U$ for all $t \geq 0$. Then the pair $(\Psi, \tau)$ satisfies the identity*

$$
\Psi\_t = \Psi\_0 + \int\_0^{t \wedge \tau} \mathcal{Q}\Psi\_s ds + \int\_0^{t \wedge \tau} \mathcal{G}\Psi\_s \circ d\mathcal{W}\_s
$$

*$\mathbb{P}$-a.s. in $X$ for all $t \geq 0$.*

The mapping $\frac{1}{2} \sum\_{i=1}^{\infty} \mathcal{G}\_i^2$ is understood as a pointwise limit, which is justified in [30] Subsection 2.3.

**Remark 6** Practically, Theorem 5.5 provides an Itô equation from a Stratonovich one in the sense that solving this Itô equation is sufficient to satisfy the identity in Stratonovich form. To discuss an equivalence between the equations we would need to formally define a solution of the Stratonovich equation, which we do not do here. To make sense of the Stratonovich integral one would have to impose that the solution is a local semimartingale in *U*, in which case the two notions are genuinely equivalent.
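To see where the correction term in Theorem 5.5 comes from, it may help to record the formal computation behind it. The following is a heuristic sketch only, under assumptions that are not made in the text above: that $\mathcal{W}$ has independent standard Brownian components $W^i$, that $\mathcal{G}\_i \Psi$ is itself a semimartingale, and that $[\cdot, \cdot]$ denotes the cross-variation. By the definition of the Stratonovich integral,

$$\int\_0^t \mathcal{G}\_i \Psi\_s \circ dW^i\_s = \int\_0^t \mathcal{G}\_i \Psi\_s \, dW^i\_s + \frac{1}{2} [\mathcal{G}\_i \Psi, W^i]\_t.$$

Since $\mathcal{G}\_i$ is linear, the martingale part of $\mathcal{G}\_i \Psi$ is formally $\sum\_j \int\_0^\cdot \mathcal{G}\_i \mathcal{G}\_j \Psi\_s \, dW^j\_s$, and independence of the $W^j$ leaves only the $j = i$ contribution:

$$[\mathcal{G}\_i \Psi, W^i]\_t = \int\_0^t \mathcal{G}\_i^2 \Psi\_s \, ds, \qquad \text{so that} \qquad \int\_0^t \mathcal{G} \Psi\_s \circ d\mathcal{W}\_s = \int\_0^t \mathcal{G} \Psi\_s \, d\mathcal{W}\_s + \frac{1}{2} \sum\_{i=1}^{\infty} \int\_0^t \mathcal{G}\_i^2 \Psi\_s \, ds.$$

In the scalar toy case $\mathcal{Q} = 0$, $\mathcal{G}\_1 \phi = \sigma \phi$ with a single Brownian motion $W$ and a constant $\sigma$ (all hypothetical choices), this reduces to the familiar statement that $d\Psi\_t = \sigma \Psi\_t \circ dW\_t$ corresponds to $d\Psi\_t = \frac{1}{2}\sigma^2 \Psi\_t \, dt + \sigma \Psi\_t \, dW\_t$, both solved by $\Psi\_t = \Psi\_0 e^{\sigma W\_t}$.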

## *5.3 Abstract Solution Criterion I*

The result is given in the context of an Itô SPDE

$$
\Psi\_t = \Psi\_0 + \int\_0^t \mathcal{A}(s, \Psi\_s) ds + \int\_0^t \mathcal{G}(s, \Psi\_s) d\mathcal{W}\_s. \tag{69}
$$

We state the assumptions for a triplet of embedded Hilbert Spaces

$$V \hookrightarrow H \hookrightarrow U$$

and ask that there is a continuous bilinear form $\langle \cdot, \cdot \rangle\_{U \times V} : U \times V \to \mathbb{R}$ such that for $\phi \in H$ and $\psi \in V$,

$$
\langle \phi, \psi \rangle\_{U \times V} = \langle \phi, \psi \rangle\_{H}. \tag{70}
$$

The mappings $\mathcal{A}, \mathcal{G}$ are such that for any $T > 0$, $\mathcal{A} : [0, T] \times V \to U$ and $\mathcal{G} : [0, T] \times V \to \mathscr{L}^2(\mathfrak{U}; H)$ are measurable. We assume that $V$ is dense in $H$, which is dense in $U$.

**Assumption 5.6** *There exists a system $(a\_n)$ of elements of $V$ such that, defining the spaces $V\_n := \mathrm{span}\{a\_1, \ldots, a\_n\}$ and $\mathcal{P}\_n$ as the orthogonal projection to $V\_n$ in $U$, then:*

*1. There exists some constant c independent of n such that for all φ* ∈ *H,*

$$\|\mathcal{P}\_n\phi\|\_H^2 \le c \|\phi\|\_H^2. \tag{71}$$

*2. There exists a real valued sequence $(\mu\_n)$ with $\mu\_n \to \infty$ such that for any $\phi \in H$,*

$$\|(I - \mathcal{P}\_n)\phi\|\_U \le \frac{1}{\mu\_n} \|\phi\|\_H \tag{72}$$

*where I represents the identity operator in U.*
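A standard setting in which both items hold is recorded below, purely as an illustration and not as the choice made in the text. Assume, hypothetically, that $(a\_k)$ is an orthonormal basis of $U$ made of eigenfunctions of a positive self-adjoint operator $A$, with $A a\_k = \lambda\_k a\_k$ and $0 < \lambda\_1 \leq \lambda\_2 \leq \cdots \to \infty$, and that the norm on $H$ is given by $\|\phi\|\_H^2 = \sum\_k \lambda\_k \langle \phi, a\_k \rangle\_U^2$. Then

$$\|\mathcal{P}\_n \phi\|\_H^2 = \sum\_{k \leq n} \lambda\_k \langle \phi, a\_k \rangle\_U^2 \leq \|\phi\|\_H^2, \qquad \|(I - \mathcal{P}\_n)\phi\|\_U^2 = \sum\_{k > n} \langle \phi, a\_k \rangle\_U^2 \leq \frac{1}{\lambda\_{n+1}} \sum\_{k > n} \lambda\_k \langle \phi, a\_k \rangle\_U^2 \leq \frac{1}{\lambda\_{n+1}} \|\phi\|\_H^2,$$

so that (71) holds with $c = 1$ and (72) holds with $\mu\_n = \lambda\_{n+1}^{1/2}$.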

These conditions are of course supplemented by a series of assumptions on the mappings. We shall use general notation $c\_t$ to represent a function $c\_\cdot : [0, \infty) \to \mathbb{R}$ bounded on $[0, T]$ for any $T > 0$, evaluated at the time $t$. Moreover we define functions $K$, $\tilde{K}$ relative to some non-negative constants $p, \tilde{p}, q, \tilde{q}$. We use a generic notation to define the functions $K : U \to \mathbb{R}$, $K : U \times U \to \mathbb{R}$, $\tilde{K} : H \to \mathbb{R}$ and $\tilde{K} : H \times H \to \mathbb{R}$ by

$$\begin{aligned} K(\phi) &:= 1 + \|\phi\|\_{U}^{p}, \qquad K(\phi, \psi) := 1 + \|\phi\|\_{U}^{p} + \|\psi\|\_{U}^{q}, \\ \tilde{K}(\phi) &:= K(\phi) + \|\phi\|\_{H}^{\tilde{p}}, \qquad \tilde{K}(\phi, \psi) := K(\phi, \psi) + \|\phi\|\_{H}^{\tilde{p}} + \|\psi\|\_{H}^{\tilde{q}} \end{aligned}$$

Distinct uses of the function $K$ will depend on different constants but in no meaningful way in our applications, hence no explicit reference to them shall be made. In the case of $\tilde{K}$, when $\tilde{p}, \tilde{q} = 2$ we shall denote the general $\tilde{K}$ by $\tilde{K}\_2$. In this case no further assumptions are made on $p, q$. That is, $\tilde{K}\_2$ has the general representation

$$\tilde{K}\_2(\phi, \psi) = K(\phi, \psi) + \|\phi\|\_{H}^2 + \|\psi\|\_{H}^2 \tag{73}$$

and similarly as a function of one variable.

We state the subsequent assumptions for arbitrary elements $\phi, \psi \in V$, $\phi^n \in V\_n$, $\eta \in H$ and $t \in [0, \infty)$, and a fixed $\kappa > 0$. Understanding $\mathcal{G}$ as a mapping $\mathcal{G} : [0, \infty) \times V \times \mathfrak{U} \to H$, we introduce the notation $\mathcal{G}\_i(\cdot, \cdot) := \mathcal{G}(\cdot, \cdot, e\_i)$.

**Assumption 5.7**

$$\|\mathcal{A}(t,\phi)\|\_{U}^{2} + \sum\_{i=1}^{\infty} \|\mathcal{G}\_{i}(t,\phi)\|\_{H}^{2} \le c\_{t}K(\phi)\left[1 + \|\phi\|\_{V}^{2}\right],\tag{74}$$

$$\|\mathcal{A}(t,\phi) - \mathcal{A}(t,\psi)\|\_{U}^{2} \le c\_{t}\left[K(\phi,\psi) + \|\phi\|\_{V}^{2} + \|\psi\|\_{V}^{2}\right] \|\phi - \psi\|\_{V}^{2},\tag{75}$$

$$\sum\_{i=1}^{\infty} \|\mathcal{G}\_i(t, \phi) - \mathcal{G}\_i(t, \psi)\|\_U^2 \le c\_t K(\phi, \psi) \|\phi - \psi\|\_H^2. \tag{76}$$

**Assumption 5.8**

$$2\langle\mathcal{P}\_{n}\mathcal{A}(t,\phi^{n}),\phi^{n}\rangle\_{H} + \sum\_{i=1}^{\infty} \|\mathcal{P}\_{n}\mathcal{G}\_{i}(t,\phi^{n})\|\_{H}^{2} \leq c\_{t}\tilde{K}\_{2}(\phi^{n})\left[1 + \|\phi^{n}\|\_{H}^{2}\right] - \kappa\|\phi^{n}\|\_{V}^{2},\tag{77}$$

$$\sum\_{i=1}^{\infty} \langle \mathcal{P}\_n \mathcal{G}\_i(t, \phi^n), \phi^n \rangle\_H^2 \le c\_t \tilde{K}\_2(\phi^n) \left[ 1 + \|\phi^n\|\_H^4 \right]. \tag{78}$$

**Assumption 5.9**

$$\begin{split} 2\langle \mathcal{A}(t,\phi) - \mathcal{A}(t,\psi), \phi - \psi \rangle\_{U} &+ \sum\_{i=1}^{\infty} \| \mathcal{G}\_{i}(t,\phi) - \mathcal{G}\_{i}(t,\psi) \|\_{U}^{2} \\ &\leq c\_{t} \tilde{K}\_{2}(\phi, \psi) \| \phi - \psi \|\_{U}^{2} - \kappa \| \phi - \psi \|\_{H}^{2}, \end{split} \tag{79}$$

$$\sum\_{i=1}^{\infty} \langle \mathcal{G}\_i(t, \phi) - \mathcal{G}\_i(t, \psi), \phi - \psi \rangle\_U^2 \le c\_t \tilde{K}\_2(\phi, \psi) \|\phi - \psi\|\_U^4. \tag{80}$$

**Assumption 5.10**

$$2\langle \mathcal{A}(t,\phi),\phi\rangle\_U + \sum\_{i=1}^{\infty} \|\mathcal{G}\_i(t,\phi)\|\_U^2 \le c\_t K(\phi) \left[1 + \|\phi\|\_H^2\right],\tag{81}$$

$$\sum\_{i=1}^{\infty} \langle \mathcal{G}\_i(t, \phi), \phi \rangle\_U^2 \le c\_t K(\phi) \left[ 1 + \|\phi\|\_H^4 \right]. \tag{82}$$

**Assumption 5.11**

$$\langle \mathcal{A}(t, \phi) - \mathcal{A}(t, \psi), \eta \rangle\_U \le c\_t (1 + \|\eta\|\_H) \left[ K(\phi, \psi) + \|\phi\|\_V + \|\psi\|\_V \right] \|\phi - \psi\|\_H. \tag{83}$$

With these assumptions in place we state the relevant definitions and results, first announced in [31] and proven in [32]. Definition 5.12 is stated for an $\mathcal{F}\_0$-measurable $\Psi\_0 : \Omega \to H$.

**Definition 5.12** A pair $(\Psi, \tau)$ where $\tau$ is a $\mathbb{P}$-a.s. positive stopping time and $\Psi$ is a process such that for $\mathbb{P}$-a.e. $\omega$, $\Psi\_\cdot(\omega) \in C([0, T]; H)$ and $\Psi\_\cdot(\omega)\mathbf{1}\_{\cdot \leq \tau(\omega)} \in L^2([0, T]; V)$ for all $T > 0$ with $\Psi\_\cdot \mathbf{1}\_{\cdot \leq \tau}$ progressively measurable in $V$, is said to be an $H$-valued local strong solution of the Eq. (69) if the identity

$$\Psi\_t = \Psi\_0 + \int\_0^{t \wedge \tau} \mathcal{A}(s, \Psi\_s) ds + \int\_0^{t \wedge \tau} \mathcal{G}(s, \Psi\_s) d\mathcal{W}\_s \tag{84}$$

holds $\mathbb{P}$-a.s. in $U$ for all $t \geq 0$.

**Definition 5.13** A pair $(\Psi, \Theta)$ such that there exists a sequence of stopping times $(\theta\_j)$ which are $\mathbb{P}$-a.s. monotone increasing and convergent to $\Theta$, whereby $(\Psi\_{\cdot \wedge \theta\_j}, \theta\_j)$ is an $H$-valued local strong solution of the Eq. (69) for each $j$, is said to be an $H$-valued maximal strong solution of the Eq. (69) if for any other pair $(\Phi, \Gamma)$ with this property, $\Theta \leq \Gamma$ $\mathbb{P}$-a.s. implies $\Theta = \Gamma$ $\mathbb{P}$-a.s.

**Definition 5.14** An $H$-valued maximal strong solution $(\Psi, \Theta)$ of the Eq. (69) is said to be unique if for any other such solution $(\Phi, \Gamma)$, we have $\Theta = \Gamma$ $\mathbb{P}$-a.s. and for all $t \in [0, \Theta)$,

$$\mathbb{P}\left(\{\omega \in \Omega : \Psi\_t(\omega) = \Phi\_t(\omega)\}\right) = 1.$$

We now state the main theorem in this setting.

**Theorem 5.15** *Suppose that Assumptions 5.6–5.11 are satisfied in this framework. Then for any given $\mathcal{F}\_0$-measurable $\Psi\_0 : \Omega \to H$, there exists a unique $H$-valued maximal strong solution $(\Psi, \Theta)$ of the Eq. (69). Moreover at $\mathbb{P}$-a.e. $\omega$ for which $\Theta(\omega) < \infty$, we have that*

$$\sup\_{r \in [0, \Theta(\omega))} \|\Psi\_r(\omega)\|\_H^2 + \int\_0^{\Theta(\omega)} \|\Psi\_r(\omega)\|\_V^2 dr = \infty. \tag{85}$$

*Proof* See [32] Theorem 3.15.

## *5.4 Abstract Solution Criterion II*

We extend the framework of Sect. 5.3, introducing now another Hilbert Space $X$ which is such that $U \hookrightarrow X$. We ask that there is a continuous bilinear form $\langle \cdot, \cdot \rangle\_{X \times H} : X \times H \to \mathbb{R}$ such that for $\phi \in U$ and $\psi \in H$,

$$
\langle \phi, \psi \rangle\_{X \times H} = \langle \phi, \psi \rangle\_U. \tag{86}
$$

Moreover it is now necessary that the system $(a\_n)$ from Assumption 5.6 forms an orthogonal basis of $U$. We state the remaining assumptions now for arbitrary elements $\phi, \psi \in H$ and $t \in [0, \infty)$, and continue to use the $c, K, \tilde{K}, \kappa$ notation of Sect. 5.3. We now further assume that for any $T > 0$, $\mathcal{A} : [0, T] \times H \to X$ and $\mathcal{G} : [0, T] \times H \to \mathscr{L}^2(\mathfrak{U}; U)$ are measurable.

**Assumption 5.16**

$$\|\mathcal{A}(t,\phi)\|\_{X}^{2} + \sum\_{i=1}^{\infty} \|\mathcal{G}\_{i}(t,\phi)\|\_{U}^{2} \le c\_{t}K(\phi)\left[1 + \|\phi\|\_{H}^{2}\right],\tag{87}$$

$$\|\mathcal{A}(t,\phi) - \mathcal{A}(t,\psi)\|\_{X} \le c\_{t}\left[K(\phi,\psi) + \|\phi\|\_{H} + \|\psi\|\_{H}\right]\|\phi - \psi\|\_{H}.\tag{88}$$

**Assumption 5.17**

$$\begin{split} 2\langle \mathcal{A}(t,\phi) - \mathcal{A}(t,\psi), \phi - \psi \rangle\_{X} &+ \sum\_{i=1}^{\infty} \| \mathcal{G}\_{i}(t,\phi) - \mathcal{G}\_{i}(t,\psi) \|\_{X}^{2} \\ &\leq c\_{t} \tilde{K}\_{2}(\phi,\psi) \| \phi - \psi \|\_{X}^{2}, \end{split} \tag{89}$$

$$\sum\_{i=1}^{\infty} \langle \mathcal{G}\_i(t, \phi) - \mathcal{G}\_i(t, \psi), \phi - \psi \rangle\_X^2 \le c\_t \tilde{K}\_2(\phi, \psi) \|\phi - \psi\|\_X^4. \tag{90}$$

**Assumption 5.18** *For every φ* ∈ *V , it holds that*

$$2\langle \mathcal{A}(t,\phi),\phi\rangle\_U + \sum\_{i=1}^{\infty} \|\mathcal{G}\_i(t,\phi)\|\_U^2 \le c\_t K(\phi) - \kappa \|\phi\|\_H^2,\tag{91}$$

$$\sum\_{i=1}^{\infty} \langle \mathcal{G}\_i(t, \phi), \phi \rangle\_U^2 \le c\_t K(\phi). \tag{92}$$

**Remark 7** Note that Assumption 5.18 is stronger than Assumption 5.10, as we are bounding the same terms but we are not afforded a control in the *H* norm of *φ* in addition to its *U* norm. Thus in applying Theorem 5.22 it is sufficient to only demonstrate Assumption 5.18.
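The implication in Remark 7 can be made explicit. As a sketch, assume that the generic function satisfies $c\_t \geq 0$ (this is not stated explicitly above) and recall that $K(\phi) \geq 1$ and $\kappa > 0$. Then the right hand sides of (91) and (92) are dominated by those of (81) and (82):

$$c\_t K(\phi) - \kappa \|\phi\|\_H^2 \leq c\_t K(\phi) \leq c\_t K(\phi)\left[1 + \|\phi\|\_H^2\right], \qquad c\_t K(\phi) \leq c\_t K(\phi)\left[1 + \|\phi\|\_H^4\right].$$

Since the left hand sides of (91), (92) coincide with those of (81), (82), any pair $\mathcal{A}, \mathcal{G}$ satisfying Assumption 5.18 also satisfies Assumption 5.10.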

Analogously to Subsection 5.3, we state the relevant definitions and the resulting theorem in this context (again proved in [32]). Definition 5.19 is stated for an $\mathcal{F}\_0$-measurable $\Psi\_0 : \Omega \to U$.

**Definition 5.19** A pair $(\Psi, \tau)$ where $\tau$ is a $\mathbb{P}$-a.s. positive stopping time and $\Psi$ is a process such that for $\mathbb{P}$-a.e. $\omega$, $\Psi\_\cdot(\omega) \in C([0, T]; U)$ and $\Psi\_\cdot(\omega)\mathbf{1}\_{\cdot \leq \tau(\omega)} \in L^2([0, T]; H)$ for all $T > 0$ with $\Psi\_\cdot \mathbf{1}\_{\cdot \leq \tau}$ progressively measurable in $H$, is said to be a $U$-valued local strong solution of the Eq. (69) if the identity

$$\Psi\_t = \Psi\_0 + \int\_0^{t \wedge \tau} \mathcal{A}(s, \Psi\_s) ds + \int\_0^{t \wedge \tau} \mathcal{G}(s, \Psi\_s) d\mathcal{W}\_s \tag{93}$$

holds $\mathbb{P}$-a.s. in $X$ for all $t \geq 0$.

**Definition 5.20** A pair $(\Psi, \Theta)$ such that there exists a sequence of stopping times $(\theta\_j)$ which are $\mathbb{P}$-a.s. monotone increasing and convergent to $\Theta$, whereby $(\Psi\_{\cdot \wedge \theta\_j}, \theta\_j)$ is a $U$-valued local strong solution of the Eq. (69) for each $j$, is said to be a $U$-valued maximal strong solution of the Eq. (69) if for any other pair $(\Phi, \Gamma)$ with this property, $\Theta \leq \Gamma$ $\mathbb{P}$-a.s. implies $\Theta = \Gamma$ $\mathbb{P}$-a.s.

**Definition 5.21** A $U$-valued maximal strong solution $(\Psi, \Theta)$ of the Eq. (69) is said to be unique if for any other such solution $(\Phi, \Gamma)$, we have $\Theta = \Gamma$ $\mathbb{P}$-a.s. and for all $t \in [0, \Theta)$,

$$\mathbb{P}\left(\{\omega \in \Omega : \Psi\_t(\omega) = \Phi\_t(\omega)\}\right) = 1.$$

**Theorem 5.22** *Suppose that Assumptions 5.6–5.11 and 5.16–5.18 are satisfied in this framework. Then for any given $\mathcal{F}\_0$-measurable $\Psi\_0 : \Omega \to U$, there exists a unique $U$-valued maximal strong solution $(\Psi, \Theta)$ of the Eq. (69). Moreover at $\mathbb{P}$-a.e. $\omega$ for which $\Theta(\omega) < \infty$, we have that*

$$\sup\_{r \in [0, \Theta(\omega))} \left\| \Psi\_r(\omega) \right\|\_U^2 + \int\_0^{\Theta(\omega)} \left\| \Psi\_r(\omega) \right\|\_H^2 dr = \infty. \tag{94}$$

*Proof* See [32] Theorem 4.9.

**Acknowledgments** Daniel Goodair was supported by the Engineering and Physical Sciences Research Council (EPSRC) Project 2478902. Dan Crisan was partially supported by the European Research Council (ERC) under the European Union's Horizon 2020 Research and Innovation Programme (ERC, Grant Agreement No 856408).

We would like to express our sincere gratitude to the anonymous reviewers of this paper, whose detailed and insightful comments have greatly contributed to this publication.


# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **On the Interactions Between Mean Flows and Inertial Gravity Waves in the WKB Approximation**

**Darryl D. Holm, Ruiao Hu, and Oliver D. Street**

**Abstract** We derive a Wentzel–Kramers–Brillouin (WKB) closure of the generalised Lagrangian mean (GLM) theory by using a phase-averaged Hamilton variational principle for the Euler–Boussinesq (EB) equations. Following Gjaja and Holm 1996, we consider 3D inertial gravity waves (IGWs) in the EB approximation. The GLM closure for WKB IGWs expresses EB wave mean flow interaction (WMFI) as WKB wave motion boosted into the reference frame of the EB equations for the Lagrangian mean transport velocity. We provide both deterministic and stochastic closure models for GLM IGWs at leading order in 3D complex vector WKB wave asymptotics. This paper brings the Gjaja and Holm 1996 paper at leading order in wave amplitude asymptotics into an easily understood short form and proposes a stochastic generalisation of the WMFI equations for IGWs.

# **1 Introduction**

Inertial gravity waves (IGWs), also known as internal waves, comprise a classical form of wave disturbances in fluid motions under gravity that propagate in three-dimensional stratified, rotating, incompressible fluid and involve nonlinear dynamics among inertia, buoyancy, pressure gradients and Coriolis forces [23, 17, 24].

**Satellite Images and Field Data** Satellite Synthetic Aperture Radar (SAR) is a powerful sensor for ocean remote sensing, because of its continuous capabilities and high spatial resolution. The spatial resolution of the state-of-the-art satellite SAR images reaches 20–30 m, and the swath width reaches 100–450 km. Figure 1 shows a typical representation of the range of SAR field data and Fig. 2 shows a typical SAR image of IGWs on the ocean surface.

D. D. Holm · R. Hu · O. D. Street (✉)

Department of Mathematics, Imperial College London, London, UK e-mail: d.holm@imperial.ac.uk; ruiao.hu15@imperial.ac.uk; o.street18@imperial.ac.uk

B. Chapron et al. (eds.), *Stochastic Transport in Upper Ocean Dynamics II*, Mathematics of Planet Earth 11, https://doi.org/10.1007/978-3-031-40094-0\_5

**Fig. 1** The distribution of observed IGW packets and bathymetry in the South China Sea courtesy of [25]. Bold lines represent crest lines of leading waves in IGW packets interpreted from SAR images. The rectangular box on the right of this figure outlines the IGW generation source region. Looking closely near the center of this figure, one sees the crescent shape of the Dongsha atoll whose diameter is 25 km. Details of SAR images of waves near Dongsha atoll are shown in Fig. 2

**Theoretical Basis of the Present Work** The paper [5] derived a hierarchy of approximate models of wave mean-flow interaction (WMFI) for IGWs by using asymptotic expansions and phase averages. Two different derivations of the same WMFI IGW equations were given. The first derivation was based on Fourier projections of the Euler–Boussinesq equations for a stratified rotating inviscid incompressible fluid. The second derivation was based on Hamilton's principle for these equations. Two small dimensionless parameters were used in the asymptotic expansions. One small parameter was the ratio of time scales between internal waves at most wavenumbers and the mesoscale mean flow of the fluid. This "adiabatic ratio" is small and is comparable to the corresponding ratio of space scales for the class of initial conditions that support internal waves. The other small parameter used in the asymptotic expansions was the ratio of the amplitude of the internal wave to its wavelength. An application of Noether's theorem to the phase-averaged Hamilton's principle showed that the resulting equations conserve the wave action, convect a potential vorticity and can, depending on the order of approximation, convect wave angular momentum. Legendre transforming from the phase-averaged Hamilton's principle to the Hamiltonian formulation brought the WMFI theory into the Lie-Poisson framework in which formal and nonlinear stability analysis methods are available [15]. The Hamiltonian framework also revealed an analogy between the two-fluid model of the interaction of waves and mean flow and the interaction of the superfluid and normal fluid components of liquid He<sup>4</sup> without vortices. The relations to similar results for the Charney-Drazin non-acceleration theorem, Whitham averaging, WKB stability theory, Craik-Leibovich theory of Langmuir circulations as well as the generalised Lagrangian-mean (GLM) fluid equations for prescribed wave displacements were also discussed in [5].

**Goal of the Present Work** Our goal here is to derive 3D IGW equations in the class of wave mean flow interaction (WMFI) derived in [5] as a mutual interaction of the mean fluid flow and the slowly varying envelope of fluctuation dynamics that is consistent with IGWs in the full 3D Euler–Boussinesq fluid flow. Physically, we take nonhydrostatic pressure effects on the wave dispersion relation into account and derive consistent nonlinear feedback effects of the internal waves on the generation of fluid circulation based on a dynamic version of the well-known Craik-Leibovich theory of Langmuir circulation [3]. Mathematically, we introduce the two WMFI degrees of freedom by factorising the full 3D Euler–Boussinesq flow map into the composition of two smooth invertible maps in Hamilton's principle for Eulerian fluid dynamics [14].

The present work derives 3D equations for wave mean flow interaction (WMFI) as WKB wave motion boosted into the reference frame of the fluid equations for the Lagrangian mean transport velocity. The final equations derived here are consistent with traditional approaches such as Craik-Leibovich (CL) theory [3] except that the Stokes drift velocity in the CL formulation has its own dynamics in the present formulation. The present formulation can also be considered as a WKB closure for the GLM approach [1], similar also to the oscillation centre ponderomotive closure in magnetohydrodynamics [18, 19]. Namely, the present formulation uses a combination of asymptotic expansion and phase resonance to close the GLM equations derived by the composition-of-maps approach and to obtain explicit formulas for the wave polarisation parameters and the dispersion relation for the Doppler-shifted frequency.

Finally, the present work also formulates stochastic equations of motion for 3D WMFI dynamics, permitting a statistical representation of the uncertainty present in observational data of geophysical flows.

# **2 Deterministic 3D Euler–Boussinesq (EB) Internal Gravity Waves**

# *2.1 Lagrangian Formulation of the WMFI Equations at Leading Order*

**GLM Theory** The Generalised Lagrangian Mean (GLM) theory of wave mean flow interaction (WMFI) is derived in Andrews and McIntyre [1] by taking the time mean $\overline{(\,\cdot\,)}$ at a fixed position $\mathbf{x}$ of the Eulerian fluid velocity, $U(\mathbf{x},t)$, shifted to a rapidly fluctuating position, $\mathbf{x}^{\xi} := \mathbf{x} + \alpha\xi(\mathbf{x},t)$ with constant scale factor $\alpha \ll 1$ and zero Eulerian mean $\overline{\xi(\mathbf{x},t)} = 0$. The Lagrangian mean velocity $\mathbf{u}^{L}(\mathbf{x},t)$ at Eulerian position $\mathbf{x}$ is then defined via the following calculation,

$$\begin{split} U(\mathbf{x}^{\xi},t) &:= U(\mathbf{x} + \alpha\xi(\mathbf{x},t),t) = \mathbf{u}^{L}(\mathbf{x},t) + \alpha\frac{d}{dt}\xi(\mathbf{x},t) \\ \text{where} \quad &\overline{U(\mathbf{x} + \alpha\xi(\mathbf{x},t),t)} =: \mathbf{u}^{L}(\mathbf{x},t) \,, \end{split} \tag{2.1}$$

with

$$\frac{d}{dt}\xi(\mathbf{x},t) = \partial\_t \xi + (\mathbf{u}^L \cdot \nabla)\xi =: \mathbf{u}^\ell, \quad \overline{\mathbf{u}^\ell} = 0 \quad \text{and} \quad \overline{\mathbf{u}^L} = \mathbf{u}^L. \tag{2.2}$$

Consequently, the Kelvin circulation integral for GLM in a rotating frame with constant Coriolis parameter $2\mathbf{\Omega}$ may be derived; see, e.g., [1, 5, 6, 7, 9, 10] and the appendix for details,

$$\begin{split} I\_{GLM}(\mathbf{u}^{L}) &= \oint\_{c(\mathbf{u}^{L})} \overline{\left(\mathbf{u}(\mathbf{x}^{\boldsymbol{\xi}},t) + \mathbf{\Omega} \times \mathbf{x}^{\boldsymbol{\xi}}\right) \cdot d\mathbf{x}^{\boldsymbol{\xi}}} \\ &= \oint\_{c(\mathbf{u}^{L})} \left(\mathbf{u}^{L}(\mathbf{x},t) + \mathbf{\Omega} \times \mathbf{x}\right) \cdot d\mathbf{x} \\ &\quad + \alpha^{2} \overline{\left(\mathbf{\Omega} \times \boldsymbol{\xi}(\mathbf{x},t) + \mathbf{u}^{\ell}(\mathbf{x},t)\right) \cdot d\boldsymbol{\xi}(\mathbf{x},t)} .\end{split} \tag{2.3}$$

The Lagrangian transport velocity for GLM in (2.3) is indeed $\mathbf{u}^L$. However, the Eulerian momentum per unit mass in the integrand of the GLM circulation integral in (2.3) acquires an order $O(\alpha^2)$ shift due to the mean effects of the quadratic nonlinearity in the last fluctuating displacement terms in (2.3).

**Choice of GLM Closure** Gjaja and Holm [5] studied the dynamics of 3D IGWs by closing the GLM theory for the case that the fluctuation displacement $\alpha\xi(\mathbf{x},t)$ in (2.1) is given by a single-frequency travelling wave $\mathbf{a}(\epsilon\mathbf{x}, \epsilon t)\,e^{i\phi(\epsilon\mathbf{x},\epsilon t)/\epsilon}$ with slowly varying complex vector amplitude $\mathbf{a}(\epsilon\mathbf{x},\epsilon t)$ and slowly varying, but rapid phase $\phi(\epsilon\mathbf{x}, \epsilon t)/\epsilon$, with $\epsilon \ll 1$; so that the time averaged Lagrangian mean of the displacement field $\alpha\xi(\mathbf{x},t)$ would be negligible.

We choose to represent the fluctuation displacement field *ξ (***x***,t)* in the following form

$$\boldsymbol{\xi}(\mathbf{x},t) = \mathbf{a}(\epsilon \mathbf{x}, \epsilon t)e^{i\phi(\epsilon \mathbf{x}, \epsilon t)/\epsilon} + \mathbf{a}^\*(\epsilon \mathbf{x}, \epsilon t)e^{-i\phi(\epsilon \mathbf{x}, \epsilon t)/\epsilon},\tag{2.4}$$

and the total pressure decomposes into

$$p(\mathbf{X},t) = p\_0(\mathbf{X},t) + \sum\_{j\geq 1} \alpha^j \left( b\_j(\epsilon \mathbf{X}, \epsilon t) e^{ij\phi(\epsilon \mathbf{X}, \epsilon t)/\epsilon} + b\_j^\*(\epsilon \mathbf{X}, \epsilon t) e^{-ij\phi(\epsilon \mathbf{X}, \epsilon t)/\epsilon} \right) \,. \tag{2.5}$$

Here the adiabatic parameter $\epsilon$ is defined as the ratio between space and time scales of the wave oscillations and mean flow respectively. Thus, quantities that are functions of $\mathbf{x}$ and $t$, for example $\xi(\mathbf{x},t)$, have *fast* dependence on $\mathbf{x}$ and $t$. Likewise, quantities which are functions of $\epsilon\mathbf{x}$ and $\epsilon t$, for example $\mathbf{a}(\epsilon\mathbf{x}, \epsilon t)$, have *slow* dependence on the space and time coordinates. Thus, in the fluctuation displacement $\xi$ in (2.4), the fast phase dynamics is represented by $\exp\left(i\phi(\epsilon\mathbf{x}, \epsilon t)/\epsilon\right)$, which is slowly modulated by the complex vector amplitude $\mathbf{a}(\epsilon\mathbf{x}, \epsilon t)$.
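As a minimal numerical sketch of the fast/slow split in (2.4), assume a constant amplitude and a plane-wave phase, so that the fast phase reduces to $kx - \omega t$ in one space dimension. These choices, the sample values, and the helper names `displacement` and `time_mean` are illustrative assumptions and are not part of the derivation. The snippet evaluates one component of the displacement and its time mean over one wave period.

```python
import cmath
import math

def displacement(a, k, omega, x, t):
    """One component of xi = a*exp(i*phase) + conj(a)*exp(-i*phase), cf. (2.4).

    Assumes a plane-wave phase, phi/eps = k*x - omega*t, and a constant
    complex amplitude a (both are illustrative simplifications).
    """
    phase = k * x - omega * t
    return a * cmath.exp(1j * phase) + a.conjugate() * cmath.exp(-1j * phase)

def time_mean(a, k, omega, x, samples=64):
    """Average the displacement over one wave period using uniform samples."""
    period = 2.0 * math.pi / omega
    total = 0.0 + 0.0j
    for n in range(samples):
        total += displacement(a, k, omega, x, n * period / samples)
    return total / samples

a = 0.3 + 0.4j  # hypothetical complex amplitude
xi = displacement(a, k=2.0, omega=5.0, x=0.7, t=0.1)
mean_xi = time_mean(a, k=2.0, omega=5.0, x=0.7)
```

Under these assumptions the displacement is real up to rounding, because the two terms are complex conjugates, and its mean over one period vanishes up to rounding, which is the property the closure relies on.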

We will apply the GLM closure in Eqs. (2.4) and (2.5) to the 3D Euler–Boussinesq equations, which can be derived from Hamilton's principle with the following reduced Lagrangian

$$0 = \delta \int\_{t\_0}^{t\_1} \int\_{\mathcal{M}} \mathcal{D} \left(\frac{1}{2} |\mathbf{U}|^2 + \mathbf{U} \cdot \mathbf{\Omega} \times \mathbf{X} - g\varrho Z\right) + p(1 - \mathcal{D}) \, d^3 X \, dt,\tag{2.6}$$

where $\mathcal{D}\,d^3X = d^3x\_0 \in \mathrm{Den}(\mathbb{R}^3)$ is the fluid density, $\varrho \in \mathcal{F}(\mathbb{R}^3)$ is the fluid buoyancy, and $\mathcal{M}$ is the spatial domain. Substitution of (2.1), (2.4) and (2.5) into the Euler-Boussinesq Lagrangian in (2.6), followed by asymptotic expansion in $\alpha \ll 1$ and $\epsilon \ll 1$ at order $O(\alpha^2)$, neglecting corrections at orders $O(\alpha^2\epsilon)$ and $O(\alpha^4)$, and phase averaging (i.e., keeping coefficients of resonant phase factors only), produces the following wave mean flow interaction (WMFI) closure for Hamilton's principle in Eulerian fluid variables. The closure splits into the sum of the average mean-flow action $\bar{L}\_{MF}$ and the average wave action $\bar{L}\_W$, given by [5] and derived in the appendix as, cf. Eq. (A.8),

$$\begin{split} 0 &= \delta(S\_{MF} + S\_W) = \delta \int\_{t\_0}^{t\_1} (\bar{L}\_{MF} + \alpha^2 \bar{L}\_W) \, dt \\ &= \delta \int\_{t\_0}^{t\_1} \int\_{\mathcal{M}} D \Big[ \frac{1}{2} |\mathbf{u}^L|^2 + \mathbf{u}^L \cdot \mathbf{\Omega} \times \mathbf{x} - \rho g z + \alpha^2 \widetilde{\omega}^2 |\mathbf{a}|^2 + 2i\alpha^2 \widetilde{\omega} \mathbf{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^\*) \\ &\quad - \alpha^2 i \left( b \mathbf{k} \cdot \mathbf{a}^\* - b^\* \mathbf{k} \cdot \mathbf{a} \right) - \alpha^2 a\_i^\* a\_j \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \Big] + (1 - D) p\_0 + \mathcal{O}(\alpha^2 \epsilon) \, d^3 x \, dt \, . \end{split} \tag{2.7}$$

The averaged fluid quantities $\mathbf{u}^L(\epsilon\mathbf{x}, \epsilon t)$, $D(\epsilon\mathbf{x}, \epsilon t)$ and $\rho(\epsilon\mathbf{x}, \epsilon t)$ are defined to have slow dependence on $\mathbf{x}$ and $t$ in the averaging procedure. To see the construction of slow dependence from Lagrangian labels, see section (2.1) of Gjaja and Holm [5]. In the averaged wave Lagrangian $\bar{L}\_W$, the wave vector and wave frequency are defined in terms of the wave phase $\phi(\epsilon\mathbf{x}, \epsilon t)$, as

$$\mathbf{k}(\epsilon \mathbf{x}, \epsilon t) := \nabla\_{\epsilon \mathbf{x}} \phi(\epsilon \mathbf{x}, \epsilon t) \quad \text{and} \quad \omega(\epsilon \mathbf{x}, \epsilon t) := -\frac{\partial}{\partial \epsilon t} \phi(\epsilon \mathbf{x}, \epsilon t) \,. \tag{2.8}$$

The Doppler-shifted oscillation frequency, $\widetilde{\omega}$, due to the coupling to the mean flow $\mathbf{u}^L$ is defined through the advective time derivative $\frac{d}{d\epsilon t} := \frac{\partial}{\partial \epsilon t} + \mathbf{u}^L \cdot \nabla\_{\epsilon\mathbf{x}}$ and the wave phase as

$$\widetilde{\omega} := -\frac{d}{d\epsilon t}\phi = -\left(\frac{\partial}{\partial \epsilon t}\phi + \mathbf{u}^{L} \cdot \nabla\_{\epsilon \mathbf{x}} \phi\right) = \omega - \mathbf{u}^{L} \cdot \mathbf{k} \,. \tag{2.9}$$
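The last equality in (2.9) is a pointwise algebraic relation, so it can be illustrated with a very small sketch. The function name `doppler_shifted_frequency` and the sample values below are hypothetical and chosen only for illustration.

```python
def doppler_shifted_frequency(omega, u_mean, k):
    """Return omega_tilde = omega - u_mean . k, the last equality in (2.9)."""
    return omega - sum(u_i * k_i for u_i, k_i in zip(u_mean, k))

omega = 2.0               # hypothetical frequency seen in the fixed frame
u_mean = (0.5, 0.0, 0.0)  # hypothetical Lagrangian mean velocity u^L
k = (1.0, 2.0, 0.0)       # hypothetical wave vector

omega_tilde = doppler_shifted_frequency(omega, u_mean, k)

# A mean flow perpendicular to k leaves the frequency unshifted.
omega_unshifted = doppler_shifted_frequency(2.0, (0.0, 1.0, 0.0), (1.0, 0.0, 0.0))
```

Only the component of the mean velocity along the wave vector enters the shift, which is why the second call returns the unshifted frequency.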

Upon introducing the Doppler-shifted oscillation frequency $\widetilde{\omega}$ into $\bar{L}\_W$ in (2.7) and pairing its definition in (2.9) with a Lagrange multiplier, $N$, one arrives at the following variational principle

$$\begin{split} 0 &= \delta \left( S\_{MF} + S\_W \right) = \delta \int\_{t\_0}^{t\_1} (\bar{L}\_{MF} + \alpha^2 \bar{L}\_W) \, dt \\ &= \delta \int\_{t\_0}^{t\_1} \int\_{\mathcal{M}} D \left[ \frac{1}{2} |\mathbf{u}^L|^2 + \mathbf{u}^L \cdot \mathbf{\Omega} \times \mathbf{x} - \rho g z + \alpha^2 \widetilde{\omega}^2 |\mathbf{a}|^2 + 2i \alpha^2 \widetilde{\omega} \mathbf{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^\*) \right. \\ &\quad \left. - \alpha^2 i \left( b \mathbf{k} \cdot \mathbf{a}^\* - b^\* \mathbf{k} \cdot \mathbf{a} \right) - \alpha^2 a\_i^\* a\_j \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \right] + (1 - D) p\_0 \, d^3 x \\ &\quad + \alpha^2 \left\langle N \,, -\frac{\partial}{\partial \epsilon t} \phi - \mathbf{u}^L \cdot \nabla\_{\epsilon \mathbf{x}} \phi - \widetilde{\omega} \right\rangle + \mathcal{O}(\alpha^2 \epsilon) \, dt \, . \end{split} \tag{2.10}$$

It may not be immediately clear how to take variations of the action (2.7). The inclusion of the Lagrange multiplier, $N$, imposes the relationship among the Doppler-shifted frequency $\widetilde{\omega}$, the Lagrangian mean velocity $\mathbf{u}^L$, and the phase $\phi$, thereby facilitating the variations. Namely, the forms of the constrained variations of the velocity field $\mathbf{u}^L$ and its advected quantities, $D$ and $\rho$, are shown in (2.14). All other variables have arbitrary variations. The Euler-Poincaré theorem can then be applied to the variational derivatives with respect to $\mathbf{u}^L$, $D$, and $\rho$, to obtain an equation for the total momentum of the system, and stationarity of the action with respect to the remaining variables implies a collection of equations for the remaining dynamics. This procedure results in a closed system of equations for both waves and mean flow, and describes their mutual interaction. To apply Hamilton's principle to an asymptotically expanded action, we use the following definition, which formalises the idea of Hamilton's principle to a given order.

**Definition 2.1 (Variational Derivatives in an Asymptotically Expanded Action)** When making an asymptotic expansion in Hamilton's principle, the Lagrangian in terms of any new variables, $(\mathbf{u}^L, D, \rho)$ for example, becomes an infinite sum. Variational derivatives are then defined *under the integral* up to some order, i.e.

$$\begin{split} 0 &= \delta S = \delta \int \ell(\mathbf{u}^{L}, D, \rho) \, dt \\ &= : \int \left\langle \frac{\delta \ell\_{\alpha^{2}}}{\delta \mathbf{u}^{L}}, \, \delta \mathbf{u}^{L} \right\rangle + \left\langle \frac{\delta \ell\_{\alpha^{2}}}{\delta D}, \, \delta D \right\rangle + \left\langle \frac{\delta \ell\_{\alpha^{2}}}{\delta \rho}, \, \delta \rho \right\rangle + \mathcal{O}(\alpha^{2} \epsilon) \,, \end{split} \tag{2.11}$$

where the *truncated Lagrangian*, $\ell\_{\alpha^2}$, is defined as the part of the Lagrangian which corresponds to these variations

$$\ell(\mathbf{u}^L, D, \rho) = \ell\_{\alpha^2}(\mathbf{u}^L, D, \rho) + H.O.T.$$

Note that we have declined to use the 'big O' notation in the above equation, since $\ell\_{\alpha^2}$ is defined to include all terms of order less than $\alpha^2\epsilon$ *as well as* any higher order terms which produce lower order terms after integrating by parts to take variational derivatives.

Hamilton's action principle (2.10) yields the following variations up to order $\mathcal{O}(\alpha^2)$

$$\begin{split} 0 &= \delta \int\_{t\_0}^{t\_1} \left( \bar{L}\_{MF} + \alpha^2 \bar{L}\_W \right) dt \\ &= \int\_{t\_0}^{t\_1} \left\langle \delta \mathbf{u}^L \,,\, D\mathbf{u}^L + D\mathbf{\Omega} \times \mathbf{x} - \alpha^2 N \nabla\_{\epsilon \mathbf{x}} \phi \right\rangle + \left\langle \delta \rho \,,\, -Dgz \right\rangle \\ &\quad + \left\langle \delta b \,,\, -\alpha^2 i \mathbf{k} \cdot \mathbf{a}^\* \right\rangle + \left\langle \delta b^\* \,,\, \alpha^2 i \mathbf{k} \cdot \mathbf{a} \right\rangle \\ &\quad + \left\langle \delta \mathbf{a} \,,\, \alpha^2 D \left( \widetilde{\omega}^2 \mathbf{a}^\* + 2i \widetilde{\omega} \mathbf{a}^\* \times \mathbf{\Omega} + i b^\* \mathbf{k} - (\mathbf{a}^\* \cdot \nabla) \nabla p\_0 \right) \right\rangle \\ &\quad + \left\langle \delta \mathbf{a}^\* \,,\, \alpha^2 D \left( \widetilde{\omega}^2 \mathbf{a} + 2i \widetilde{\omega} \mathbf{a} \times \mathbf{\Omega} - i b \mathbf{k} - (\mathbf{a} \cdot \nabla) \nabla p\_0 \right) \right\rangle \\ &\quad + \left\langle \delta \widetilde{\omega} \,,\, 2\alpha^2 D \left( \widetilde{\omega} |\mathbf{a}|^2 + i \mathbf{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^\*) \right) - \alpha^2 N \right\rangle + \left\langle \delta D \,,\, \varpi \right\rangle \\ &\quad + \left\langle \delta N \,,\, -\frac{\partial}{\partial \epsilon t} \phi - \mathbf{u}^L \cdot \nabla\_{\epsilon \mathbf{x}} \phi - \widetilde{\omega} \right\rangle \\ &\quad + \left\langle \delta \phi \,,\, \frac{\partial}{\partial \epsilon t} N + \text{div}\_{\epsilon \mathbf{x}} (\mathbf{u}^L N) + i \, \text{div}\_{\epsilon \mathbf{x}} \left( D b \mathbf{a}^\* - D b^\* \mathbf{a} \right) \right\rangle \\ &\quad + \left\langle \delta p\_0 \,,\, 1 - D \right\rangle dt + \mathcal{O}(\alpha^2 \epsilon) \,, \end{split} \tag{2.12}$$

where we have

$$\begin{split} \varpi &:= \delta \left( \bar{L}\_{MF} + \alpha^{2} \bar{L}\_{W} \right) / \delta D \\ &= \frac{1}{2} |\mathbf{u}^{L}|^{2} - \rho g z + \mathbf{u}^{L} \cdot \mathbf{\Omega} \times \mathbf{x} - p\_{0} \\ &\quad + \alpha^{2} \left( \widetilde{\omega}^{2} |\mathbf{a}|^{2} + 2i \widetilde{\omega} \mathbf{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^{\*}) - i (b \mathbf{k} \cdot \mathbf{a}^{\*} - b^{\*} \mathbf{k} \cdot \mathbf{a}) - a\_{i}^{\*} a\_{j} \frac{\partial^{2} p\_{0}}{\partial x\_{i} \partial x\_{j}} \right) . \end{split} \tag{2.13}$$

The constrained variations in (2.12) take the Euler-Poincaré form [14]

$$\delta \mathbf{u}^{L} = \frac{\partial}{\partial \epsilon t} \mathbf{v} + \mathbf{u}^{L} \cdot \nabla\_{\epsilon \mathbf{x}} \mathbf{v} - \mathbf{v} \cdot \nabla\_{\epsilon \mathbf{x}} \mathbf{u}^{L}, \quad \delta \rho = -\mathbf{v} \cdot \nabla\_{\epsilon \mathbf{x}} \rho \; , \quad \delta D = -\text{div}\_{\epsilon \mathbf{x}}(\mathbf{v} D) \; , \tag{2.14}$$

where the $\epsilon$ appears in the derivatives of the constrained variations due to their slow dependence on space and time. Note that when isolating the arbitrary variations, $\mathbf{v}$, through integration by parts, $\nabla\_{\epsilon\mathbf{x}}$ does not generate higher order terms when operating on $\varpi$. From the constrained variations, one has that $\rho$ and $D$ are advected by the flow and therefore satisfy the following advection equations

$$\frac{\partial}{\partial \epsilon t} D + \text{div}\_{\epsilon \mathbf{x}} (\mathbf{u}^L D) = 0, \quad \frac{\partial}{\partial \epsilon t} \rho + \mathbf{u}^L \cdot \nabla\_{\epsilon \mathbf{x}} \rho = 0. \tag{2.15}$$

The *total momentum* of the mean and fluctuating parts of the flow is defined through the variational derivative with respect to $\mathbf{u}^L$, which is given by

$$\mathbf{M} := D\mathbf{u}^L + D\mathbf{\Omega} \times \mathbf{x} - \alpha^2 N \nabla\_{\epsilon\mathbf{x}} \phi \,, \tag{2.16}$$

which through the Euler-Poincaré theorem [14], satisfies the Euler-Poincaré equation

$$\begin{split} \frac{\partial}{\partial \epsilon t} \left( \frac{\mathbf{M}}{D} \right) - \mathbf{u}^{L} \times \operatorname{curl}\_{\epsilon \mathbf{x}} \left( \frac{\mathbf{M}}{D} \right) + \nabla\_{\epsilon \mathbf{x}} \left( \frac{1}{2} |\mathbf{u}^{L}|^{2} + p\_{0} \right) + \frac{1}{\epsilon} g \rho \widehat{\mathbf{z}} \\ + \alpha^{2} \nabla\_{\epsilon \mathbf{x}} \left( -\omega \frac{N}{D} + \widetilde{\omega}^{2} |\mathbf{a}|^{2} + a\_{i} a\_{j}^{\*} \frac{\partial^{2} p\_{0}}{\partial x\_{i} \partial x\_{j}} \right) = \mathbf{0}, \end{split} \tag{2.17}$$

where $\widehat{\mathbf{z}} := \nabla\_{\mathbf{x}} z$. Stationarity of the sum of actions $S\_{MF} + \alpha^2 S\_W$ in (2.10) under variations with respect to the fluid variables $(\mathbf{u}^L, D, \rho)$ has produced the equations for the mean flow, with order $O(\alpha^2)$ wave forcing which arises from the dependence of $\alpha^2\bar{L}\_W$ on the fluid variables $D$ and $\rho$. We note from the variation in $p\_0$ that incompressibility of the Lagrangian mean velocity holds only within the asymptotic regime, and does not hold in an exact form. Indeed,

$$D = 1 - \alpha^2 \epsilon^2 \frac{\partial^2}{\partial \epsilon x\_i \, \partial \epsilon x\_j} \left( D a\_i^\* a\_j \right) = 1 + \mathcal{O}(\alpha^2 \epsilon^2) \quad \Longrightarrow \quad \text{div}\_{\epsilon \mathbf{x}}(\mathbf{u}^L) = O(\alpha^2 \epsilon) \,. \tag{2.18}$$
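A brief sketch of how the first relation in (2.18) can be read off from the $p\_0$-dependent terms of (2.10) is given below. It assumes that the boundary terms from the two integrations by parts vanish and that $D$ and $\mathbf{a}$ depend only on the slow variables; it is meant as a reading aid rather than a replacement for the derivation in [5].

$$\begin{split} 0 &= \int\_{\mathcal{M}} \left[ (1 - D) \, \delta p\_0 - \alpha^2 D a\_i^\* a\_j \frac{\partial^2 \delta p\_0}{\partial x\_i \partial x\_j} \right] d^3 x = \int\_{\mathcal{M}} \delta p\_0 \left[ 1 - D - \alpha^2 \frac{\partial^2}{\partial x\_i \partial x\_j} \left( D a\_i^\* a\_j \right) \right] d^3 x \,, \\ \frac{\partial}{\partial x\_i} &= \epsilon \frac{\partial}{\partial \epsilon x\_i} \ \text{on slow fields} \quad \Longrightarrow \quad D = 1 - \alpha^2 \epsilon^2 \frac{\partial^2}{\partial \epsilon x\_i \, \partial \epsilon x\_j} \left( D a\_i^\* a\_j \right) . \end{split}$$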

**Conservation of Wave Action Density** Keeping only resonant combinations in the Lagrangian $\bar{L}\_W$ in (2.29) has eliminated any explicit dependence on the phase, $\phi$. Hence, a symmetry of the Lagrangian under constant phase shift, $\phi \to \phi + \phi\_0$, has arisen. Consequently, one expects that Noether's theorem will yield a conservation law for the conjugate momentum $N$ under variations in $\phi$ of the average wave Lagrangian, $\bar{L}\_W$. The arbitrary variation $\delta\widetilde{\omega}$ in (2.12) reveals the definition of $N$ as

$$N := \frac{\delta \bar{L}\_W}{\delta \widetilde{\omega}} = 2D(\widetilde{\omega}|\mathbf{a}|^2 + i\mathbf{\Omega} \cdot \mathbf{a} \times \mathbf{a}^\*)\,,\tag{2.19}$$

and the arbitrary variation *δφ* in (2.12) produces the following wave action conservation law,

$$\frac{\partial N}{\partial \epsilon t} + \text{div}\_{\epsilon \mathbf{x}} \left( N(\mathbf{u}^L + \mathbf{v}\_G) \right) = 0, \text{ where } \mathbf{v}\_G := \frac{iD}{N} (\mathbf{a}^\* b - \mathbf{a} b^\*) = \frac{2D}{N} \Im(\mathbf{a} b^\*) \,. \tag{2.20}$$

Thus, the wave action $N$ is transported in an Eulerian frame by the sum of the Lagrangian mean velocity $\mathbf{u}^L$ and the group velocity of the waves, $\mathbf{v}\_G$, defined above in (2.20). The evolution equation of $\phi$ in (2.9) can be written in terms of $N$ as follows

$$\frac{\partial}{\partial \epsilon t} \phi + \mathbf{u}^L \cdot \nabla\_{\epsilon \mathbf{x}} \phi = -\frac{1}{2D|\mathbf{a}|^2} \left( N - 2iD\mathbf{\Omega} \cdot \mathbf{a} \times \mathbf{a}^\* \right), \tag{2.21}$$

thus removing the explicit dependence on $\widetilde{\omega}$. Equations (2.20) and (2.21) are in fact canonical Hamilton's equations boosted to the reference frame of the mean flow $\mathbf{u}^L$, as discussed in Sect. 2.2.
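The pointwise algebra behind (2.19) and (2.20) can be checked numerically. The sketch below, with hypothetical sample values and helper names, evaluates the wave action density $N$, inverts (2.19) for $\widetilde{\omega}$, and compares the two expressions for the group velocity in (2.20). It uses plain Python complex numbers and makes no attempt to solve the evolution equations.

```python
def cross(u, v):
    """Cross product of two 3-vectors (entries may be complex)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Bilinear dot product, with no complex conjugation."""
    return sum(ui * vi for ui, vi in zip(u, v))

def conj(v):
    """Componentwise complex conjugate of a vector."""
    return tuple(complex(vi).conjugate() for vi in v)

def wave_action(D, omega_tilde, a, Omega):
    """N = 2 D (omega_tilde |a|^2 + i Omega . (a x a*)), as in (2.19)."""
    ac = conj(a)
    return 2.0 * D * (omega_tilde * dot(a, ac) + 1j * dot(Omega, cross(a, ac)))

def frequency_from_action(D, N, a, Omega):
    """Invert (2.19) for omega_tilde at fixed D, a and Omega."""
    ac = conj(a)
    return (N - 2j * D * dot(Omega, cross(a, ac))) / (2.0 * D * dot(a, ac))

# Hypothetical sample values, chosen only for illustration.
D = 1.0
omega_tilde = 0.8
Omega = (0.0, 0.0, 0.5)
a = (1.0 + 0.5j, -0.2 + 1.0j, 0.3 - 0.4j)
b = 0.7 - 0.2j

N = wave_action(D, omega_tilde, a, Omega)
omega_back = frequency_from_action(D, N, a, Omega)

# The two forms of the group velocity in (2.20).
v_G = tuple(1j * D / N * (ac_i * b - a_i * b.conjugate())
            for a_i, ac_i in zip(a, conj(a)))
v_G_alt = tuple(2.0 * D / N * (a_i * b.conjugate()).imag for a_i in a)
```

Since $\mathbf{a} \times \mathbf{a}^\*$ is purely imaginary, $N$ and both forms of $\mathbf{v}\_G$ come out real up to rounding for real $D$, $\widetilde{\omega}$ and $\mathbf{\Omega}$.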

**Remark 2.1 (Boundary Conditions for Integrations by Parts)** In taking variations of wave properties, we are not considering a free upper boundary. Instead, we have set

$$\delta(\widehat{\mathbf{n}} \cdot \delta \mathbf{a}^\*) \mathbf{a} \cdot \frac{\partial p}{\partial \mathbf{x}} = 0 \quad \text{and} \quad \delta \phi \,\, \widehat{\mathbf{n}} \cdot N \left(\mathbf{u}^L + \mathbf{v}\_G \right) = 0 \,, \tag{2.22}$$

on the boundary, when integrating by parts. This means that the displacement of the wave amplitude and the flux of wave action density are both taken to be tangential to the boundary.

Combining the evolution equations of the wave action density $N$ (2.20) and the wave phase $\phi$ (2.9), one has the evolution equation of the internal wave momentum $\mathbf{p}/D := \alpha^2 N \nabla\_{\epsilon\mathbf{x}}\phi/D$.

$$\frac{\partial}{\partial \epsilon t} \frac{\mathbf{p}}{D} - \mathbf{u}^{L} \times \left(\nabla\_{\epsilon \mathbf{x}} \times \frac{\mathbf{p}}{D}\right) + \nabla\_{\epsilon \mathbf{x}} \left(\mathbf{u}^{L} \cdot \frac{\mathbf{p}}{D}\right) = -\frac{\alpha^{2}}{D} \left(N \nabla\_{\epsilon \mathbf{x}} \widetilde{\omega} + \mathbf{k} \, \text{div}\_{\epsilon \mathbf{x}} \left(N \mathbf{v}\_{G}\right)\right). \tag{2.23}$$

The Euler-Poincaré equations for the total momentum (2.17) and wave momentum (2.23) may be assembled into the Euler-Poincaré equation for the mean flow momentum, $\mathbf{m} = D\mathbf{u}^L + D\mathbf{\Omega} \times \mathbf{x}$. Dividing this through by the advected mass density, $D$, gives the following equation for $\mathbf{u}^L$

$$\begin{split} & \frac{\partial}{\partial \epsilon t} \mathbf{u}^{L} - \mathbf{u}^{L} \times \text{curl}\_{\epsilon \mathbf{x}} \left( \mathbf{u}^{L} + \mathbf{\Omega} \times \mathbf{x} \right) + \nabla\_{\epsilon \mathbf{x}} \left( \frac{1}{2} |\mathbf{u}^{L}|^{2} + p\_{0} \right) + \frac{1}{\epsilon} g \rho \widehat{\mathbf{z}} \\ & \quad = -\alpha^{2} \nabla\_{\epsilon \mathbf{x}} \left( -\widetilde{\omega} \frac{N}{D} + \widetilde{\omega}^{2} |\mathbf{a}|^{2} + a\_{i} a\_{j}^{\*} \frac{\partial^{2} p\_{0}}{\partial x\_{i} \partial x\_{j}} \right) \\ & \quad \quad - \frac{\alpha^{2}}{D} \left( N \nabla\_{\epsilon \mathbf{x}} \widetilde{\omega} + \mathbf{k} \, \mathrm{div}\_{\epsilon \mathbf{x}} (N \mathbf{v}\_{G}) \right) . \end{split} \tag{2.24}$$

**Remark 2.2 (Hydrostatic and Geostrophic Balances)** As explained in section 2 of Gjaja and Holm [5], at leading order $O(1/\epsilon)$ the motion equation (2.24) establishes hydrostatic and geostrophic balances, namely

$$2\mathbf{\Omega} \times \mathbf{u}^L(\epsilon \mathbf{x}, \epsilon t) + g\rho(\mathbf{x}, \epsilon t)\widehat{\mathbf{z}} + \frac{\partial p\_0(\mathbf{x}, \epsilon t)}{\partial \mathbf{x}} = 0. \tag{2.25}$$

In order to provide the restoring force for internal waves, the advected relative density (or, buoyancy) $\rho$ must have one derivative of order $O(1)$ with respect to the vertical coordinate $z$. In order to contribute to the wave component of the pressure gradient at order $O(\alpha^2)$ in the motion equation (2.24), the mean pressure $p\_0$ must have two derivatives of order $O(1)$ with respect to the vertical coordinate $z$.

**Remark 2.3 (Kelvin's Circulation Theorem for WMFI)** The two Euler-Poincaré equations for the total momentum $\mathbf{M}$ and mean flow momentum $\mathbf{m}$ readily imply their respective Kelvin circulation theorems. Namely, for the mean flow momentum $\mathbf{m}$, (2.24) implies the following

$$\begin{split} & \frac{d}{d\epsilon t} \oint\_{c(\mathbf{u}^L)} \left( \mathbf{u}^L + \mathbf{\Omega} \times \mathbf{x} \right) \cdot d\mathbf{x} + \oint\_{c(\mathbf{u}^L)} \frac{1}{\epsilon} \rho g \widehat{\mathbf{z}} \cdot d\mathbf{x} \\ & \quad = -\alpha^2 \oint\_{c(\mathbf{u}^L)} D^{-1} \left( N \nabla\_{\epsilon \mathbf{x}} \widetilde{\omega} + \mathbf{k} \, \text{div}\_{\epsilon \mathbf{x}} \left( N \mathbf{v}\_G \right) \right) \cdot d\mathbf{x} \,, \end{split} \tag{2.26}$$

in which one notes that the internal wave terms contribute to the creation of circulation of the mean flow at order $O(\alpha^2)$. For the total momentum $\mathbf{M}$, Eq. (2.17) implies that

$$\frac{d}{d\epsilon t}\oint\_{c(\mathbf{u}^{L})} \left(\mathbf{u}^{L} + \mathbf{\Omega}\times\mathbf{x} - \alpha^{2}D^{-1}N\mathbf{k}\right) \cdot d\mathbf{x} + \oint\_{c(\mathbf{u}^{L})} \frac{1}{\epsilon} \rho g \widehat{\mathbf{z}} \cdot d\mathbf{x} = 0\,.\tag{2.27}$$

Thus, just as for the introduction of Stokes drift velocity into the integrand of Kelvin's circulation theorem in Craik-Leibovich theory [3], one may regard the additional non-inertial force of the internal waves on the mean flow circulation as arising from a shift in the momentum per unit mass in the Kelvin circulation integrand, performed to include the internal wave degree of freedom.

**Legendre Transforming Wave Lagrangian $\bar{L}\_W$ Into Canonical Phase Space Variables** By using the definitions of $N$ and $\widetilde{\omega}$, one can compute the Legendre transform of $\bar{L}\_W$ to obtain the following WMFI Hamiltonian $\bar{H}\_W$,

$$\begin{split} \bar{H}\_W &:= \langle N \,, \,\widetilde{\omega} \rangle - \bar{L}\_W = \int\_{\mathcal{M}} D \left( \widetilde{\omega}^2 |\mathbf{a}|^2 + i \left( b \mathbf{k} \cdot \mathbf{a}^\* - b^\* \mathbf{k} \cdot \mathbf{a} \right) + a\_i^\* a\_j \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \right) d^3 x \\ &= \int\_{\mathcal{M}} \frac{1}{4D |\mathbf{a}|^2} \left( N - 2i D \mathbf{\Omega} \cdot \mathbf{a} \times \mathbf{a}^\* \right)^2 + i D \left( b \mathbf{k} \cdot \mathbf{a}^\* - b^\* \mathbf{k} \cdot \mathbf{a} \right) + D a\_i^\* a\_j \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \, d^3 x \,, \end{split} \tag{2.28}$$

where we have dropped the dependence on higher order terms $O(\alpha^2\epsilon, \alpha^4)$ in the asymptotic expansion. Inserting the expression (2.28) for $\bar{H}\_W$ into (2.10) yields the phase space expression of $\bar{L}\_W$ as

$$\begin{split} \bar{L}\_{W} &= \int\_{\mathcal{M}} -N \left( \frac{\partial \phi}{\partial \epsilon t} + \mathbf{u}^{L} \cdot \nabla\_{\epsilon \mathbf{x}} \phi \right) - \frac{1}{4D|\mathbf{a}|^{2}} \left( N - 2i \, D \mathbf{\Omega} \cdot \mathbf{a} \times \mathbf{a}^{\*} \right)^{2} \\ &\quad - i \, D \left( b \mathbf{a}^{\*} - b^{\*} \mathbf{a} \right) \cdot \nabla\_{\epsilon \mathbf{x}} \phi - D a\_{i}^{\*} a\_{j} \frac{\partial^{2} p\_{0}}{\partial x\_{i} \partial x\_{j}} + \mathcal{O}(\alpha^{2} \epsilon) \, d^{3} x \,. \end{split} \tag{2.29}$$
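The equality of the two integrands in (2.28) is a pointwise identity, so it can be checked at a single sample point. The sketch below does this with hypothetical values for the fields and a hypothetical symmetric matrix `hess_p0` standing in for the Hessian of $p\_0$; the helper names are illustrative only.

```python
def cross(u, v):
    """Cross product of two 3-vectors (entries may be complex)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Bilinear dot product, with no complex conjugation."""
    return sum(ui * vi for ui, vi in zip(u, v))

def conj(v):
    """Componentwise complex conjugate of a vector."""
    return tuple(complex(vi).conjugate() for vi in v)

def shared_terms(D, a, b, k, hess_p0):
    """D times the pressure-coupling and Hessian terms common to L_W and H_W."""
    ac = conj(a)
    coupling = 1j * (b * dot(k, ac) - b.conjugate() * dot(k, a))
    curvature = sum(ac[i] * a[j] * hess_p0[i][j]
                    for i in range(3) for j in range(3))
    return D * (coupling + curvature)

def lagrangian_density(D, w, a, b, k, Omega, hess_p0):
    """Integrand of L_W, read off from the alpha^2 terms in (2.10)."""
    ac = conj(a)
    kinetic = w ** 2 * dot(a, ac) + 2j * w * dot(Omega, cross(a, ac))
    return D * kinetic - shared_terms(D, a, b, k, hess_p0)

def hamiltonian_density(D, N, a, b, k, Omega, hess_p0):
    """Phase-space integrand of H_W, the second form in (2.28)."""
    ac = conj(a)
    shifted = N - 2j * D * dot(Omega, cross(a, ac))
    return shifted ** 2 / (4.0 * D * dot(a, ac)) + shared_terms(D, a, b, k, hess_p0)

# Hypothetical sample values, chosen only for illustration.
D, w = 1.0, 0.8
Omega = (0.0, 0.0, 0.5)
k = (1.0, 0.5, -2.0)
a = (1.0 + 0.5j, -0.2 + 1.0j, 0.3 - 0.4j)
b = 0.7 - 0.2j
hess_p0 = [[0.1, 0.0, 0.02], [0.0, 0.1, 0.0], [0.02, 0.0, -0.3]]

ac = conj(a)
N = 2.0 * D * (w * dot(a, ac) + 1j * dot(Omega, cross(a, ac)))  # (2.19)
legendre = N * w - lagrangian_density(D, w, a, b, k, Omega, hess_p0)
phase_space = hamiltonian_density(D, N, a, b, k, Omega, hess_p0)
```

The two densities agree because $N - 2iD\mathbf{\Omega} \cdot \mathbf{a} \times \mathbf{a}^\* = 2D\widetilde{\omega}|\mathbf{a}|^2$ by (2.19), so the squared term reduces to $D\widetilde{\omega}^2|\mathbf{a}|^2$.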

**Remark 2.4 (Physical Interpretation of GLM WMFI)** The variations of the WKB mean wave Lagrangian $\bar{L}\_W$ in the variables $N$ and $\phi$ recover canonical Hamiltonian WKB wave equations (2.20) and (2.21) for $N$ and $\phi$. These canonical equations have been boosted into the reference frame of the Lagrangian mean transport velocity $\mathbf{u}^L$. Moreover, the Lagrangian mean transport velocity $\mathbf{u}^L$ satisfies the Euler-Boussinesq equations on the left-hand side of Eq. (2.26). Thus, the phase space expression of the wave Lagrangian $\bar{L}\_W$ provides the physical interpretation of the WKB mean wave motion in GLM. Namely, GLM expresses WMFI as WKB wave motion boosted into the reference frame of the Euler-Boussinesq equations satisfied by the Lagrangian mean transport velocity, $\mathbf{u}^L$, and its corresponding pressure, $p\_0$, and density, $\rho$. The dependence of the wave Lagrangian $\alpha^2\bar{L}\_W$ on the fluid variables $D$ and $\rho$ implies from its variation in $p\_0$ that incompressibility of the Lagrangian mean transport velocity, $\mathbf{u}^L$, no longer holds exactly. Indeed, Eq. (2.18) shows that the divergence of $\mathbf{u}^L$ is of order $O(\alpha^2\epsilon)$, which would need to be considered when going beyond the order of asymptotics $O(\alpha^2)$ considered here.

**Remark 2.5 (Potential Vorticity (PV) Advection Theorem for WMFI)** Rewriting the indicated operations in the Kelvin circulation theorem for WMFI after applying the Stokes theorem gives us

$$\left(\partial\_{\epsilon t} + \mathcal{L}^{\epsilon}\_{\mathbf{u}^L}\right) d\left(D^{-1}\mathbf{M} \cdot d\mathbf{x}\right) + \frac{1}{\epsilon}g \, d\rho \wedge dz = 0 \,,\tag{2.30}$$

where $\mathcal{L}^{\epsilon}$ denotes the Lie derivative taken with respect to the rescaled basis $\epsilon\mathbf{x}$. Since $D$ and $\rho$ are advected, i.e. they satisfy the advection equations (2.15), one finds

$$\begin{split} &\left(\partial\_{\epsilon t} + \mathcal{L}^{\epsilon}\_{\mathbf{u}^L}\right) \left( d \left( D^{-1} \mathbf{M} \cdot d \mathbf{x} \right) \wedge d\rho \right) \\ &\quad = \left( \partial\_{\epsilon t} + \mathcal{L}^{\epsilon}\_{\mathbf{u}^L} \right) \left( D^{-1} \nabla\_{\epsilon \mathbf{x}} \rho \cdot \operatorname{curl}\_{\epsilon \mathbf{x}} \left( D^{-1} \mathbf{M} \right) D \, d^3 x \right) = 0 \,. \end{split} \tag{2.31}$$

Consequently, one finds the following total advective conservation law for WMFI potential vorticity PV,

$$\frac{\partial}{\partial \epsilon t} q + \mathbf{u}^L \cdot \nabla\_{\epsilon \mathbf{x}} q = 0 \text{ where } q := D^{-1} \nabla\_{\epsilon \mathbf{x}} \rho \cdot \operatorname{curl}\_{\epsilon \mathbf{x}} \left( \mathbf{u}^L + \mathbf{\Omega} \times \mathbf{x} - \alpha^2 D^{-1} N \mathbf{k} \right). \tag{2.32}$$

**Solving for Wave Polarisation Parameters/Lagrange Multipliers $p$, $b$, $b^\*$, $\mathbf{a}$ and $\mathbf{a}^\*$** The quantities $p$ and $b$ in (2.10) are Lagrange multipliers which impose the incompressibility constraints for volume preservation $D = 1$ and transversality of the wave vectors $\mathbf{k} \cdot \mathbf{a} = 0$, respectively. The complex vector wave amplitudes $\mathbf{a}$ and $\mathbf{a}^\*$ are also Lagrange multipliers whose variations impose a linear relationship among most of the wave variables. In particular, stationarity of wave action $S\_W$ under variations of wave polarisation parameters $b$ and $\mathbf{a}^\*$ gives, respectively,

$$\mathbf{k} \cdot \mathbf{a} = 0 \quad \text{and} \quad \widetilde{\boldsymbol{\omega}}^2 \mathbf{a} - 2i \widetilde{\boldsymbol{\omega}} \mathbf{\Omega} \times \mathbf{a} - (\mathbf{a} \cdot \nabla) \frac{\partial p\_0}{\partial \mathbf{x}} = ib\mathbf{k} \,\tag{2.33}$$

from which *b* follows easily from the first constraint, upon taking the dot product of **k** with the second constraint,

$$|\mathbf{k}|^{2}ib = -2i\widetilde{\omega}(\mathbf{\Omega} \times \mathbf{a}) \cdot \mathbf{k} - \mathbf{k} \cdot (\mathbf{a} \cdot \nabla)\nabla p\_{0} = -k^{i} \left(2i\widetilde{\omega}\widehat{\Omega}\_{ij} + (p\_{0})\_{ij}\right)a^{j},\tag{2.34}$$

where $\widehat{\Omega}\_{ij} = -\epsilon\_{ijk}\Omega^k$ and the complex vector amplitude $\mathbf{a}$ is found from the $3 \times 3$ Hermitian matrix inversion,

$$
\begin{bmatrix}
\widetilde{\omega}^2 - (p\_0)\_{11} & 2i\widetilde{\omega}\widehat{\Omega}\_{12} - (p\_0)\_{12} & 2i\widetilde{\omega}\widehat{\Omega}\_{13} - (p\_0)\_{13} \\
-2i\widetilde{\omega}\widehat{\Omega}\_{12} - (p\_0)\_{12} & \widetilde{\omega}^2 - (p\_0)\_{22} & 2i\widetilde{\omega}\widehat{\Omega}\_{23} - (p\_0)\_{23} \\
-2i\widetilde{\omega}\widehat{\Omega}\_{13} - (p\_0)\_{13} & -2i\widetilde{\omega}\widehat{\Omega}\_{23} - (p\_0)\_{23} & \widetilde{\omega}^2 - (p\_0)\_{33}
\end{bmatrix}
\mathbb{P}\_{\perp}\begin{bmatrix}a\_{1}\\a\_{2}\\a\_{3}\end{bmatrix} = ib\begin{bmatrix}k\_{1}\\k\_{2}\\k\_{3}\end{bmatrix},\tag{2.35}
$$

in which the 3 × 3 matrix $\mathbb{P}\_{\perp}$ given by

$$\mathbb{P}\_{\perp ij} := \left( \delta\_{ij} - \frac{k\_i k\_j}{|\mathbf{k}|^2} \right)$$

projects out the component along **k** of the complex vector amplitude $\mathbf{a} \in \mathbb{C}^3$.
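As a quick numerical illustration of this projection, the following minimal Python sketch builds the projector for a sample wave vector and checks that the projected amplitude is transverse. The helper names (`transverse_projector`, `matvec`, `bilinear_dot`) and the sample values of **k** and **a** are assumptions chosen for illustration and do not come from the text.

```python
# Minimal sketch (hypothetical helper names): the transverse projector
# P_ij = delta_ij - k_i k_j / |k|^2 acting on a complex amplitude a.

def transverse_projector(k):
    """Return the 3x3 projector as nested lists for a real, non-zero k."""
    k2 = sum(ki * ki for ki in k)
    if k2 == 0:
        raise ValueError("wave vector k must be non-zero")
    return [[(1.0 if i == j else 0.0) - k[i] * k[j] / k2 for j in range(3)]
            for i in range(3)]


def matvec(P, a):
    """Matrix-vector product for a (possibly complex) 3-vector a."""
    return [sum(P[i][j] * a[j] for j in range(3)) for i in range(3)]


def bilinear_dot(k, a):
    """Non-conjugating dot product k . a, as in the constraint k . a = 0."""
    return sum(ki * ai for ki, ai in zip(k, a))


# Illustrative sample values (assumptions, not taken from the paper).
k = [1.0, 2.0, 2.0]
a = [1 + 2j, -0.5j, 3.0 + 0j]

P = transverse_projector(k)
a_perp = matvec(P, a)                # component of a orthogonal to k
residual = bilinear_dot(k, a_perp)   # should vanish up to rounding
a_twice = matvec(P, a_perp)          # projecting again changes nothing
```

Applying the projector a second time leaves `a_perp` unchanged up to rounding, which is the defining property of a projection.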

**An Index Operator Form of the Polarisation Constraints** The wave polarisation constraints in (2.33) and (2.35) may be rewritten in index form as

$$a^l k\_l = 0 \quad \text{and} \quad D\_{lj} a^j = ib k\_l \quad \text{with} \quad D\_{lj} = \widetilde{\omega}^2 \delta\_{lj} + 2i \widetilde{\omega} \widehat{\Omega}\_{lj} - \frac{\partial^2 p\_0}{\partial x^j \partial x^l}\,,$$

$$\text{so} \quad a^{\*l} D\_{lj} a^j = 0 \,. \tag{2.36}$$

The index operator form in (2.36) of the polarisation relations for (**a**, *b*) in (2.33) suggests a more compact representation of the wave Lagrangian, $\bar{L}\_W$, as we discuss next.

**Representing the Wave Polarisation Parameters a and** *b* **as a Complex Four-Vector Field** After an integration by parts using the boundary conditions in (2.22), the Eulerian action principle in (2.10) may be expressed equivalently as

$$\begin{split} 0 = \delta(S\_{MF} + \alpha^2 S\_W) &= \delta \int\_{t\_0}^{t\_1} (\bar{L}\_{MF} + \alpha^2 \bar{L}\_W) \, dt \\ &:= \delta \int\_{t\_0}^{t\_1} \int\_{\mathcal{M}} \bigg( \frac{D}{2} |\mathbf{u}^L|^2 + D \mathbf{u}^L \cdot \boldsymbol{\Omega} \times \mathbf{x} - g D \rho z - p(D - 1) \\ &\qquad + \alpha^2 D F^{\mu \*} D\_{\mu \nu} F^{\nu} + O(\alpha^2 \epsilon, \alpha^4) \bigg) d^3 x \, dt \,, \end{split} \tag{2.37}$$

where, for notational convenience, the fields **a** and *b* comprise a complex "four-vector field",

$$F^{\mu} = (\mathbf{a}, b)^{T} \,,$$

with *μ* = 1, 2, 3, 4, and the Hermitian dispersion tensor $D\_{\mu\nu} = D^{\*}\_{\nu\mu}$ is given by

$$D\_{lj} = \widetilde{\omega}^2 \delta\_{lj} + 2i \widetilde{\omega} \widehat{\Omega}\_{lj} - \frac{\partial^2 p\_0}{\partial x^l \partial x^j}, \quad D\_{4j} = ik\_j = -D\_{j4}, \quad D\_{44} = 0 \ .$$

It is clear from the decomposition of the WMFI action in (2.37) that stationarity of $S\_W$ with respect to variations of the fields $F = (\mathbf{a}, b)^T$ yields linear relations among the wave parameters (**a**, *b*),

$$D\_{\mu\nu}F^{\nu} = 0\,.\tag{2.38}$$

Equation (2.38) recovers the linear constraints in (2.33) on the polarisation eigendirections of the field $F^{\mu}$, up to an overall complex constant that can be set at the initial time.
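The structure of the four-by-four dispersion tensor can be checked numerically. The sketch below assembles it from the component formulas stated above, verifies that it is Hermitian, and confirms that its fourth row acting on a transverse amplitude reproduces the constraint **k** · **a** = 0. All function names and sample values are hypothetical choices made for illustration, and the placeholder value *b* = 0 is used only because the fourth row does not involve *b*.

```python
# Minimal sketch (hypothetical names and sample values): the 4x4 dispersion
# tensor built from D_lj, D_4j = i k_j, D_j4 = -i k_j and D_44 = 0.

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def omega_hat(Omega):
    """Antisymmetric matrix with entries -epsilon_{ijk} Omega^k."""
    return [[-sum(levi_civita(i, j, k) * Omega[k] for k in range(3))
             for j in range(3)] for i in range(3)]


def dispersion_tensor(omega_t, Omega, k, hessian):
    """Assemble D_{mu nu} for Doppler-shifted frequency omega_t."""
    Oh = omega_hat(Omega)
    D = [[0j] * 4 for _ in range(4)]
    for l in range(3):
        for m in range(3):
            diag = omega_t ** 2 if l == m else 0.0
            D[l][m] = diag + 2j * omega_t * Oh[l][m] - hessian[l][m]
        D[3][l] = 1j * k[l]    # D_{4j} = i k_j
        D[l][3] = -1j * k[l]   # D_{j4} = -i k_j
    return D


def is_hermitian(D, tol=1e-12):
    n = len(D)
    return all(abs(D[r][c] - D[c][r].conjugate()) < tol
               for r in range(n) for c in range(n))


# Illustrative sample values (assumptions, not taken from the paper).
omega_t = 0.7
Omega = [0.0, 0.1, 0.5]
k = [1.0, -2.0, 0.5]
hessian = [[0.2, 0.05, 0.0], [0.05, 0.1, 0.3], [0.0, 0.3, 4.0]]  # symmetric

D = dispersion_tensor(omega_t, Omega, k, hessian)

# A transverse amplitude a = k x e for an arbitrary complex vector e.
e = [1 + 1j, 0.5j, -2.0 + 0j]
a = [k[1] * e[2] - k[2] * e[1],
     k[2] * e[0] - k[0] * e[2],
     k[0] * e[1] - k[1] * e[0]]
F = a + [0j]  # four-vector (a, b); b = 0 is only a placeholder here

fourth_row = sum(D[3][nu] * F[nu] for nu in range(4))  # equals i k . a
```

The Hermitian property relies on the pressure Hessian being symmetric and on the rotation matrix being antisymmetric, which is why the sample Hessian above is chosen symmetric.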

**Doppler-Shifted Dispersion Relation** The solvability condition $\det(D\_{\mu\nu}) = 0$ for (2.38) now produces the dispersion relation for the Doppler-shifted frequency of internal gravity waves (IGW),

$$\begin{split} \widetilde{\omega}^{2} := (\omega - \mathbf{u}^{L} \cdot \mathbf{k})^{2} &= \left(-\frac{\partial \phi}{\partial \epsilon t} - \mathbf{u}^{L} \cdot \nabla\_{\epsilon \mathbf{x}} \phi\right)^{2} \\ &= \frac{(2\boldsymbol{\Omega} \cdot \mathbf{k})^{2}}{|\mathbf{k}|^{2}} + \left(\delta\_{lj} - \frac{k\_{l}k\_{j}}{|\mathbf{k}|^{2}}\right) \frac{\partial^{2} p\_{0}}{\partial x^{l} \partial x^{j}},\end{split}\tag{2.39}$$

which is independent of the magnitude |**k**| of the wave vector **k**, except for the Doppler shift due to the fluid motion. Formula (2.39) updates the phase *φ* of the wave at each time step. The complex vector amplitude **a** is then found from inversion of the 3 × 3 Hermitian matrix in (2.35). The remaining wave quantity *b* is then determined from (2.34) at the given time step.

**Remark 2.6** Under conditions of hydrostatic balance and equilibrium stratification, when $\mathbf{u}^L = 0$ and the pressure Hessian $p\_{ij}$ has only the $p\_{33}$ component, Eq. (2.39) reduces to the well-known dispersion relation for linear internal waves [23]. For non-equilibrium flows, though, Eq. (2.39) shows the sensitivity of the propagation of internal waves to the pressure Hessian.
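The right-hand side of (2.39) is simple to evaluate, and a small Python sketch makes both the independence of |**k**| and the reduction in Remark 2.6 concrete. The function name and all sample values are hypothetical. In particular, taking the rotation vector to be vertical and identifying the only non-zero Hessian component with a squared buoyancy frequency `n2` are assumptions made purely for illustration.

```python
# Minimal sketch (hypothetical names and values): right-hand side of the
# Doppler-shifted dispersion relation for given rotation, wave vector and
# pressure Hessian.

def doppler_shifted_omega_sq(Omega, k, hessian):
    """Return (2 Omega . k)^2 / |k|^2 + (delta_lj - k_l k_j / |k|^2) p0_lj."""
    k2 = sum(ki * ki for ki in k)
    if k2 == 0.0:
        raise ValueError("wave vector k must be non-zero")
    rotation = (2.0 * sum(Oi * ki for Oi, ki in zip(Omega, k))) ** 2 / k2
    pressure = sum(((1.0 if l == j else 0.0) - k[l] * k[j] / k2) * hessian[l][j]
                   for l in range(3) for j in range(3))
    return rotation + pressure


# Illustrative setting: vertical rotation, Hessian with only a 33 component.
Omega = [0.0, 0.0, 0.5]          # so that 2 * Omega_z = 1
n2 = 4.0                         # assumed value of the 33 Hessian component
hessian = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, n2]]
k = [1.0, 0.0, 1.0]

value = doppler_shifted_omega_sq(Omega, k, hessian)

# Rescaling k leaves the result unchanged: only the direction of k matters.
scaled = doppler_shifted_omega_sq(Omega, [3.0 * ki for ki in k], hessian)

# With these inputs the formula is algebraically
# (2 Omega_z)^2 k_z^2 / |k|^2 + n2 (k_x^2 + k_y^2) / |k|^2.
k2 = sum(ki * ki for ki in k)
reduced = (2.0 * Omega[2]) ** 2 * k[2] ** 2 / k2 + n2 * (k[0] ** 2 + k[1] ** 2) / k2
```

For these sample inputs the rotation term contributes 0.5 and the Hessian term contributes 2.0, so `value`, `scaled` and `reduced` all equal 2.5.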

# *2.2 Hamiltonian Structure for the WMFI Equations at Leading Order*

Thus far, we have considered a Legendre transform within the *wave* Lagrangian (see Eq. (2.28)). It remains to perform the same calculation for the mean flow to see the full Hamiltonian structure of the model. We define the momentum of the entire flow by

$$\mathbf{M} := D\mathbf{u}^L + D\boldsymbol{\Omega} \times \mathbf{x} - \alpha^2 N \nabla\_{\epsilon\mathbf{x}} \phi =: \mathbf{m} - \mathbf{p}$$

$$\text{with } \mathbf{m} := D\mathbf{u}^L + D\boldsymbol{\Omega} \times \mathbf{x}, \quad \text{and} \quad \mathbf{p} := \alpha^2 N \nabla\_{\epsilon\mathbf{x}} \phi. \tag{2.40}$$

In the above definition, the momenta **m** and **p** are the mean and wave parts of the momentum respectively and the total momentum, **M**, is the variational derivative of the constrained Lagrangian (2.10) with respect to the Lagrangian mean velocity. We perform both the wave and mean flow Legendre transforms concurrently as

$$\begin{split} h &= \left< \mathbf{M}, \, \mathbf{u}^{L} \right> + \alpha^{2} \left< N \,, \, \omega \right> - \bar{L}\_{MF} - \alpha^{2} \bar{L}\_{W} \\ &= \left< D\mathbf{u}^{L} + D\boldsymbol{\Omega} \times \mathbf{x} \,, \, \mathbf{u}^{L} \right> + \alpha^{2} \left< N \,, \, \widetilde{\omega} \right> - \bar{L}\_{MF} - \alpha^{2} \bar{L}\_{W} \, . \end{split} \tag{2.41}$$
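The step between the two lines of (2.41) uses only the momentum split in (2.40) and the definition of the Doppler-shifted frequency in (2.39). As a short check, writing $\mathbf{k} := \nabla\_{\epsilon\mathbf{x}}\phi$ for the wave vector, so that $\mathbf{p} = \alpha^2 N \mathbf{k}$, and reading the angle brackets as $L^2$ pairings, one finds

$$\begin{aligned} \left< \mathbf{M}, \mathbf{u}^{L} \right> + \alpha^{2} \left< N, \omega \right> &= \left< \mathbf{m} - \alpha^{2} N \mathbf{k}, \mathbf{u}^{L} \right> + \alpha^{2} \left< N, \omega \right> \\ &= \left< \mathbf{m}, \mathbf{u}^{L} \right> + \alpha^{2} \left< N, \omega - \mathbf{u}^{L} \cdot \mathbf{k} \right> = \left< D\mathbf{u}^{L} + D\boldsymbol{\Omega} \times \mathbf{x}, \mathbf{u}^{L} \right> + \alpha^{2} \left< N, \widetilde{\omega} \right> . \end{aligned}$$

In other words, moving the wave momentum out of the first pairing is what converts the frequency $\omega$ into its Doppler-shifted counterpart $\widetilde{\omega}$.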

The resulting WMFI Hamiltonian then follows as

$$\begin{split} h(\mathbf{M}, D, \rho, \mathbf{p}, N) &= \int \bigg\{ \frac{1}{2D} \left| \mathbf{M} + \mathbf{p} - D(\boldsymbol{\Omega} \times \mathbf{x}) \right|^2 \\ &\qquad + D \rho gz + \frac{\alpha^2 D}{4|\mathbf{a}|^2} \left( \frac{N}{D} - 2i \boldsymbol{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^\*) \right)^2 \\ &\qquad + \frac{i D}{N} (b \, \mathbf{p} \cdot \mathbf{a}^\* - b^\* \, \mathbf{p} \cdot \mathbf{a}) \\ &\qquad + \alpha^2 D a\_l^\* a\_j \frac{\partial^2 p\_0}{\partial x\_l \partial x\_j} + (D - 1) p\_0 \bigg\} d^3 x \,. \end{split} \tag{2.42}$$

The variational derivatives of the constrained Hamiltonian (2.42) may be determined from the coefficients in the following expression,

$$\begin{split} \delta h &= \int \bigg\{ -\varpi\, \delta D + D g z \, \delta \rho + \mathbf{u}^{L} \cdot \delta \mathbf{M} - (1 - D)\, \delta p\_{0} \\ &\qquad + \alpha^{2} \Big[\widetilde{\omega} - \frac{i D}{N} \left(b \, \mathbf{k} \cdot \mathbf{a}^{\*} - b^{\*} \, \mathbf{k} \cdot \mathbf{a}\right)\Big] \delta N \\ &\qquad + \left[\mathbf{u}^{L} + \mathbf{v}\_{G}\right] \cdot \delta \mathbf{p} + i \, \alpha^{2} D (\delta b \, \mathbf{k} \cdot \mathbf{a}^{\*} - \delta b^{\*} \, \mathbf{k} \cdot \mathbf{a}) \\ &\qquad - \alpha^{2} \Big[\delta \mathbf{a}^{\*} \cdot \Big(D \widetilde{\omega}^{2} \mathbf{a} + 2i \, D \widetilde{\omega} (\boldsymbol{\Omega} \times \mathbf{a}) \\ &\qquad\qquad - i \, D b \mathbf{k} - D \left(\mathbf{a} \cdot \frac{\partial}{\partial \mathbf{x}}\right) \frac{\partial p\_{0}}{\partial \mathbf{x}}\Big) + \text{c.c.}\Big] \bigg\} d^3 x \,. \end{split} \tag{2.43}$$

#### **Remark 2.7 (Discussion)**


generate the constraints on the prognostic variables and the relations among the diagnostic variables. The solvability condition for these relations among the diagnostic variables determines the dispersion relation for the WKB IGWs.

• The *N* and **p** equations can combine to yield

$$
\partial\_{\epsilon t} \mathbf{k} + \nabla\_{\epsilon \mathbf{x}}\, \omega = 0 \,.
$$

This is the so-called 'conservation of waves' equation, which imposes equality of cross derivatives of the phase function, *φ(***x***, t)*.
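To see why this holds identically, one may use the identifications of wave vector and frequency with derivatives of the phase that are implicit in (2.39), namely $\mathbf{k} = \nabla\_{\epsilon\mathbf{x}}\phi$ and $\omega = -\partial\_{\epsilon t}\phi$. A short check, assuming the phase is smooth enough for its mixed partial derivatives to commute, reads

$$\partial\_{\epsilon t}\mathbf{k} + \nabla\_{\epsilon\mathbf{x}}\,\omega = \partial\_{\epsilon t}\nabla\_{\epsilon\mathbf{x}}\phi - \nabla\_{\epsilon\mathbf{x}}\partial\_{\epsilon t}\phi = 0\,, \qquad \nabla\_{\epsilon\mathbf{x}} \times \mathbf{k} = \nabla\_{\epsilon\mathbf{x}} \times \nabla\_{\epsilon\mathbf{x}}\phi = 0\,.$$

The second identity expresses the same compatibility among the spatial derivatives: a wave vector that is the gradient of a phase is curl-free.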

The above variational derivatives can be assembled into the following *untangled* Lie-Poisson Hamiltonian form which separates the dynamics of the total momentum **M** in (2.40) and the advected fluid variables, *D* and *ρ*, from the wave momentum **p** and wave action density *N*,

$$
\begin{aligned}
\frac{\partial}{\partial \epsilon t} \begin{bmatrix} M\_j \\ D \\ \rho \\ p\_j \\ N \end{bmatrix} = - &\begin{bmatrix} M\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} M\_j & D \partial\_{\epsilon j} & - \rho\_{,\epsilon j} & 0 & 0 \\ \partial\_{\epsilon k} D & 0 & 0 & 0 & 0 \\ \rho\_{,\epsilon k} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & p\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} p\_j & N \partial\_{\epsilon j} \\ 0 & 0 & 0 & \partial\_{\epsilon k} N & 0 \end{bmatrix} \\
&\times \begin{bmatrix} \delta h/\delta M\_k = u^{L,k} \\ \delta h/\delta D = -\varpi \\ \delta h/\delta \rho = D g z \\ \delta h/\delta p\_k = \left( \mathbf{u}^L + \mathbf{v}\_G \right)^k \\ \delta h/\delta N = \alpha^2 \widetilde{\omega} \end{bmatrix}.
\end{aligned} \tag{2.44}
$$

Here, we are using a shorthand notation for the derivatives, $\partial\_j = \partial/\partial x\_j$ for example, and we have used the constraint that $\mathbf{k} \cdot \mathbf{v}\_G = 0$ in taking the variations in *b* and *b*∗.

**Remark 2.8** The *untangled* Lie-Poisson Hamiltonian form in (2.44) of the ideal wave mean flow system of equations, derived in the previous section from the GLM Hamilton's principle, represents a constrained Lie-Poisson Hamiltonian fluid system. Its Lie-Poisson bracket is defined on the dual of the direct sum of two semidirect-product Lie algebras

$$
\big(\mathfrak{X}\_{TOT}\,\circledS\,(\mathcal{F}\_{MF} \oplus \mathrm{Den}\_{MF})\big) \oplus \big(\mathfrak{X}\_{W}\,\circledS\,\mathcal{F}\_{W}\big)\,.
$$

Dual variables in $L^2(\mathbb{R}^3)$ pairing are the following, whose definitions also explain the geometric meanings of the standard calculus notation for the (MF) and (W) variables.




**Remark 2.9 (Preservation of PV Casimirs)** Notice that the Casimir functions for the Hamiltonian structure of GLM WMFI in the upper left block diagonal of the Lie-Poisson operator in Eq. (2.44) are in the same form as for the Euler-Boussinesq fluid, except they have been modified to accommodate the wave momentum. Consequently, no Casimir functions have been gained or lost in coupling the mean flow to the fluctuations.

#### **Canonical Structure of the Wave Dynamics**

The wave dynamics above are written in their Lie-Poisson Hamiltonian structure. Should we return to the canonical variables, *N* and *φ*, then the standard canonical structure emerges. Indeed, substituting $\mathbf{p} = \alpha^2 N \nabla\phi$ into the Hamiltonian (2.42) and taking variations gives<sup>1</sup>

$$
\alpha^2 \frac{\partial \phi}{\partial \epsilon t} = -\frac{\delta h}{\delta N} = -\alpha^2 \mathbf{u}^L \cdot \nabla\_{\epsilon \mathbf{X}} \phi - \alpha^2 \widetilde{\omega} \,, \tag{2.45}
$$

$$\alpha^2 \frac{\partial N}{\partial \epsilon t} = \frac{\delta h}{\delta \phi} = -\alpha^2 \operatorname{div}\_{\epsilon \mathbf{X}}(N\mathbf{u}^L) - \alpha^2 i \operatorname{div}\_{\epsilon \mathbf{X}}\left(D(b\mathbf{a}^\* - b^\*\mathbf{a})\right). \tag{2.46}$$

**Tangled Version of the Lie-Poisson Hamiltonian Structure** By writing the Hamiltonian in terms of the mean flow momentum, **m**, rather than the total momentum, **M**, we recover the tangled version of the Lie-Poisson Hamiltonian form of the equations. Above, as in [13], we have presented wave-current interaction in its untangled form. In a previous work [12], the authors presented both the tangled and untangled forms, and an analogous calculation is also possible for this model of WMFI. Indeed, the Hamiltonian $h(\mathbf{M}, D, \rho, \mathbf{p}, N)$ in (2.42) becomes

$$\begin{split} h'(\mathbf{m}, D, \rho, \mathbf{p}, N) &= \int \bigg\{ \bigg[ \frac{1}{2D} \left| \mathbf{m} - D(\boldsymbol{\Omega} \times \mathbf{x}) \right|^2 \\ &\qquad + D\rho gz + \frac{\alpha^2 D}{4|\mathbf{a}|^2} \left( \frac{N}{D} - 2i\boldsymbol{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^\*) \right)^2 \bigg] \\ &\qquad + \frac{iD}{N} (b \,\mathbf{p} \cdot \mathbf{a}^\* - b^\* \,\mathbf{p} \cdot \mathbf{a}) \\ &\qquad + \alpha^2 D a\_l^\* a\_j \frac{\partial^2 p\_0}{\partial x\_l \partial x\_j} + (D - 1) p\_0 \bigg\} d^3 x \,. \end{split} \tag{2.47}$$

<sup>1</sup> The constant factor of *α*<sup>2</sup> appearing within the canonical structure has emerged due to the choice of multiplying the constraints in Hamilton's principle by the same constant.

The variational derivatives are largely the same, with differences only in the variation with respect to **p**. The tangled form of the Hamiltonian equations for the Hamiltonian $h'(\mathbf{m}, D, \rho, \mathbf{p}, N)$ in (2.47) is

$$
\begin{aligned}
\frac{\partial}{\partial \epsilon t} \begin{bmatrix} m\_j \\ D \\ \rho \\ p\_j \\ N \end{bmatrix} = - &\begin{bmatrix} m\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} m\_j & D \partial\_{\epsilon j} & - \rho\_{,\epsilon j} & p\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} p\_j & N \partial\_{\epsilon j} \\ \partial\_{\epsilon k} D & 0 & 0 & 0 & 0 \\ \rho\_{,\epsilon k} & 0 & 0 & 0 & 0 \\ p\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} p\_j & 0 & 0 & p\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} p\_j & N \partial\_{\epsilon j} \\ \partial\_{\epsilon k} N & 0 & 0 & \partial\_{\epsilon k} N & 0 \end{bmatrix} \\
&\times \begin{bmatrix} \delta h'/\delta m\_k = u^{L,k} \\ \delta h'/\delta D = -\varpi \\ \delta h'/\delta \rho = D g z \\ \delta h'/\delta p\_k = v\_G^k \\ \delta h'/\delta N = \alpha^2 \widetilde{\omega} \end{bmatrix}. \end{aligned} \tag{2.48}$$

Instead of the direct sum in the untangled case in Remark 2.8, this tangled Lie-Poisson bracket is defined on the dual of two nested semidirect-product Lie algebras

$$\big(\mathfrak{X}\_{MF}\,\circledS\,(\mathcal{F}\_{MF}\oplus \mathrm{Den}\_{MF})\big)\,\circledS\,\big(\mathfrak{X}\_{W}\,\circledS\,\mathcal{F}\_{W}\big)\ .$$

Corresponding dual variables in $L^2(\mathbb{R}^3)$ pairing are similar to those explained in Remark 2.8.

## **3 Stochastic WMFI**

Stochastic equations of motion may be used in fluid dynamics to model uncertainty, and such equations may be derived through Hamilton's principle [8]. The stochastic terms can be used to parametrise unresolved 'subgrid-scale' dynamics absent in computational simulations, and as such are particularly relevant to geophysical applications.

Motivated by the fact that, due to computational limitations, the mean flow may only be solved for on a coarse grid when considering large scale geophysical flows, we apply the method of stochastic advection by Lie transport [8] to the mean flow map, $\bar{g}\_t$. This may be done as

$$\mathrm{d}\bar{g}\_{t}\mathbf{x}\_{0} = (\mathbf{u}^{L}\circ \bar{g}\_{t})\mathbf{x}\_{0}\,dt + \sum\_{i} (\boldsymbol{\zeta}\_{i}\circ \bar{g}\_{t})\mathbf{x}\_{0}\circ dW\_{t}^{i},\tag{3.1}$$

where $W^i\_t$ are independent and identically distributed Brownian motions and $\circ\, dW^i\_t$ denotes Stratonovich integration.<sup>2</sup> This is equivalent to

$$\mathrm{d}\bar{g}\_{t}\bar{g}\_{t}^{-1}(\mathbf{x}\_{t}) = \mathbf{u}^{L}(\mathbf{x}\_{t}) \, dt + \sum\_{i} \boldsymbol{\zeta}\_{i}(\mathbf{x}\_{t}) \circ dW\_{t}^{i} =: \mathrm{d}\mathbf{x}\_{t} \,,$$

and we see that the Lagrangian mean velocity, $\mathbf{u}^L$, has been stochastically perturbed. By an application of the Kunita-Itô-Wentzell formula [4], we see that

$$\mathrm{d}\mathbf{X}\_{t} = \mathrm{d}g\_{t}g\_{t}^{-1}\mathbf{X}\_{t} = \mathrm{d}\mathbf{x}\_{t} + \alpha^{2} \left(\mathrm{d}\boldsymbol{\xi}\_{t}(\mathbf{x}\_{t}) + \mathrm{d}\mathbf{x}\_{t} \cdot \nabla\boldsymbol{\xi}\_{t}(\mathbf{x}\_{t})\right). \tag{3.2}$$

Should we assume that the entire motion, corresponding to $\mathbf{U}\_t = \dot{g}\_t g\_t^{-1}$, also has a stochastic part, corresponding to $\boldsymbol{\zeta}^{\xi}\_i$, then we have

$$\mathbf{U}\_t \, dt + \sum\_i \boldsymbol{\zeta}\_i^\xi \circ dW\_t^i = \mathrm{d}\mathbf{x}\_t + \alpha^2 \Big(\mathrm{d}\boldsymbol{\xi}\_t(\mathbf{x}\_t) + \mathrm{d}\mathbf{x}\_t \cdot \nabla \boldsymbol{\xi}\_t(\mathbf{x}\_t)\Big). \tag{3.3}$$

The uniqueness of the Doob-Meyer decomposition then indicates that each $\boldsymbol{\zeta}^{\xi}\_i$ decomposes into a part corresponding to the mean flow, $\boldsymbol{\zeta}\_i$, and a part corresponding to the wave motion, which we call $\boldsymbol{\sigma}\_i$ [21].

**Remark 3.1** Following Street and Crisan [21], by the compatibility of $\boldsymbol{\xi}\_t$ with the driving semimartingale, we have a representation $\mathrm{d}\boldsymbol{\xi} = A\_0\, dt + \sum\_i A\_i \circ dW^i\_t$. The uniqueness of the Doob-Meyer decomposition then gives $\mathbf{U}\_t = \mathbf{u}^L + \alpha^2 \big(A\_0 + \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \boldsymbol{\xi}\_t\big)$ and $\boldsymbol{\sigma}\_i = \boldsymbol{\zeta}\_i + \alpha^2 \big(A\_i + \boldsymbol{\zeta}\_i \cdot \nabla\_{\mathbf{x}} \boldsymbol{\xi}\_t\big)$.

WMFI is not limited to temporally averaged terms. The variability of WMFI must also be considered. This consideration results inevitably in differential equations for the slow components of the climate system, which include stochastic transport and forcing terms. There are many ways of introducing stochasticity into the WMFI system. Some guidance in this matter can be found, e.g., in [10].

<sup>2</sup> The notation ◦ may be used to denote both composition and Stratonovich integration.

In this section, we will consider two distinct frameworks for introducing stochasticity into Hamiltonian fluid systems. The first option enables the wave and fluid dynamics to possess different stochastically fluctuating components in their *transport and phase velocities*; the variations of the deterministic Hamiltonian below are the same as those in Eq. (2.43). The stochastic vector fields can be introduced into the WMF evolution equations by making the deterministic Hamiltonian for the WMFI a semimartingale in each degree of freedom. The chosen augmentation of the Hamiltonian is based on coupling noise by *L*<sup>2</sup> pairings of spatially varying noise 'modes' with the momentum maps dual to the respective velocities for each degree of freedom, **m**, **p**, and *N*. The coupling is done such that the variational derivatives with respect to the momentum variables will add stochastic transport terms to each of the corresponding dual velocities, as follows,

$$\begin{split} \mathrm{d}h &= \int \bigg\{ \bigg[ \frac{1}{2D} \left| \mathbf{m} - D(\boldsymbol{\Omega} \times \mathbf{x}) \right|^2 + D\rho gz + \frac{\alpha^2 D}{4|\mathbf{a}|^2} \left( \frac{N}{D} - 2i\boldsymbol{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^\*) \right)^2 \bigg] \\ &\qquad + \frac{iD}{N} (b \,\mathbf{p} \cdot \mathbf{a}^\* - b^\* \,\mathbf{p} \cdot \mathbf{a}) + \alpha^2 D a\_l^\* a\_j \frac{\partial^2 p\_0}{\partial x\_l \partial x\_j} \bigg\} d^3 x \, d\epsilon t \\ &\quad + \int (D - 1)\, \mathrm{d}p\_0 \, d^3 x \\ &\quad + \sum\_i \int \mathbf{m} \cdot \boldsymbol{\zeta}\_i(\mathbf{x}) \circ d W\_{\epsilon t}^i \, d^3 x + \sum\_i \int \mathbf{p} \cdot \boldsymbol{\sigma}\_i(\mathbf{x}) \circ d B\_{\epsilon t}^i \, d^3 x \,, \end{split} \tag{3.4}$$

where $dW^i\_t$ and $dB^i\_t$ are chosen to be distinct Brownian motions, and $\boldsymbol{\zeta}\_i(\mathbf{x})$ and $\boldsymbol{\sigma}\_i(\mathbf{x})$ in principle need to be determined from calibration of transport data of each type, leading eventually to uncertainty quantification. We have introduced a stochastic component of the pressure, thus introducing the notation $\mathrm{d}p\_0$, following the framework of semimartingale driven variational principles [21]. The influence of the stochastic terms on the Lie-Poisson Hamiltonian dynamics can then be easily revealed, as

$$\begin{aligned} \mathrm{d} \begin{bmatrix} m\_j \\ D \\ \rho \\ p\_j \\ N \end{bmatrix} &= - \begin{bmatrix} m\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} m\_j & D \partial\_{\epsilon j} & - \rho\_{,\epsilon j} & p\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} p\_j & N \partial\_{\epsilon j} \\ \partial\_{\epsilon k} D & 0 & 0 & 0 & 0 \\ \rho\_{,\epsilon k} & 0 & 0 & 0 & 0 \\ p\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} p\_j & 0 & 0 & p\_k \partial\_{\epsilon j} + \partial\_{\epsilon k} p\_j & N \partial\_{\epsilon j} \\ \partial\_{\epsilon k} N & 0 & 0 & \partial\_{\epsilon k} N & 0 \end{bmatrix} \\ &\quad\times \begin{bmatrix} \delta\, \mathrm{d}h / \delta m\_k = u^{L,k} \, d \epsilon t + \zeta^k\_i (\mathbf{x}) \circ d W^i\_{\epsilon t} \\ \delta\, \mathrm{d}h / \delta D = \overline{\pi} \, d \epsilon t + \mathrm{d} p\_0 \\ \delta\, \mathrm{d}h / \delta \rho = D g z \, d \epsilon t \\ \delta\, \mathrm{d}h / \delta p\_k = v^k\_G \, d \epsilon t + \sigma^k\_i (\mathbf{x}) \circ d B^i\_{\epsilon t} \\ \delta\, \mathrm{d}h / \delta N = \alpha^2 \widetilde{\omega} \, d \epsilon t \end{bmatrix} , \end{aligned} \tag{3.5}$$

where $\overline{\pi}$ is given by

$$
\overline{\pi} = -\varpi - p\_0 \,,
$$

for $\varpi$ as defined in Eq. (2.13). The Hamiltonian variables are as defined in the deterministic case,

$$\mathbf{m} := D(\mathbf{u}^L + \boldsymbol{\Omega} \times \mathbf{x})\,, \quad \mathbf{p} := \alpha^2 N \mathbf{k}\,, \quad \mathbf{v}\_G := \frac{iD}{N}(\mathbf{a}^\* b - \mathbf{a} b^\*) = \frac{2D}{N} \Im(\mathbf{a} b^\*)\, .$$

These variables have already appeared in the integrand of Kelvin's circulation theorem in (2.27). The stochastic version of the GLM Kelvin circulation theorem for Euler–Boussinesq incompressible flow in Eq. (2.27) thus becomes

$$\begin{split} \mathrm{d} \oint\_{c(\mathrm{d}\mathbf{x}\_{t})} D^{-1} \mathbf{M} \cdot d\mathbf{x} &= \mathrm{d} \oint\_{c(\mathrm{d}\mathbf{x}\_{t})} \left( \mathbf{u}^{L} + \boldsymbol{\Omega} \times \mathbf{x} - D^{-1} \mathbf{p} \right) \cdot d\mathbf{x} \\ &= -\frac{1}{\epsilon}\, g \oint\_{c(\mathrm{d}\mathbf{x}\_{t})} \rho \, dz \, d\epsilon t \,, \end{split} \tag{3.6}$$

in which the material loop $c(\mathrm{d}\mathbf{x}\_t)$ moves along stochastic Lagrangian trajectories given by the characteristics of the following stochastic vector field

$$\mathrm{d}\mathbf{x}\_{t} = \mathbf{u}^{L}(\mathbf{x}\_{t}, t)\,dt + \sum\_{a=1}^{N} \boldsymbol{\zeta}\_{a}(\mathbf{x}\_{t}) \circ dW\_{t}^{a} \,. \tag{3.7}$$
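Trajectories of a Stratonovich equation of the form (3.7) can be approximated with the stochastic Heun (predictor-corrector) scheme, which is known to converge to the Stratonovich solution. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, and the drift and noise fields are simple planar rotations chosen so that the result is easy to check, not fields calibrated from any data. For this choice the exact flow is a random rotation, so the distance from the origin should be preserved.

```python
# Minimal sketch (hypothetical names and test fields): stochastic Heun
# integration of dx = u(x, t) dt + sum_a zeta_a(x) o dW^a (Stratonovich).
import math
import random


def heun_stratonovich_step(x, t, dt, drift, noise_fields, dW):
    """Advance x by one predictor-corrector step using the increments dW."""
    def increment(y, s):
        inc = [ui * dt for ui in drift(y, s)]
        for zeta, dw in zip(noise_fields, dW):
            inc = [v + zi * dw for v, zi in zip(inc, zeta(y))]
        return inc

    inc0 = increment(x, t)
    predictor = [xi + di for xi, di in zip(x, inc0)]
    inc1 = increment(predictor, t + dt)
    return [xi + 0.5 * (d0 + d1) for xi, d0, d1 in zip(x, inc0, inc1)]


def simulate(x0, t_end, n_steps, drift, noise_fields, seed=0):
    """Integrate one sample path with independent Brownian increments."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    x, t = list(x0), 0.0
    for _ in range(n_steps):
        dW = [rng.gauss(0.0, math.sqrt(dt)) for _ in noise_fields]
        x = heun_stratonovich_step(x, t, dt, drift, noise_fields, dW)
        t += dt
    return x


# Illustrative fields (assumptions): solid-body rotation as the drift and a
# weaker rotation as the single noise mode.
def rotation_drift(x, t):
    return [-x[1], x[0]]


def rotation_noise(x):
    return [-0.5 * x[1], 0.5 * x[0]]


x_final = simulate([1.0, 0.0], t_end=1.0, n_steps=1000,
                   drift=rotation_drift, noise_fields=[rotation_noise])
radius = math.hypot(x_final[0], x_final[1])  # stays close to 1
```

The same noise increments are reused in the predictor and corrector stages of each step; drawing fresh increments for the corrector would no longer approximate the Stratonovich integral.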

#### **A Stochastic Canonical Structure in the Wave Dynamics**

The canonical structure between the wave variables *N* and *φ*, noted in equations (2.45) and (2.46), now becomes stochastic. Indeed, substituting **M** and $\mathbf{p} = \alpha^2 N \nabla\_{\mathbf{x}}\phi$ into the action and taking variations gives

$$\begin{split} \alpha^2\, \mathrm{d}\phi = -\frac{\delta\, \mathrm{d}h}{\delta N} &= -\alpha^2 \mathbf{u}^L \cdot \nabla\_{\epsilon \mathbf{x}} \phi \, d\epsilon t - \alpha^2 \sum\_i \boldsymbol{\zeta}\_i \cdot \nabla\_{\epsilon \mathbf{x}} \phi \circ dW\_{\epsilon t}^i - \alpha^2 \widetilde{\omega} \, d\epsilon t \\ &\quad -\alpha^2 \sum\_i \nabla\_{\epsilon \mathbf{x}} \phi \cdot \boldsymbol{\sigma}\_i \circ dB\_{\epsilon t}^i \,, \end{split} \tag{3.8}$$

$$\begin{split} \alpha^2\, \mathrm{d}N = \frac{\delta\, \mathrm{d}h}{\delta \phi} &= -\alpha^2 \operatorname{div}\_{\epsilon \mathbf{x}}(N\mathbf{u}^L) \, d\epsilon t - \alpha^2 \sum\_i \operatorname{div}\_{\epsilon \mathbf{x}}(N\boldsymbol{\zeta}\_i) \circ dW\_{\epsilon t}^i \\ &\quad -\alpha^2 i \operatorname{div}\_{\epsilon \mathbf{x}} \left( D(b\mathbf{a}^\* - b^\* \mathbf{a}) \right) \, d\epsilon t - \alpha^2 \sum\_i \operatorname{div}\_{\epsilon \mathbf{x}}(N\boldsymbol{\sigma}\_i) \circ dB\_{\epsilon t}^i \,. \end{split} \tag{3.9}$$

Such a stochastic generalisation of Hamilton's canonical equations has been noted and discussed for wave hydrodynamics previously [20] for the classical water wave system.

#### **An Alternative, Energy-Conserving Approach to the Incorporation of Stochastic Noise**

The second option for introducing stochasticity into the WMFI system is through the modification of the mean flow and wave momenta to include different stochastically fluctuating components. The stochastic momentum can be introduced by augmenting the deterministic Lie-Poisson bracket of the WMFI system with stochastic components. Following [11], the chosen modification is the addition of "frozen" Lie-Poisson brackets multiplying semimartingales. The fixed (frozen) parameters in the frozen Lie-Poisson brackets are the spatially, and possibly temporally, varying noise "modes", which are transformed by the deterministic transport and phase velocities in the same way as the deterministic momentum. Letting $\boldsymbol{\lambda}^i$ and $\boldsymbol{\psi}^i$ denote the stochastic fluctuations of the mean flow and wave momentum respectively, the stochastic Lie-Poisson equation can be written as

$$\begin{aligned} \mathrm{d} \begin{bmatrix} m\_j \\ D \\ \rho \\ p\_j \\ N \end{bmatrix} = &- \begin{bmatrix} m\_k \partial\_{j} + \partial\_{k} m\_j & D \partial\_{j} & - \rho\_{,j} & p\_k \partial\_{j} + \partial\_{k} p\_j & N \partial\_{j} \\ \partial\_{k} D & 0 & 0 & 0 & 0 \\ \rho\_{,k} & 0 & 0 & 0 & 0 \\ p\_k \partial\_{j} + \partial\_{k} p\_j & 0 & 0 & p\_k \partial\_{j} + \partial\_{k} p\_j & N \partial\_{j} \\ \partial\_{k} N & 0 & 0 & \partial\_{k} N & 0 \end{bmatrix} \times \begin{bmatrix} u^{L,k}\, dt \\ -\overline{\pi}\, dt + \mathrm{d}p\_0 \\ D g z\, dt \\ v\_G^k\, dt \\ \alpha^2 \widetilde{\omega}\, dt \end{bmatrix} \\ &- \sum\_i \begin{bmatrix} (\lambda^i\_k \partial\_j + \partial\_k \lambda^i\_j) \circ dW^i\_t & 0 & 0 & (\psi^i\_k \partial\_j + \partial\_k \psi^i\_j) \circ dB^i\_t & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ (\psi^i\_k \partial\_j + \partial\_k \psi^i\_j) \circ dB^i\_t & 0 & 0 & (\psi^i\_k \partial\_j + \partial\_k \psi^i\_j) \circ dB^i\_t & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \times \begin{bmatrix} u^{L,k} \\ -\overline{\pi}\, dt + \mathrm{d}p\_0 \\ D g z \\ v\_G^k \\ \alpha^2 \widetilde{\omega} \end{bmatrix} . \end{aligned} \tag{3.10}$$

Here, the stochastic component of the pressure $\mathrm{d}p\_0$ is added as before, following the semimartingale driven variational principle [21]. Similarly to the stochastic vector fields $\boldsymbol{\zeta}\_i$ and $\boldsymbol{\sigma}\_i$, we need to determine $\boldsymbol{\lambda}^i$ and $\boldsymbol{\psi}^i$ through calibration with existing data for each type of momentum. The influence of the stochasticity on the circulation dynamics of the mean flow and wave momentum is clear from the following modified Kelvin circulation theorem

$$\begin{split} &\mathrm{d} \oint\_{c(\mathbf{u}^{L})} \left( \mathbf{u}^{L} + \boldsymbol{\Omega} \times \mathbf{x} \right) \cdot d\mathbf{x} + \oint\_{c(\mathbf{u}^{L})} \bigg( \frac{1}{\epsilon} \rho g \widehat{\mathbf{z}} \\ & \qquad + \alpha^{2} D^{-1} \Big( N \nabla\_{\epsilon \mathbf{x}} \widetilde{\omega} + \mathbf{k} \,\operatorname{div}\_{\epsilon \mathbf{x}} \big( N \mathbf{v}\_{G} \big) \Big) \bigg) \cdot d\mathbf{x} \, d\epsilon t \\ & \qquad + \sum\_{i} \oint\_{c(\mathbf{u}^{L})} D^{-1} \Big( \mathbf{u}^{L} \times \frac{\partial}{\partial \epsilon \mathbf{x}} \times \boldsymbol{\lambda}^{i} - \nabla\_{\epsilon \mathbf{x}} \left( \mathbf{u}^{L} \cdot \boldsymbol{\lambda}^{i} \right) \Big) \cdot d\mathbf{x} \circ dW\_{\epsilon t}^{i} \\ & \qquad + \sum\_{i} \oint\_{c(\mathbf{u}^{L})} D^{-1} \Big( \mathbf{v}\_{G} \times \frac{\partial}{\partial \epsilon \mathbf{x}} \times \boldsymbol{\psi}^{i} - \nabla\_{\epsilon \mathbf{x}} \left( \mathbf{v}\_{G} \cdot \boldsymbol{\psi}^{i} \right) \Big) \cdot d\mathbf{x} \circ dB\_{\epsilon t}^{i} = 0, \end{split} \tag{3.11}$$

where the loop is moving with the *deterministic* velocity field. By construction, Eq. (3.10) preserves the deterministic energy path-wise, as the Poisson structure remains anti-symmetric and the variational derivative of the Hamiltonian is unchanged. However, the modification of the Poisson structure implies that the standard EB fluid Casimirs are no longer conserved.
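The mechanism behind these two statements can be seen in a finite-dimensional analogue, which is an assumption introduced here purely for illustration and is not the WMFI system itself. For a rigid body with angular momentum Π and energy gradient ω = ∂H/∂Π, adding a frozen bracket with a fixed vector λ gives an increment dΠ = Π × ω dt + λ × ω ∘ dW. Antisymmetry of the cross product makes ω · dΠ vanish for any noise increment, while Π · dΠ, the increment of the Casimir |Π|²/2, generally does not. All names and numbers in the sketch are hypothetical.

```python
# Minimal sketch of a finite-dimensional analogue (an assumption for
# illustration): rigid-body Lie-Poisson dynamics with an added "frozen"
# bracket. Energy is preserved by antisymmetry; the Casimir is not.

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def grad_energy(Pi, inertia):
    """Gradient of H = sum_i Pi_i^2 / (2 I_i), i.e. the angular velocity."""
    return [p / I for p, I in zip(Pi, inertia)]


def momentum_increment(Pi, lam, inertia, dt, dW):
    """Return Pi x omega dt + lam x omega dW for a frozen noise mode lam."""
    omega = grad_energy(Pi, inertia)
    deterministic = cross(Pi, omega)
    frozen = cross(lam, omega)
    return [d * dt + f * dW for d, f in zip(deterministic, frozen)]


# Illustrative sample values (assumptions).
Pi = [1.0, 2.0, 3.0]
inertia = [1.0, 2.0, 4.0]
lam = [0.3, -0.1, 0.2]

increment = momentum_increment(Pi, lam, inertia, dt=0.01, dW=0.1)

# Stratonovich calculus obeys the ordinary chain rule, so these pairings
# give the increments of the energy and of the Casimir |Pi|^2 / 2.
energy_increment = dot(grad_energy(Pi, inertia), increment)
casimir_increment = dot(Pi, increment)
```

For these sample values the energy increment vanishes up to rounding, whereas the Casimir increment is 0.0875, mirroring the trade-off described above for the frozen Lie-Poisson bracket.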

## **4 Conclusion**

In this paper we have derived a closed system of equations for the interaction of a GLM flow with the slowly varying envelope of a WKB field of internal gravity waves (IGW) by incorporating the two approximate descriptions into Hamilton's principle. Building on the work of Gjaja and Holm [5], we have shown that this approach boosts the canonical equations for the WKB IGW into the reference frame of the Lagrangian mean transport velocity, $\mathbf{u}^L$, satisfying the Euler-Boussinesq equations on the left-hand side of Eq. (2.26). Thus, GLM expresses WMFI as WKB wave motion boosted into the reference frame of the Euler-Boussinesq equations satisfied by the Lagrangian mean transport velocity, $\mathbf{u}^L$, and its corresponding pressure, $p\_0$, and density, *ρ*. The dependence of the wave Lagrangian $\alpha^2 \bar{L}\_W$ on the fluid variables *D* and *ρ* implies from its variation in $p\_0$ that incompressibility of the Lagrangian mean transport velocity $\mathbf{u}^L$ does continue to hold for the order $O(\alpha^2)$ asymptotic expansion treated here.

We have further demonstrated how stochasticity in the fluid can permeate through both the wave and mean flow dynamics, and that such terms can be included through the variational structure. Moreover, this paper has identified the nested semidirect-product Lie-Poisson structure possessed by the Hamiltonian formulation of the GLM WMFI equations. The continued preservation of the fundamental Lie algebraic structure for the Hamiltonian formulation of the stochastic GLM WMFI system implies that its data calibration and uncertainty quantification can still be treated systematically using the stochastic advection by Lie transport (SALT) approach [8]. Future work will focus next on deriving a 2D vertical slice model for these 3D GLM WMFI equations and developing data calibration methods for the 2D vertical slice model, in order to investigate the inclusion of mean internal gravity wave effects on the responses of the stochastic Eady problem.

**Acknowledgments** We are grateful to our friends, colleagues and collaborators for their advice and encouragement in the matters treated in this paper. DH especially thanks C. Cotter, F. Gay-Balmaz, I. Gjaja, J.C. McWilliams, T. S. Ratiu and C. Tronci for many insightful discussions of corresponding results similar to the ones derived here for WMFI, and in earlier work together in deriving hybrid models of complex fluids, turbulence, plasma dynamics, vertical slice models and the quantum–classical hydrodynamic description of molecules. DH and RH were partially supported during the present work by Office of Naval Research (ONR) grant award N00014-22-1-2082, "Stochastic Parameterization of Ocean Turbulence for Observational Networks". DH and OS were partially supported during the present work by European Research Council (ERC) Synergy grant "Stochastic Transport in Upper Ocean Dynamics" (STUOD), DLV-856408.

## **Appendix: Asymptotic Expansion**

This appendix fills in details of the derivations of the approximations discussed in Sect. 2. Namely, the displacement of a fluid element from its mean trajectory is represented by

$$\mathbf{X}\_{t} = \mathbf{x}\_{t} + \alpha\boldsymbol{\xi}(\mathbf{x}\_{t}, t) \,, \tag{A.1}$$

and the associated velocity is given by

$$\mathbf{U}\_{t}(\mathbf{X}\_{t}) = \mathbf{u}^{L}(\mathbf{x}\_{t},t) + \alpha \left(\partial\_{t}\boldsymbol{\xi}(\mathbf{x}\_{t},t) + \mathbf{u}^{L} \cdot \nabla\_{\mathbf{x}\_{t}}\boldsymbol{\xi}(\mathbf{x}\_{t},t)\right) . \tag{A.2}$$

The fluctuating terms are assumed to have a WKB structure, lending the pressure an associated slow/fast decomposition

$$\boldsymbol{\xi}(\mathbf{x},t) = \mathbf{a}(\epsilon \mathbf{x}, \epsilon t)e^{i\phi(\epsilon \mathbf{x}, \epsilon t)/\epsilon} + \mathbf{a}^\*(\epsilon \mathbf{x}, \epsilon t)e^{-i\phi(\epsilon \mathbf{x}, \epsilon t)/\epsilon},\tag{A.3}$$

$$p(\mathbf{X},t) = p\_0(\mathbf{X},t) + \sum\_{j\geq 1} \alpha^j \left( b\_j(\epsilon \mathbf{X}, \epsilon t) e^{ij\phi(\epsilon \mathbf{X}, \epsilon t)/\epsilon} + b\_j^\*(\epsilon \mathbf{X}, \epsilon t) e^{-ij\phi(\epsilon \mathbf{X}, \epsilon t)/\epsilon} \right) \,. \tag{A.4}$$

Making these approximations within a fluid governed by the Euler-Boussinesq equations may be performed by substituting them into Hamilton's principle, asymptotically expanding, and truncating to leave only the leading order terms. The relevant variational principle in this case is as follows

$$0 = \delta \int\_{t\_0}^{t\_1} \int\_{\mathcal{M}} \mathcal{D} \left(\frac{1}{2} |\mathbf{U}|^2 + \mathbf{U} \cdot \boldsymbol{\Omega} \times \mathbf{X} - g\varrho Z\right) + p(1 - \mathcal{D}) \, d^3 X \, dt \,. \tag{2.6 revisited}$$

We first note that the volume form must be written in terms of the mean basis, as

$$
\mathcal{D}(\mathbf{X})\, d^3 X = \mathcal{D}^\xi (\mathbf{x})\, d^3 X = \mathcal{D}^\xi (\mathbf{x}) \, \mathcal{J}\, d^3 x =: D\, d^3 x \,, \tag{A.5}
$$

where

$$\mathcal{J} = \det \left( \delta\_{ij} + \alpha \frac{\partial \xi^i}{\partial x^j} \right) .$$

Similarly, *ϱ* also transforms as

$$
\varrho(\mathbf{X}) = \varrho^{\xi}(\mathbf{x}) =: \rho \,.
$$

Before calculating the terms featuring **U**, note that

$$\begin{split}
\partial\_t \boldsymbol{\xi}(\mathbf{x}\_t,t) + \mathbf{u}^L \cdot \nabla\_{\mathbf{x}\_t} \boldsymbol{\xi}(\mathbf{x}\_t,t)
&= \frac{\partial \mathbf{a}}{\partial t} e^{i\phi/\epsilon} + \frac{i}{\epsilon} \mathbf{a} \frac{\partial \phi}{\partial t} e^{i\phi/\epsilon} + \frac{\partial \mathbf{a}^\*}{\partial t} e^{-i\phi/\epsilon} - \frac{i}{\epsilon} \mathbf{a}^\* \frac{\partial \phi}{\partial t} e^{-i\phi/\epsilon} \\
&\quad + e^{i\phi/\epsilon}\, \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \mathbf{a} + \frac{i}{\epsilon} \mathbf{a}\, e^{i\phi/\epsilon}\, \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \phi + e^{-i\phi/\epsilon}\, \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \mathbf{a}^\* - \frac{i}{\epsilon} \mathbf{a}^\* e^{-i\phi/\epsilon}\, \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \phi \\
&= \frac{i}{\epsilon} \mathbf{a}\, e^{i\phi/\epsilon} \left( \frac{\partial \phi}{\partial t} + \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \phi \right) + \frac{i}{\epsilon} \mathbf{a}^\* e^{-i\phi/\epsilon} \left( -\frac{\partial \phi}{\partial t} - \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \phi \right) \\
&\quad + e^{i\phi/\epsilon} \left( \frac{\partial \mathbf{a}}{\partial t} + \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \mathbf{a} \right) + e^{-i\phi/\epsilon} \left( \frac{\partial \mathbf{a}^\*}{\partial t} + \mathbf{u}^L \cdot \nabla\_{\mathbf{x}} \mathbf{a}^\* \right) \\
&= -i \widetilde{\omega}\, \mathbf{a}\, e^{i\phi/\epsilon} + i \widetilde{\omega}\, \mathbf{a}^\* e^{-i\phi/\epsilon} + \mathcal{O}(\epsilon) \,,
\end{split} \tag{A.6}$$

where we define *ω̃* := −(d/d*εt*)*φ* = −(∂*φ*/∂*εt* + **u***<sup>L</sup>* · ∇<sub>*ε***x**</sub>*φ*) and **k** := ∇<sub>*ε***x**</sub>*φ* as in (2.9) and (2.8). Since **a** and *φ* depend only on the slow variables (*ε***x**, *εt*), the derivatives of **a** are O(*ε*) while those of *φ*/*ε* are O(1). We may now calculate the energy terms, beginning with the kinetic energy, making use of the above relation and taking the mean.<sup>3</sup> Note that the following relations are true within the Lagrangian, but are expressed here in isolation.

$$\begin{split}
\frac{1}{2} |\mathbf{U}|^2 &= \frac{1}{2} \left|\mathbf{u}^L + \alpha \left( \partial\_t \boldsymbol{\xi}(\mathbf{x}\_t, t) + \mathbf{u}^L \cdot \nabla\_{\mathbf{x}\_t} \boldsymbol{\xi}(\mathbf{x}\_t, t) \right)\right|^2 \\
&= \frac{1}{2} |\mathbf{u}^L|^2 + \frac{\alpha^2}{2} \left|\partial\_t \boldsymbol{\xi}(\mathbf{x}\_t, t) + \mathbf{u}^L \cdot \nabla\_{\mathbf{x}\_t} \boldsymbol{\xi}(\mathbf{x}\_t, t)\right|^2 \\
&= \frac{1}{2} |\mathbf{u}^L|^2 + \alpha^2 \widetilde{\omega}^2\, \mathbf{a} \cdot \mathbf{a}^\* + \mathcal{O}(\alpha^2 \epsilon) \,.
\end{split}$$

<sup>3</sup> In taking the mean within the action integral, we discard the terms multiplied by rapid oscillations exp(±*iφ*/*ε*) and exp(±2*iφ*/*ε*). These non-resonant terms are assumed to oscillate to zero under the time integral.

The rotation term and potential energy are

$$\begin{split}
\mathbf{U} \cdot \boldsymbol{\Omega} \times \mathbf{X} &= \left(\mathbf{u}^{L} + \alpha \left(\partial\_{t} \boldsymbol{\xi}(\mathbf{x}\_{t}, t) + \mathbf{u}^{L} \cdot \nabla\_{\mathbf{x}\_{t}} \boldsymbol{\xi}(\mathbf{x}\_{t}, t)\right)\right) \cdot \boldsymbol{\Omega} \times (\mathbf{x} + \alpha \boldsymbol{\xi}(\mathbf{x}, t)) \\
&= \mathbf{u}^{L} \cdot \boldsymbol{\Omega} \times \mathbf{x} + \mathbf{u}^{L} \cdot \boldsymbol{\Omega} \times (\alpha \boldsymbol{\xi}) + \alpha (\partial\_{t} \boldsymbol{\xi} + \mathbf{u}^{L} \cdot \nabla\_{\mathbf{x}\_{t}} \boldsymbol{\xi}) \cdot \boldsymbol{\Omega} \times \mathbf{x} \\
&\quad + \alpha (\partial\_{t} \boldsymbol{\xi} + \mathbf{u}^{L} \cdot \nabla\_{\mathbf{x}\_{t}} \boldsymbol{\xi}) \cdot \boldsymbol{\Omega} \times (\alpha \boldsymbol{\xi}) \\
&= \mathbf{u}^{L} \cdot \boldsymbol{\Omega} \times \mathbf{x} + \alpha^{2} (\partial\_{t} \boldsymbol{\xi} + \mathbf{u}^{L} \cdot \nabla\_{\mathbf{x}\_{t}} \boldsymbol{\xi}) \cdot \boldsymbol{\Omega} \times \boldsymbol{\xi} \\
&= \mathbf{u}^{L} \cdot \boldsymbol{\Omega} \times \mathbf{x} + 2i \alpha^{2} \widetilde{\omega}\, \boldsymbol{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^{\*}) + \mathcal{O}(\alpha^{2} \epsilon),
\end{split}$$

$$g \varrho Z = g \rho (z + \alpha \xi\_{3}) = g \rho z \,.$$
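The phase averages used for the kinetic and rotation terms can be checked numerically. The following is a minimal sketch, assuming arbitrary illustrative values for the amplitude `a`, the Doppler-shifted frequency `omega_t` and the rotation vector `Omega` (none of these values come from the text), and treating the fast phase *φ*/*ε* as a single angle `theta` averaged over one period. It confirms that the mean of the squared fluctuation velocity is 2*ω̃*<sup>2</sup> **a** · **a**<sup>∗</sup> and that the mean of the quadratic rotation term is 2*iω̃* **Ω** · (**a** × **a**<sup>∗</sup>), which is real because **a** × **a**<sup>∗</sup> is purely imaginary.

```python
import cmath
import math

def phase_average(f, n=64):
    """Average f(theta) over one period of the fast phase theta = phi/epsilon."""
    return sum(f(2.0 * math.pi * k / n) for k in range(n)) / n

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    # Plain bilinear dot product (no complex conjugation).
    return sum(x * y for x, y in zip(u, v))

# Illustrative values only (assumptions, not taken from the chapter).
a = (0.3 + 0.1j, -0.2 + 0.4j, 0.5 - 0.3j)
a_conj = tuple(x.conjugate() for x in a)
omega_t = 1.7            # Doppler-shifted frequency
Omega = (0.0, 0.2, 1.0)  # rotation vector

def xi(theta):
    """Leading-order displacement a e^{i theta} + a* e^{-i theta}."""
    e = cmath.exp(1j * theta)
    return tuple(x * e + xc / e for x, xc in zip(a, a_conj))

def v(theta):
    """Leading-order fluctuation velocity from (A.6)."""
    e = cmath.exp(1j * theta)
    return tuple(-1j * omega_t * x * e + 1j * omega_t * xc / e
                 for x, xc in zip(a, a_conj))

mean_v2 = phase_average(lambda th: dot(v(th), v(th)))
mean_rot = phase_average(lambda th: dot(v(th), cross(Omega, xi(th))))

expected_v2 = 2 * omega_t**2 * dot(a, a_conj)
expected_rot = 2j * omega_t * dot(Omega, cross(a, a_conj))

print(abs(mean_v2 - expected_v2), abs(mean_rot - expected_rot))
```

Both printed differences should be at the level of floating-point rounding, since an equally spaced average over one period removes the exp(±2*iθ*) harmonics exactly.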

Within the pressure term, we need to take care of expansion in both *p<sup>ξ</sup>* and J. We have

$$\overline{(1-\mathcal{D}^{\xi})p^{\xi}\,d^{3}X} = \overline{(\mathcal{J}-D)p^{\xi}}\,d^{3}x = \left(\overline{\mathcal{J}p^{\xi}} - D\,\overline{p^{\xi}}\right)d^{3}x \,.$$

Dealing with the terms separately, we have the expanded expression for *p<sup>ξ</sup>*

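$$\begin{split}
p^{\xi}(\mathbf{x}) &= p\_0(\mathbf{x},t) + \alpha \frac{\partial p\_0}{\partial x\_i} \xi\_i + \frac{\alpha^2}{2} \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \xi\_i \xi\_j + \mathcal{O}(\alpha^3) \\
&\quad + \sum\_{j \geq 1} \alpha^j \left( b\_j(\epsilon\mathbf{x}, \epsilon t) + \alpha \epsilon \frac{\partial b\_j}{\partial \epsilon x\_i} \xi\_i + \frac{\alpha^2 \epsilon^2}{2} \frac{\partial^2 b\_j}{\partial \epsilon x\_i \partial \epsilon x\_k} \xi\_i \xi\_k + \mathcal{O}(\alpha^3) \right) \\
&\qquad \cdot \exp\left( \frac{ij}{\epsilon} \left( \phi(\epsilon\mathbf{x}, \epsilon t) + \alpha \epsilon \frac{\partial \phi}{\partial \epsilon x\_i} \xi\_i + \frac{\alpha^2 \epsilon^2}{2} \frac{\partial^2 \phi}{\partial \epsilon x\_i \partial \epsilon x\_k} \xi\_i \xi\_k + \mathcal{O}(\alpha^3) \right) \right) + c.c. \\
&= p\_0 + \alpha \frac{\partial p\_0}{\partial x\_i} \xi\_i + \frac{\alpha^2}{2} \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \xi\_i \xi\_j + \mathcal{O}(\alpha^3) \\
&\quad + \sum\_{j \geq 1} \alpha^j \left( b\_j + \alpha \epsilon \frac{\partial b\_j}{\partial \epsilon x\_i} \xi\_i + \mathcal{O}(\alpha^2) \right) \exp\left( \frac{ij\phi}{\epsilon} \right) \left( 1 + ij\alpha \frac{\partial \phi}{\partial \epsilon x\_i} \xi\_i + \mathcal{O}(\alpha^2) \right) + c.c. \\
&= p\_0 + \alpha \frac{\partial p\_0}{\partial x\_i} \xi\_i + \frac{\alpha^2}{2} \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \xi\_i \xi\_j \\
&\quad + \left[ \exp\left( \frac{i\phi}{\epsilon} \right) \left( \alpha b\_1 + \alpha^2 \epsilon \frac{\partial b\_1}{\partial \epsilon x\_i} \xi\_i + b\_1 i \alpha^2 \frac{\partial \phi}{\partial \epsilon x\_i} \xi\_i \right) + \exp\left( \frac{2i\phi}{\epsilon} \right) \alpha^2 b\_2 + c.c. \right] + \mathcal{O}(\alpha^3) \,,
\end{split}$$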

which gives the phase averaged expression

$$\begin{split} \overline{p^{\xi}} &= p\_0 + \frac{\alpha^2}{2} \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \left( a\_i a\_j^\* + a\_i^\* a\_j \right) \\ &+ \alpha^2 \left( \epsilon a\_i^\* \frac{\partial b\_1}{\partial \epsilon x\_i} + i b\_1 a\_i^\* \frac{\partial \phi}{\partial \epsilon x\_i} + \epsilon a\_i \frac{\partial b\_1^\*}{\partial \epsilon x\_i} - i b\_1^\* a\_i \frac{\partial \phi}{\partial \epsilon x\_i} \right) + \mathcal{O}(\alpha^3) \,. \end{split}$$

Note that

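$$\begin{split}
\mathcal{J} &= \det \left( \delta\_{ij} + \alpha \frac{\partial \xi\_i}{\partial x\_j} \right) = 1 + \alpha \frac{\partial \xi\_i}{\partial x\_i} + \alpha^2 (2\delta\_{ij} - 1) \frac{\partial \xi\_i}{\partial x\_j} \frac{\partial \xi\_j}{\partial x\_i} + \mathcal{O}(\alpha^3) \\
&= 1 + \alpha \frac{\partial \xi\_i}{\partial x\_i} + \alpha^2 \left( \frac{\partial \xi\_1}{\partial x\_1} \frac{\partial \xi\_2}{\partial x\_2} + \frac{\partial \xi\_3}{\partial x\_3} \frac{\partial \xi\_1}{\partial x\_1} + \frac{\partial \xi\_3}{\partial x\_3} \frac{\partial \xi\_2}{\partial x\_2} - \frac{\partial \xi\_1}{\partial x\_2} \frac{\partial \xi\_2}{\partial x\_1} - \frac{\partial \xi\_1}{\partial x\_3} \frac{\partial \xi\_3}{\partial x\_1} - \frac{\partial \xi\_2}{\partial x\_3} \frac{\partial \xi\_3}{\partial x\_2} \right) + \mathcal{O}(\alpha^3) \\
&= 1 + \alpha \left[ \left( \frac{\partial a\_i}{\partial x\_i} + \frac{i}{\epsilon} a\_i \frac{\partial \phi}{\partial x\_i} \right) \exp\left( \frac{i\phi}{\epsilon} \right) + \left( \frac{\partial a\_i^\*}{\partial x\_i} - \frac{i}{\epsilon} a\_i^\* \frac{\partial \phi}{\partial x\_i} \right) \exp\left( -\frac{i\phi}{\epsilon} \right) \right] \\
&\quad + \alpha^2 (2\delta\_{ij} - 1) \left[ \left( \frac{\partial a\_i}{\partial x\_j} + \frac{i}{\epsilon} a\_i \frac{\partial \phi}{\partial x\_j} \right) \exp\left( \frac{i\phi}{\epsilon} \right) + c.c. \right] \left[ \left( \frac{\partial a\_j}{\partial x\_i} + \frac{i}{\epsilon} a\_j \frac{\partial \phi}{\partial x\_i} \right) \exp\left( \frac{i\phi}{\epsilon} \right) + c.c. \right] + \mathcal{O}(\alpha^3) \,.
\end{split}$$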
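The second-order expansion of the Jacobian rests on the standard identity det(I + *α*A) = 1 + *α* tr A + (*α*<sup>2</sup>/2)((tr A)<sup>2</sup> − tr A<sup>2</sup>) + *α*<sup>3</sup> det A for a 3 × 3 matrix A, whose quadratic coefficient, written out, is the sum of the three products of diagonal entries minus the three products of transposed off-diagonal pairs. The sketch below checks this numerically, assuming an arbitrary matrix `A` standing in for the displacement gradient ∂*ξ<sub>i</sub>*/∂*x<sub>j</sub>* (the entries are illustrative, not taken from the text).

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def jacobian_exact(A, alpha):
    """det(I + alpha * A)."""
    m = [[(1.0 if i == j else 0.0) + alpha * A[i][j] for j in range(3)]
         for i in range(3)]
    return det3(m)

def jacobian_second_order(A, alpha):
    """1 + alpha tr(A) + alpha^2/2 (tr(A)^2 - tr(A^2))."""
    trace = sum(A[i][i] for i in range(3))
    trace_sq = sum(A[i][j] * A[j][i] for i in range(3) for j in range(3))
    return 1.0 + alpha * trace + 0.5 * alpha**2 * (trace**2 - trace_sq)

# Hypothetical displacement-gradient matrix (illustrative values only).
A = [[0.4, -1.2, 0.7],
     [0.3, 0.9, -0.5],
     [-0.8, 0.2, 1.1]]

for alpha in (0.1, 0.05):
    remainder = jacobian_exact(A, alpha) - jacobian_second_order(A, alpha)
    # The remainder should scale like alpha^3, with coefficient det(A).
    print(alpha, remainder, remainder / alpha**3)
```

Halving `alpha` should reduce the remainder by a factor of eight, consistent with the O(*α*<sup>3</sup>) truncation.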

Taking the phase average gives

$$\begin{split} \overline{\mathcal{J}} &= 1 + \alpha^2 (2\delta\_{ij} - 1) \left( \left( \frac{\partial a\_i}{\partial x\_j} + \frac{i}{\epsilon} a\_i \frac{\partial \phi}{\partial x\_j} \right) \left( \frac{\partial a\_j^\*}{\partial x\_i} - \frac{i}{\epsilon} a\_j^\* \frac{\partial \phi}{\partial x\_i} \right) + c.c. \right) + \mathcal{O}(\alpha^3), \\ &= 1 + i\alpha^2 \frac{\partial \phi}{\partial \epsilon \mathbf{x}} \cdot \frac{\partial}{\partial \mathbf{x}} \times \left( \mathbf{a} \times \mathbf{a}^\* \right) + \mathcal{O}(\alpha^3), \end{split}$$

where the last equality uses the fact that we are operating under a spatial integral and integration by parts applies. Then, we have

$$\begin{split} p^{\xi} \, \mathcal{J} &= p\_0 + \alpha \frac{\partial p\_0}{\partial x\_i} \xi\_i + \frac{\alpha^2}{2} \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \xi\_i \xi\_j \\ &\quad + \exp\left(\frac{i\phi}{\epsilon}\right) \left(\alpha b\_1 + \alpha^2 \epsilon \frac{\partial b\_1}{\partial \epsilon x\_i} \xi\_i + b\_1 i \alpha^2 \frac{\partial \phi}{\partial \epsilon x\_i} \xi\_i\right) \\ &\quad + \exp\left(\frac{2i\phi}{\epsilon}\right) \alpha^2 b\_2 + c.c. \\ &\quad + p\_0\alpha \frac{\partial \xi\_i}{\partial x\_i} + \alpha^2 \frac{\partial \xi\_i}{\partial x\_i} \frac{\partial p\_0}{\partial x\_j} \xi\_j + \alpha^2 \exp\left(\frac{i\phi}{\epsilon}\right) b\_1 \frac{\partial \xi\_i}{\partial x\_i} \\ &\quad + p\_0\alpha^2 \left(2\delta\_{ij} - 1\right) \frac{\partial \xi\_i}{\partial x\_j} \frac{\partial \xi\_j}{\partial x\_i} + c.c. + \mathcal{O}(\alpha^3). \end{split}$$

Applying phase averaging gives

$$\begin{split} \overline{p^{\xi}\mathcal{J}} &= p\_{0} + \frac{\alpha^{2}}{2} \frac{\partial^{2} p\_{0}}{\partial x\_{i} \partial x\_{j}} \left( a\_{i} a\_{j}^{\*} + a\_{i}^{\*} a\_{j} \right) \\ &+ \alpha^{2} \left( \epsilon a\_{i}^{\*} \frac{\partial b\_{1}}{\partial \epsilon x\_{i}} + i b\_{1} a\_{i}^{\*} \frac{\partial \phi}{\partial \epsilon x\_{i}} + \epsilon a\_{i} \frac{\partial b\_{1}^{\*}}{\partial \epsilon x\_{i}} - i b\_{1}^{\*} a\_{i} \frac{\partial \phi}{\partial \epsilon x\_{i}} \right) \\ &+ \alpha^{2} \left( \left( \frac{\partial a\_{i}}{\partial x\_{i}} + i \frac{\partial \phi}{\partial \epsilon x\_{i}} a\_{i} \right) \left( a\_{j}^{\*} \frac{\partial p\_{0}}{\partial x\_{j}} + b\_{1}^{\*} \right) \right. \\ &\qquad \left. + \left( \frac{\partial a\_{i}^{\*}}{\partial x\_{i}} - i \frac{\partial \phi}{\partial \epsilon x\_{i}} a\_{i}^{\*} \right) \left( a\_{j} \frac{\partial p\_{0}}{\partial x\_{j}} + b\_{1} \right) \right) \\ &+ i p\_{0} \alpha^{2} \frac{\partial \phi}{\partial \epsilon \mathbf{x}} \cdot \frac{\partial}{\partial \mathbf{x}} \times \left( \mathbf{a} \times \mathbf{a}^{\*} \right) + \mathcal{O}(\alpha^{3}) \,. \end{split}$$

We may assemble these statements into the following action integral, which may be regarded as an approximation of (2.6).

$$\begin{split}
S = \int\_{t\_0}^{t\_1} \int\_{\mathcal{M}} & D \left( \frac{1}{2} |\mathbf{u}^L|^2 + \alpha^2 \widetilde{\omega}^2 |\mathbf{a}|^2 - \rho g z + \mathbf{u}^L \cdot \boldsymbol{\Omega} \times \mathbf{x} + 2i\alpha^2 \widetilde{\omega}\, \boldsymbol{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^\*) \right) \\
& + \left[ \frac{\alpha^2}{2} \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \left( a\_i a\_j^\* + a\_i^\* a\_j \right) + \alpha^2 \left( \epsilon a\_i^\* \frac{\partial b\_1}{\partial \epsilon x\_i} + i b\_1 a\_i^\* \frac{\partial \phi}{\partial \epsilon x\_i} + \epsilon a\_i \frac{\partial b\_1^\*}{\partial \epsilon x\_i} - i b\_1^\* a\_i \frac{\partial \phi}{\partial \epsilon x\_i} \right) \right] (1 - D) \\
& + \alpha^2 \left( \left( \frac{\partial a\_i}{\partial x\_i} + i \frac{\partial \phi}{\partial \epsilon x\_i} a\_i \right) \left( a\_j^\* \frac{\partial p\_0}{\partial x\_j} + b\_1^\* \right) + \left( \frac{\partial a\_i^\*}{\partial x\_i} - i \frac{\partial \phi}{\partial \epsilon x\_i} a\_i^\* \right) \left( a\_j \frac{\partial p\_0}{\partial x\_j} + b\_1 \right) \right) \\
& + p\_0 (1 - D) + i\alpha^2 p\_0 \frac{\partial \phi}{\partial \epsilon \mathbf{x}} \cdot \frac{\partial}{\partial \mathbf{x}} \times \left( \mathbf{a} \times \mathbf{a}^\* \right) \, d^3 x \, dt \,.
\end{split} \tag{A.7}$$

We now seek to simplify this integral. Firstly, we note that the following relationships hold for the last four terms on the second row of Eq. (A.7)

$$\begin{split} &\alpha^{2} \left( i b\_{1} a\_{i}^{\*} \frac{\partial \phi}{\partial \epsilon x\_{i}} - i b\_{1}^{\*} a\_{i} \frac{\partial \phi}{\partial \epsilon x\_{i}} \right) (1 - D) + \alpha^{2} \left( i \frac{\partial \phi}{\partial \epsilon x\_{i}} a\_{i} b\_{1}^{\*} - i \frac{\partial \phi}{\partial \epsilon x\_{i}} a\_{i}^{\*} b\_{1} \right) \\ &= -\alpha^{2} i D \left( b\_{1} \mathbf{k} \cdot \mathbf{a}^{\*} - b\_{1}^{\*} \mathbf{k} \cdot \mathbf{a} \right) \,, \end{split}$$

and

$$\begin{split} &\alpha^{2} \int\_{\mathcal{M}} \left[ \left( \epsilon a\_{i}^{\*} \frac{\partial b\_{1}}{\partial \epsilon x\_{i}} + \epsilon a\_{i} \frac{\partial b\_{1}^{\*}}{\partial \epsilon x\_{i}} \right) (1 - D) + \frac{\partial a\_{i}}{\partial x\_{i}} b\_{1}^{\*} + \frac{\partial a\_{i}^{\*}}{\partial x\_{i}} b\_{1} \right] d^{3} x \\ &= -\alpha^{2} \epsilon \int\_{\mathcal{M}} D \left( a\_{i}^{\*} \frac{\partial b\_{1}}{\partial \epsilon x\_{i}} + a\_{i} \frac{\partial b\_{1}^{\*}}{\partial \epsilon x\_{i}} \right) \, d^{3} x = \mathcal{O}(\alpha^{2} \epsilon) \,, \end{split}$$

after integration by parts. We have thus far dealt with several of the order *α*<sup>2</sup> terms on the third line of (A.7). The remainder of these are handled as follows

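$$\begin{split}
&\alpha^2 i \int\_{\mathcal{M}} \left( \frac{\partial \phi}{\partial \epsilon x\_i} a\_i a\_j^\* \frac{\partial p\_0}{\partial x\_j} - \frac{\partial \phi}{\partial \epsilon x\_i} a\_i^\* a\_j \frac{\partial p\_0}{\partial x\_j} \right) d^3 x \\
&= \alpha^2 i \int\_{\mathcal{M}} \left( -p\_0 \frac{\partial}{\partial x\_j} \left( \frac{\partial \phi}{\partial \epsilon x\_i} a\_i a\_j^\* \right) + p\_0 \frac{\partial}{\partial x\_j} \left( \frac{\partial \phi}{\partial \epsilon x\_i} a\_i^\* a\_j \right) \right) d^3 x \\
&= i \int\_{\mathcal{M}} \left( -\alpha^2 p\_0 \frac{\partial \phi}{\partial \epsilon x\_i} \frac{\partial}{\partial x\_j} \left( a\_i a\_j^\* \right) + \alpha^2 p\_0 \frac{\partial \phi}{\partial \epsilon x\_i} \frac{\partial}{\partial x\_j} \left( a\_i^\* a\_j \right) \right) d^3 x \\
&= -\int\_{\mathcal{M}} i\alpha^2 p\_0 \frac{\partial \phi}{\partial \epsilon \mathbf{x}} \cdot \Big( \mathbf{a} (\nabla \cdot \mathbf{a}^\*) - \mathbf{a}^\* (\nabla \cdot \mathbf{a}) + (\mathbf{a}^\* \cdot \nabla) \mathbf{a} - (\mathbf{a} \cdot \nabla) \mathbf{a}^\* \Big) \, d^3 x \\
&= -\int\_{\mathcal{M}} i\alpha^2 p\_0 \frac{\partial \phi}{\partial \epsilon \mathbf{x}} \cdot \frac{\partial}{\partial \mathbf{x}} \times \left( \mathbf{a} \times \mathbf{a}^\* \right) d^3 x \,,
\end{split}$$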

and

$$\begin{split} &\int\_{\mathcal{M}} \alpha^{2} \epsilon \left( \frac{\partial a\_{i}}{\partial \epsilon x\_{i}} a\_{j}^{\*} \frac{\partial p\_{0}}{\partial x\_{j}} + \frac{\partial a\_{i}^{\*}}{\partial \epsilon x\_{i}} a\_{j} \frac{\partial p\_{0}}{\partial x\_{j}} \right) d^{3} x \\ &= \int\_{\mathcal{M}} \alpha^{2} \epsilon \left( \frac{\partial a\_{i}}{\partial \epsilon x\_{i}} a\_{j}^{\*} \frac{\partial p\_{0}}{\partial x\_{j}} - a\_{i}^{\*} \frac{\partial}{\partial \epsilon x\_{i}} \left( a\_{j} \frac{\partial p\_{0}}{\partial x\_{j}} \right) \right) d^{3} x \\ &= - \int\_{\mathcal{M}} \alpha^{2} a\_{i}^{\*} a\_{j} \frac{\partial^{2} p\_{0}}{\partial x\_{i} \partial x\_{j}} d^{3} x = - \int\_{\mathcal{M}} \frac{\alpha^{2}}{2} \left( a\_{i} a\_{j}^{\*} + a\_{i}^{\*} a\_{j} \right) \frac{\partial^{2} p\_{0}}{\partial x\_{i} \partial x\_{j}} d^{3} x \,. \end{split}$$

Assembling this back into the action integral (A.7) finally yields the expression for *S* in (2.7),

$$\begin{split} S &= \int\_{t\_0}^{t\_1} \int\_{\mathcal{M}} D \bigg[ \frac{1}{2} |\mathbf{u}^L|^2 + \alpha^2 \widetilde{\omega}^2 |\mathbf{a}|^2 - \rho g z + \mathbf{u}^L \cdot \boldsymbol{\Omega} \times \mathbf{x} + 2i \alpha^2 \widetilde{\omega}\, \boldsymbol{\Omega} \cdot (\mathbf{a} \times \mathbf{a}^\*) \\ &\qquad - \alpha^2 i \left( b\, \mathbf{k} \cdot \mathbf{a}^\* - b^\* \mathbf{k} \cdot \mathbf{a} \right) - \alpha^2 a\_i^\* a\_j \frac{\partial^2 p\_0}{\partial x\_i \partial x\_j} \bigg] \\ &\qquad + (1 - D) p\_0 + \mathcal{O}(\alpha^2 \epsilon) \, d^3 x \, dt \,. \end{split} \tag{A.8}$$

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Toward a Stochastic Parameterization for Oceanic Deep Convection**

#### **Quentin Jamet, Etienne Mémin, Franck Dumas, Long Li, and Pierre Garreau**

**Abstract** Current climate models are known to systematically overestimate the rate of deep water formation at high latitudes in response to too deep and too frequent deep convection events. We propose in this study to investigate a misrepresentation of deep convection in Hydrostatic Primitive Equation (HPE) ocean and climate models due to the lack of constraints on vertical dynamics. We discuss the potential of the Location Uncertainty (LU) stochastic representation of geophysical flow dynamics to help in the process of re-introducing some degree of non-hydrostatic physics in HPE models through a pressure correction method. We then test our ideas with idealized Large Eddy Simulations (LES) of buoyancy driven free convection with the CROCO modeling platform. Preliminary results at LES resolution exhibit a solution obtained with our Quasi-nonhydrostatic (Q-NH) model that tends toward the reference non-hydrostatic (NH) model. As compared to a pure hydrostatic setting, our Q-NH solution exhibits vertical convective plumes with larger horizontal structure, a better spatial organization and a reduced intensity of their associated vertical velocities. The simulated Mixed Layer Depth (MLD) deepening rate is however too slow in our Q-NH experiment as compared to the reference NH, a behaviour opposite to that of the hydrostatic experiments, which produce too fast an MLD deepening rate. These preliminary results are encouraging, and support future efforts in the direction of enriching coarse resolution, hydrostatic ocean and climate models with a stochastic representation of non-hydrostatic physics.

Q. Jamet INRIA, ODYSSEY Group, Plouzané, France

e-mail: quentin.jamet@inria.fr

E. Mémin · L. Li INRIA, ODYSSEY Group, Rennes, France

F. Dumas SHOM, Brest, France

P. Garreau Laboratoire d'Océanographie Physique et Spatiale, IFREMER, Plouzané, France

# **1 Introduction**

Deep ocean convection is a crucial mechanism for large scale ocean circulation and climate. It controls the rate of deep ocean water mass formation, sequestering atmospheric properties such as heat and carbon in the abyssal ocean. In the North Atlantic basin, deep ocean convection in the Labrador Sea and the Nordic Seas is part of the large scale Atlantic Meridional Overturning Circulation (AMOC), an oceanic metric with many climate implications (Zhang et al. 2019). Coarse resolution (i.e. *Δx* ∼ O(100) km) climate models are known to overestimate the rate of deep water formation at high latitudes in response to too deep and too frequent convective events (Heuzé 2017; 2021), a bias that is expected to worsen with next generation climate models with ocean components at higher resolution (Masson-Delmotte et al. 2021). Among other possibilities (e.g. preconditioning, air-sea interactions), we explore in this paper the possible misrepresentation of deep ocean convection in current climate models in response to their hydrostatic formulation.

Ocean modules of current climate models solve the Hydrostatic Primitive Equations (HPE), a simplified version of the full Navier-Stokes equations (NS). Geophysical fluids have specific characteristics that allow some approximations from the general NS, leading to drastic simplifications in their numerical implementation which, in turn, allow us to model the global ocean at climate scales (i.e. for several decades/centuries) with the currently available computational resources. Among those approximations is the hydrostatic balance, which arises from the small thickness of the ocean (H ∼ O(1) km) as compared to the horizontal extent of its large scale dynamics (L ∼ O(1000) km for gyres and L ∼ O(10–100) km for ocean mesoscale eddies). The aspect ratio *δ* = H/L is thus orders of magnitude smaller than unity. Scaling the vertical velocity as *W* = *δU*, with *U* the typical horizontal velocity of the flow, leads to a small contribution of the vertical acceleration as compared to the horizontal components. For a regime satisfying such a scaling, only vertical pressure gradients are able to balance gravitational acceleration in the vertical component of the NS equations, and the system can be simplified as:

$$
\partial\_t u + \nabla \cdot (\boldsymbol{u} u) - fv = -\frac{1}{\rho\_0} \partial\_x p + \mathcal{F}\_u + \mathcal{D}\_u,\tag{1a}
$$

$$
\partial\_t v + \nabla \cdot (\boldsymbol{u} v) + f u = -\frac{1}{\rho\_0} \partial\_y p + \mathcal{F}\_v + \mathcal{D}\_v, \tag{1b}
$$

$$0 = -\frac{1}{\rho\_0} \partial\_z p - b,\tag{1c}$$

where ***u*** = (*u*, *v*, *w*) is the three-dimensional velocity field, *f* = 2*Ω* sin(*θ*) is the traditional Coriolis pseudo-force, *p* is pressure, *b* = *g*(*ρ* − *ρ*<sub>0</sub>)/*ρ*<sub>0</sub> is the *buoyancy* defined for Boussinesq fluids (i.e. when density *ρ* is replaced by its constant value *ρ*<sub>0</sub>, unless multiplied by gravity, in which case it is expressed as the density anomaly (*ρ* − *ρ*<sub>0</sub>)/*ρ*<sub>0</sub>), and F and D are forcing and dissipative processes, respectively. Equations (1a)–(1c) are the HPE momentum equations used in current climate models. From a numerical viewpoint, using HPE instead of the general NS or other Non-Hydrostatic (NH) sets of equations greatly simplifies the procedure, as only (1a) and (1b) have to be stepped forward in time for each discretized ocean layer, while (1c) is used diagnostically to obtain the pressure field through vertical integration of density variations subject to gravitational acceleration. In HPE models, convection is part of the parameterized (i.e. unresolved) three-dimensional turbulence and mixing processes which are encapsulated in D<sub>*u,v*</sub>. Usually, these operators are formulated with a down-gradient approach, where the vertical fluxes of a scalar *θ* are parameterized as $\overline{w'\theta'} = -K\_\theta\, \partial\_z \overline{\theta}$, with $\overline{\theta}$ the local, resolved field. Several models can be used to estimate the dissipation coefficient *K<sub>θ</sub>* (e.g. TKE (Gaspar et al. 1990), GLS (Umlauf and Burchard 2003), KPP (Large et al. 1994)), but in case of convection, this coefficient is usually set to an unrealistically large value (0.1 to 10 m<sup>2</sup> s<sup>−1</sup>) to quickly remove the static instabilities associated with convective processes and avoid model instabilities. More recently, Giordani et al. (2020) proposed an oceanic application of the eddy-diffusivity mass-flux formulation initially derived by the atmospheric community (e.g., Hourdin et al. 2006, Suselj et al. 2019), which allows a better representation of vertical advective fluxes associated with convection.

The approximations leading to HPE are likely to be satisfied in most of the ocean where vertical velocities are small and their spatial patterns are of small scales. However, for the case of deep ocean convection, where vertical velocities can reach *W* ∼ O(10 cm s<sup>−1</sup>) over horizontal scales of L ∼ O(1 km), such approximations become questionable. In case such approximations turn out to be violated, it becomes necessary to find ways of re-introducing some form of nonhydrostasy within HPE.
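The diagnostic role of (1c) can be made concrete with a short sketch. Rearranged, (1c) reads ∂<sub>*z*</sub>*p* = −*ρ*<sub>0</sub>*b*, so the pressure at depth follows from integrating buoyancy downward from the surface. The function below is a minimal illustration under stated assumptions: a single water column, trapezoidal quadrature, and hypothetical names and default values (`hydrostatic_pressure`, `rho0`, `p_surface`) that are not taken from any ocean model.

```python
def hydrostatic_pressure(z, b, rho0=1025.0, p_surface=0.0):
    """Diagnose pressure in one column from dp/dz = -rho0 * b, as in (1c).

    z : levels in metres, ascending, with z[-1] the surface
    b : buoyancy as defined below (1c), in m s^-2, at each level
    Returns the pressure at each level, integrating downward from the
    surface value with the trapezoidal rule.
    """
    p = [0.0] * len(z)
    p[-1] = p_surface
    for k in range(len(z) - 2, -1, -1):
        dz = z[k + 1] - z[k]
        # p(z_k) = p(z_{k+1}) + rho0 * integral of b between the two levels
        p[k] = p[k + 1] + rho0 * 0.5 * (b[k] + b[k + 1]) * dz
    return p

# Illustrative column: uniform buoyancy of 0.01 m s^-2 over 100 m.
levels = [-100.0, -50.0, 0.0]
buoyancy = [0.01, 0.01, 0.01]
print(hydrostatic_pressure(levels, buoyancy))
```

No prognostic equation for *w* and no elliptic solve is involved, which is the numerical simplification that the hydrostatic approximation buys.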

Klingbeil and Burchard (2013) have proposed a direct implementation of full non-hydrostatic effects into an HPE model through a pressure correction method. The full three-dimensional velocity field equations read

$$
\partial\_t u + \nabla \cdot (\boldsymbol{u} u) - fv + \tilde{f}w = -\frac{1}{\rho\_0} \partial\_x p + \mathcal{F}\_u + \mathcal{D}\_u,\tag{2a}
$$

$$
\partial\_t v + \nabla \cdot (\boldsymbol{u} v) + f u \qquad = -\frac{1}{\rho\_0} \partial\_y p + \mathcal{F}\_v + \mathcal{D}\_v,\tag{2b}
$$

$$
\partial\_t w + \underline{\nabla \cdot (\boldsymbol{u} w)} \qquad -\tilde{f}u = -\frac{1}{\rho\_0} \partial\_z p - b + \mathcal{F}\_w + \underline{\mathcal{D}\_w}, \tag{2c}
$$

where *f̃* = 2*Ω* cos(*θ*) is the non-traditional Coriolis pseudo-force (shown for consistency but not considered in the following), and non-hydrostatic contributions are shown in blue. To avoid the complexity of solving a three-dimensional Poisson equation to recover the non-hydrostatic pressure (as usually done in NH pressure correction methods, e.g. Marshall et al. 1997), Klingbeil and Burchard (2013) proposed to account for the non-hydrostatic pressure correction through a vertical integration of a so-called *non-hydrostatic buoyancy*, i.e. following the strategy of HPE models. This strategy offers a general implementation of NH physics in HPE, but still suffers from numerical instabilities in the case of strongly non-hydrostatic dynamics. For the case of deep ocean convection, it can be shown that further simplifications can be made by only accounting for the horizontal viscosity acting on the vertical velocities in the computation of the NH pressure correction (through vertical integration; Pierre Garreau, personal communication). As will be shown later through the analysis of different idealized Large Eddy Simulations (LES), HPE models tend to produce convective plumes near the grid size of the model, leading to unstructured (on the horizontal) convective cells. From one grid point to the next, vertical velocities can be of opposite sign, leading to intense horizontal gradients. Including a horizontal viscous operator on the HPE vertical velocities (we recall here that in HPE vertical velocities are diagnosed from the horizontal velocity field through continuity) leads to a broadening of the convective plumes and a more realistic horizontal organization. In other words, when convective plumes start to form, they '*entrain*' the neighboring points, thus communicating their vertical momentum horizontally. Such a process can be seen as a simplified version of the entrainment/detrainment mechanism discussed by Giordani et al. (2020) for the case of eddy-diffusivity mass-flux parameterization. In the present study, we consider the approach of Klingbeil and Burchard (2013) as a starting point and discuss a strategy to extend this idea in the context of a stochastic parameterization. The results presented here are all obtained at LES resolution, such that a clear connection with climate scale regimes is still lacking. However, these results provide a first step toward the development of robust stochastic parameterization for climate models, which will be the subject of dedicated studies.
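The selective effect of a horizontal viscous operator on grid-scale vertical velocities can be illustrated with a toy calculation. The sketch below is not the operator used in CROCO or in Klingbeil and Burchard (2013); it assumes a periodic one-dimensional row of grid points, an explicit Laplacian step, and hypothetical values for the viscosity `nu`, grid spacing `dx` and time step `dt`. It compares how one step damps a checkerboard pattern of alternating updrafts and downdrafts versus a broad plume spanning the whole row.

```python
import math

def horizontal_diffusion_step(w, nu, dx, dt):
    """One explicit step of dw/dt = nu * d2w/dx2 on a periodic 1D row."""
    n = len(w)
    r = nu * dt / dx**2  # must stay below 0.5 for stability
    return [w[i] + r * (w[(i + 1) % n] - 2.0 * w[i] + w[i - 1])
            for i in range(n)]

n = 32
checkerboard = [(-1.0) ** i for i in range(n)]                   # grid-scale plumes
broad = [math.cos(2.0 * math.pi * i / n) for i in range(n)]      # one wide plume

# Hypothetical, non-dimensional parameters chosen so that r = 0.1.
nu, dx, dt = 1.0, 1.0, 0.1

amp_checker = max(abs(x) for x in horizontal_diffusion_step(checkerboard, nu, dx, dt))
amp_broad = max(abs(x) for x in horizontal_diffusion_step(broad, nu, dx, dt))
print(amp_checker, amp_broad)
```

With these values the grid-scale pattern loses 40% of its amplitude in a single step while the broad plume is almost untouched, which is the sense in which horizontal viscosity on *w* favours wider, more organized plumes.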

The paper is organized as follows. In Sect. 2 we briefly recall the Location Uncertainty (LU; Mémin 2014, Bauer et al. 2020, Resseguier et al. 2017) framework used to represent the inertial and dissipative effects on vertical momentum (underlined terms in (2c)) as a result of a strong noise regime or for application to flow dynamics where the hydrostatic approximation becomes questionable. Section 3 is dedicated to the numerical implementation of the stochastic, nonhydrostatic pressure correction into the terrain-following Coastal and Regional Ocean Community (CROCO) model, along with the description of the simulations we have conducted. Preliminary results are described and discussed in Sect. 4. We summarize our paper and provide some perspectives for further work in Sect. 5.

# **2 Stochastic Formulation of Direct Non-hydrostatic Pressure Correction**

Following Mémin (2014), the stochastic version of the horizontal momentum equation (in vector notation) reads:

$$\mathbb{D}\_{t}\boldsymbol{u}\_{h} + f\boldsymbol{k} \times (\boldsymbol{u}\_{h}\mathrm{d}t + \sigma \mathrm{d}\boldsymbol{B}\_{t}^{H}) = -\frac{1}{\rho\_{0}}\nabla\_{H}(p\,\mathrm{d}t + \mathrm{d}p\_{t}^{\sigma}),\tag{3}$$

with ***u**<sub>h</sub>* = (*u*, *v*, 0) and 𝔻<sub>*t*</sub> the stochastic transport operator defined as:

$$\mathbb{D}\_{t}\boldsymbol{u}\_{h} = \mathrm{d}\_{t}\boldsymbol{u}\_{h} + (\boldsymbol{u}^{\star}\mathrm{d}t + \sigma\,\mathrm{d}\boldsymbol{B}\_{t}) \cdot \nabla\,\boldsymbol{u}\_{h} - \frac{1}{2}\nabla \cdot (\boldsymbol{a}\nabla\boldsymbol{u}\_{h})\mathrm{d}t,\tag{4}$$

with ***u***<sup>⋆</sup> the incompressible (i.e. **∇ ·** *σ*d***B**<sub>t</sub>* = 0) modified advection defined as:

$$
\boldsymbol{u}^\star = \boldsymbol{u} - \frac{1}{2}\nabla \cdot \boldsymbol{a} \tag{5}
$$

where ***u*** = (***u**<sub>h</sub>*, *w*) is the three-dimensional velocity field, *σ*d***B**<sub>t</sub>* represents the stochastic flow and ***a*** its associated variance tensor. The term ½ **∇ ·** ***a*** can be interpreted as an equivalent of the Stokes drift for an inhomogeneous random fast component *σ*d***B**<sub>t</sub>* (Bauer et al. 2020).
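To give a feel for the size and sign of the correction in (5), the following sketch evaluates it in a single water column. It assumes, purely for illustration, that the variance tensor reduces to one vertical component `a_zz(z)` (in m<sup>2</sup> s<sup>−1</sup>), so that the vertical part of ½ **∇ ·** ***a*** is ½ ∂<sub>*z*</sub>*a<sub>zz</sub>*; the function name and the linear profile are hypothetical.

```python
def modified_vertical_velocity(w, a_zz, dz):
    """w* = w - 0.5 * d(a_zz)/dz on a uniform grid.

    Centered differences are used in the interior and one-sided
    differences at the two end points.
    """
    n = len(w)
    w_star = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        da_dz = (a_zz[hi] - a_zz[lo]) / ((hi - lo) * dz)
        w_star.append(w[k] - 0.5 * da_dz)
    return w_star

# Illustrative profile: noise variance increasing linearly toward the surface.
dz = 10.0
z = [-40.0, -30.0, -20.0, -10.0, 0.0]
a_zz = [0.002 * (zk + 40.0) for zk in z]   # gradient of 0.002 m s^-1
w = [0.0] * len(z)

print(modified_vertical_velocity(w, a_zz, dz))
```

For this profile the resolved velocity is shifted by about −0.001 m s<sup>−1</sup> at every level, i.e. the drift points away from the region where the noise variance is largest.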

The introduction of the stochastic pressure d*p<sub>t</sub><sup>σ</sup>* in (3) requires some discussion. This stochastic pressure is associated with the small scale velocity component encoded through the noise. Following Resseguier et al. (2017), for a smooth-in-time momentum equation subject to a classical deterministic large scale momentum equation, its (three-dimensional) gradient can be expressed as:

$$-\frac{1}{\rho\_0} \nabla \mathrm{d} p\_t^{\sigma} = (\sigma \, \mathrm{d} \boldsymbol{B}\_t) \cdot \nabla \, \boldsymbol{u} + \boldsymbol{f} \times \sigma \, \mathrm{d} \boldsymbol{B}\_t \tag{6}$$

such that its interpretation (and scaling) should be related to the processes the stochastic formulation aims at representing. In the context of large scale modelling parameterization such as Tucciarone et al. (2023), the stochastic Primitive Equations they derived are meant to represent the effects of meso (and potentially submeso) scale eddies onto the large scale gyre circulation. The usual hydrostatic arguments are thus used, such that the vertical gradient of the stochastic pressure is identically zero (i.e. ∂<sub>*z*</sub>d*p<sub>t</sub><sup>σ</sup>* = 0) and its horizontal gradient is strictly balanced by the stochastic Coriolis pseudo-force. Here, we are interested in relaxing the hydrostatic approximation on the noise structure, but retaining it for the smooth-in-time, resolved flow, and derive the stochastic equation for the vertical momentum. As a first step in this direction, we will not include the contribution of the non-traditional Coriolis pseudo-force. After some manipulations, we obtain the following equation for the vertical momentum:

$$\left(-\frac{1}{2}\nabla \cdot \boldsymbol{a}\right) \cdot \nabla w\, \mathrm{d}t - \frac{1}{2}\nabla \cdot (\boldsymbol{a}\nabla w)\,\mathrm{d}t + \underline{\sigma \mathrm{d} \boldsymbol{B}\_t \cdot \nabla w} = \left(-\frac{1}{\rho\_0}\partial\_z p - b\right)\mathrm{d}t - \underline{\frac{1}{\rho\_0}\partial\_z \mathrm{d}p\_t^\sigma},\tag{7}$$

In (7), the resolved pressure gradient and buoyancy terms on the right-hand side are associated with hydrostatic physics (black in the original typesetting), while the remaining terms (blue in the original typesetting) are the stochastic contributions that emerge when non-hydrostatic reasoning is applied to the stochastic noise. The left-hand side terms correspond to the vertical acceleration, with a scaling such that the vertical acceleration due to the noise is strong compared to the large-scale vertical acceleration terms. Note that the two underlined terms in (7) are Brownian terms, emerging from the stochastic pressure formulation (6) on the right-hand side and from the transport of vertical velocity by the noise on the left-hand side. The two other stochastic terms on the LHS are associated with modified advection and dissipation (projected on the vertical velocity *w*), which emerge from the three-dimensional generalization of (4):

$$\mathbb{D}\_t \mathbf{u} = \mathrm{d}\_t \mathbf{u}\_h + \mathbf{u} \cdot \nabla \, \mathbf{u}\_h + (-\frac{1}{2} \nabla \cdot \mathbf{a} \, \mathrm{d}t + \sigma \, \mathrm{d} \mathbf{B}\_t) \cdot \nabla \, \mathbf{u} - \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla \mathbf{u}) \, \mathrm{d}t,\qquad(8)$$

where the material derivative of vertical velocities associated with the resolved flow (i.e. d<sub>*t*</sub>*w* + *u* · **∇***w*) has been neglected.

The noise being given (and calibrated from data or a known relation), from (7), it is thus possible to compute the various Brownian terms on the LHS, then to integrate vertically the results to obtain a 3D map of the modified pressure field as a result of the noise transport. Separating safely the martingale part (Brownian terms) from the smooth-in-time components ("d*t*" terms), we have

$$\mathrm{d}p\_t^{\sigma}(z) = \mathrm{d}p\_t^{\sigma}|\_{z=\eta} + \rho\_0 \int\_z^{\eta} (\sigma \, \mathrm{d}\mathbf{B}\_t \cdot \nabla w) \, dz',\tag{9}$$

for the martingale component, and

$$\begin{aligned} p(z)dt &= p|\_{z=\eta}dt + \rho\_0 g(\eta - z)dt \\ &+ \rho\_0 \int\_z^{\eta} \left( bdt - \underbrace{\left( (\frac{1}{2} \nabla \cdot \mathbf{a}) \cdot \nabla w \mathbf{d}t + \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla w) \mathbf{d}t \right)}\_{b\_{NH}} \right) dz', \qquad (10) \end{aligned}$$

for the smooth-in-time component. The last three terms on the RHS of (10) can be compared to the deterministic non-hydrostatic pressure correction of Klingbeil and Burchard (2013), although the material derivative of *w* associated with the resolved flow is not included in our stochastic formulation. Note that our formulation involves a 3D diffusion of the vertical velocity ensuing from the action of the noise, as well as the contribution of the modified Itô-Stokes term arising from the spatial inhomogeneity of the noise. Both (9) and (10) should be integrated with appropriate boundary conditions at *η* to incorporate fast and smooth-in-time surface pressure contributions (such as surface waves or atmospheric pressure loading, respectively), but such contributions can be neglected at first approximation. Results of (9) and (10) can then be used to feed back onto the horizontal momentum equation (3) solved by a hydrostatic model. Assuming a strict separation of the martingale part and the smooth-in-time component, only (10) is assumed to feed back onto the resolved flow. The martingale components are assumed to balance each other, thus not affecting the resolved flow. This assumption can be interpreted as a Large Eddy Simulation (LES)-like approach, as discussed by Bauer et al. (2020).
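The vertical integration in (10) can be illustrated for a single water column. The following is a minimal sketch rather than the CROCO implementation: the function name `pressure_column`, the uniform grid, the trapezoidal rule and the constant test values of `b` and `b_nh` are assumptions made for the example, and the surface pressure is set to zero as in the first approximation mentioned above.

```python
def pressure_column(z, b, b_nh, eta=0.0, rho0=1024.0, g=9.81, p_surf=0.0):
    """Smooth-in-time pressure of Eq. (10) on levels z (ascending, z[-1] == eta).

    b    : buoyancy term of Eq. (10) on each level
    b_nh : noise-induced correction (the underbraced term of Eq. (10))
    """
    n = len(z)
    integral = [0.0] * n  # integral of (b - b_nh) from z[k] up to eta
    for k in range(n - 2, -1, -1):
        f_lo = b[k] - b_nh[k]
        f_hi = b[k + 1] - b_nh[k + 1]
        # trapezoidal rule over the layer [z[k], z[k+1]]
        integral[k] = integral[k + 1] + 0.5 * (f_lo + f_hi) * (z[k + 1] - z[k])
    return [p_surf + rho0 * g * (eta - z[k]) + rho0 * integral[k] for k in range(n)]


# Illustrative column: 11 levels from -100 m to the surface, made-up correction.
z = [-100.0 + 10.0 * k for k in range(11)]
b = [0.0] * 11
b_nh = [1.0e-4] * 11
p = pressure_column(z, b, b_nh)
```

With a positive `b_nh`, the pressure at depth is slightly lower than the purely hydrostatic value, which is how the correction enters the horizontal pressure gradient.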

As a preliminary step, we will further simplify the structure of the variance tensor *a* in order to reduce the second and third terms on the RHS of (10) to a simple Laplacian viscosity—induced here by the noise contribution. This simplification is motivated in the following. In the LU framework, the strength of the noise is measured by its (one-point co-) variance, such that

$$a(\mathbf{x}, t) \stackrel{\Delta}{=} \check{q}(\mathbf{x}, \mathbf{x}, t), \tag{11}$$

with *q̆*(*x*, *x*, *t*) a matrix kernel defined as

$$\check{\boldsymbol{q}}(\mathbf{x},\mathbf{x},t) \stackrel{\Delta}{=} \int\_{\varOmega} \check{\boldsymbol{\sigma}}(\mathbf{x},\mathbf{x}',t) \check{\boldsymbol{\sigma}}(\mathbf{x},\mathbf{x}',t)^{T} d\mathbf{x}',\tag{12}$$

with *σ̆*(·, ·, *t*) a bounded matrix kernel defining the deterministic correlation integral operator *σ*<sub>*t*</sub> : *L*<sup>2</sup>(*Ω*) → *L*<sup>2</sup>(*Ω*),

$$
\sigma\_t f(\mathbf{x}) \stackrel{\Delta}{=} \int\_{\varOmega} \check{\sigma}(\mathbf{x}, \mathbf{y}, t) f(\mathbf{y}) d\mathbf{y}, \quad \forall f \in L^2(\varOmega). \tag{13}
$$

(See, e.g., Bauer et al. 2020, Mémin 2014, Resseguier et al. 2017, for further details.) Although the previous definition of the noise is general, it is possible, through Mercer's theorem, to express the noise variance as a spectral decomposition of the form:

$$a(\mathbf{x},t) = \sum\_{n \in \mathbb{N}} \lambda\_n(t) \boldsymbol{\phi}\_n(\mathbf{x},t) \boldsymbol{\phi}\_n^T(\mathbf{x},t), \tag{14}$$

where *φ*<sub>*n*</sub>(*x*, *t*) define an orthonormal eigenfunction basis of the correlation operator *σ*<sub>*t*</sub>, with *λ*<sub>*n*</sub>(*t*) their corresponding eigenvalues. For a stationary noise, this reduces to a classical POD (or EOF) decomposition, in which the eigenfunctions are the solution of the eigenvalue problem

$$\int\_{\varOmega} \mathbf{K}(\mathbf{x}, \mathbf{x}') \phi\_n(\mathbf{x}') d\mathbf{x}' = \lambda\_n \phi\_n(\mathbf{x}) \tag{15}$$

with *K* the two-point correlation tensor.

The next step is to assume isotropy and homogeneity of the noise structure, in which case the Fourier modes *φ*<sub>*n*</sub> = *e*<sup>2*πi***k**·**x**</sup> are a natural choice to satisfy (15), which implies (Berkooz et al. 1993):

$$K = \sum\_{n} \lambda\_n e^{2\pi i \mathbf{k} \cdot \mathbf{x}} e^{-2\pi i \mathbf{k} \cdot \mathbf{x}'}.\tag{16}$$
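The statement that Fourier modes solve (15) for a homogeneous kernel can be checked numerically. The sketch below assumes a one-dimensional periodic domain discretized with `N` points and a hypothetical Gaussian-shaped correlation function `corr`; neither is taken from the chapter, and the example only illustrates the mechanism.

```python
import cmath
import math

N = 16  # number of grid points on the periodic unit interval (assumed)


def corr(m):
    """Hypothetical homogeneous correlation: depends only on the periodic lag."""
    lag = min(m % N, N - (m % N))
    return math.exp(-(lag / 3.0) ** 2)


def apply_kernel(phi):
    """Discrete version of the integral in Eq. (15), with dx' = 1/N."""
    return [sum(corr(i - j) * phi[j] for j in range(N)) / N for i in range(N)]


def fourier_mode(k):
    """Fourier mode exp(2*pi*i*k*x) sampled at x_j = j/N."""
    return [cmath.exp(2j * math.pi * k * j / N) for j in range(N)]


def eigenvalue(k):
    """Discrete Fourier coefficient of the correlation function."""
    return sum(corr(m) * cmath.exp(-2j * math.pi * k * m / N) for m in range(N)) / N


def residual(k):
    """Largest pointwise mismatch between K*phi_k and lambda_k*phi_k."""
    phi = fourier_mode(k)
    lam = eigenvalue(k)
    return max(abs(a - lam * b) for a, b in zip(apply_kernel(phi), phi))
```

Calling `residual(k)` for any integer `k` returns a value at round-off level, because a kernel that depends only on the lag acts on a Fourier mode as a multiplication by the corresponding Fourier coefficient.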

Under the isotropy assumption, the variance of the divergence-free noise is constant and diagonal, such that the first term associated with *b*<sub>*NH*</sub> in (10) is identically zero, and the noise-induced dissipation reduces to:

$$\frac{1}{2}\nabla \cdot (a\nabla w) = \nu \Delta w,\tag{17}$$

with *ν* the isotropic, homogeneous noise-induced momentum dissipation. Through vertical integration of (17), we recover part of the non-hydrostatic pressure correction proposed by Klingbeil and Burchard (2013), which in the present case mimics the entrainment/detrainment of convective plumes leading to changes in their spatial organization. This modified HPE will be termed Quasi-Nonhydrostatic (Q-NH), by analogy with the Quasi-Hydrostatic (QH) model of Marshall et al. (1997), where non-traditional Coriolis terms are added to the HPE.
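The reduction to (17) can be made explicit under the additional assumption, introduced here only for illustration, that the constant diagonal variance tensor is written *a* = 2*ν***I**, with **I** the identity matrix:

$$\nabla \cdot a = 0, \qquad \frac{1}{2}\nabla \cdot (a \nabla w) = \frac{1}{2}\nabla \cdot (2\nu \nabla w) = \nu \, \nabla \cdot \nabla w = \nu \Delta w.$$

The first identity removes the Itô-Stokes contribution, and the second leaves a Laplacian viscosity acting on *w*.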

## **3 Numerical Implementation and Simulations**

Our objective is to implement this stochastic, non-hydrostatic pressure correction in the hydrostatic kernel of the Coastal and Regional Ocean Community model (CROCO; http://www.croco-ocean.org). CROCO is a new ocean model that builds upon the structure of the ROMS-AGRIF primitive equation solver (Shchepetkin and McWilliams 2005, Debreu et al. 2012). The non-hydrostatic, non-Boussinesq (NBQ; Auclair et al. 2018) capabilities of CROCO will also be used to construct a reference simulation for validation (see Table 1). In the following, we review some important steps for the implementation of the stochastic pressure correction within CROCO, and discuss their implications and how we treat the pressure correction within the hydrostatic CROCO kernel.

## *3.1 Stochastic, Non-hydrostatic Pressure Correction*

In its hydrostatic mode, CROCO computes *p*/*ρ*<sub>0</sub>, from which horizontal gradients directly feed the baroclinic horizontal momentum equation (i.e. (2a) and (2b)). Our strategy is to include the NH pressure correction via a modified density/buoyancy field, such that the pressure field becomes:

$$\frac{p(z)}{\rho\_0} = \frac{p|\_{z=\eta}}{\rho\_0} + \int\_z^{\eta} \frac{(\rho - \rho\_0) - b\_{NH}}{\rho\_0} \mathbf{g} \, dz',\tag{18}$$

with *b*<sub>*NH*</sub> collecting the different contributions of the vertical momentum equation (*w*\_*trends*) that contribute to the pressure correction, normalized by gravity:

$$b\_{NH} = \frac{1}{g} \sum w\_{trends},\tag{19}$$

where the horizontal dissipation of vertical velocity (e.g. Eq. (17)) is computed along sigma coordinates. Our strategy is similar to that of Delorme et al. (2021), who derived a Quasi-Hydrostatic version of CROCO by including the non-traditional Coriolis effects through a buoyancy correction.

Note that, by abuse of notation, *b*<sub>*NH*</sub> is written as a buoyancy variable, although it corresponds to corrections brought by the noise to the usual hydrostatic pressure. Such a correction should not be interpreted as an actual modification of the density/buoyancy field of the stratified ocean. Thus, the stochastic contribution is not included in the specific treatment of baroclinic-barotropic mode coupling of CROCO, which aims at accounting for the non-uniform density field in the propagation of gravity waves (Gill 1982), ultimately reducing the usual mode-coupling error associated with mode-splitting schemes (Shchepetkin and McWilliams 2005). In other words, we do not expect this '*non-hydrostatic buoyancy*' to affect gravity wave propagation.

Finally, CROCO uses a third-order predictor-corrector (LF-AM3) time-stepping scheme for tracers and baroclinic momentum. This scheme consists of a Leapfrog (LF) predictor with 3rd-order Adams-Moulton (AM) interpolation. It also uses split-explicit techniques to robustly couple the slow, baroclinic mode and the fast, barotropic mode associated with the time-evolving non-linear free surface. A complete description of the several stages of CROCO time-stepping can be found in Section 5 of Shchepetkin and McWilliams (2005). This predictor-corrector, split-explicit scheme implies that some tendency terms of the baroclinic mode are computed twice to step the baroclinic momentum and tracer equations forward from time step *t* to time step *t* + *Δt*. The first computation is performed at the *prediction* stage, and the second at the *correction* stage. These tendencies include pressure gradients. To avoid double counting the stochastic pressure correction, and for stability reasons, the modified non-hydrostatic buoyancy is computed only at the *correction* stage.
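The structure of such a predictor-corrector step, with an extra tendency applied only at the correction stage, can be sketched on a scalar toy problem. This is an illustration under stated assumptions and not CROCO code: it uses a commonly quoted form of LF-AM3 (leapfrog predictor followed by Adams-Moulton interpolation with weights 5/12, 8/12 and −1/12), and the names `lf_am3_step`, `integrate` and `rhs_extra` are hypothetical.

```python
import math


def lf_am3_step(q_prev, q_now, dt, rhs, rhs_extra=None):
    """One LF-AM3 step for dq/dt = rhs(q), plus rhs_extra(q) at the corrector only."""
    # Predictor: leapfrog step from q_prev over 2*dt, without the extra tendency.
    q_star = q_prev + 2.0 * dt * rhs(q_now)
    # Third-order Adams-Moulton interpolation to the half step n + 1/2.
    q_half = (5.0 * q_star + 8.0 * q_now - q_prev) / 12.0
    # Corrector: the tendency is re-evaluated; the extra term enters only here.
    tendency = rhs(q_half)
    if rhs_extra is not None:
        tendency += rhs_extra(q_half)
    return q_now + dt * tendency


def integrate(q0, dt, n_steps, rhs, rhs_extra=None, q_minus1=None):
    """Advance n_steps; q_minus1 is the state one step before q0 (defaults to q0)."""
    q_prev = q0 if q_minus1 is None else q_minus1
    q_now = q0
    for _ in range(n_steps):
        q_prev, q_now = q_now, lf_am3_step(q_prev, q_now, dt, rhs, rhs_extra)
    return q_now


# Usage: decay problem dq/dt = -q, started from exact values at t = -dt and t = 0.
dt = 0.01
q_end = integrate(1.0, dt, 100, lambda q: -q, q_minus1=math.exp(dt))
```

In this toy setting the extra tendency is counted once per time step, which mirrors the motivation given above for evaluating the non-hydrostatic buoyancy at the correction stage only.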

## *3.2 Numerical Experiments*

The numerical experiments we used to test our stochastic non-hydrostatic pressure correction are oceanic deep convection events. The configuration is inspired by free convection studies (e.g. Souza et al. 2020), where a horizontally uniform surface cooling is applied to a constantly stratified, horizontally uniform ocean. Although our interest is in deep ocean convection rather than mixed layer free convection, we have adopted a horizontally uniform setting (as usually done in free convection) instead of a horizontally structured system as proposed earlier by, e.g., Marshall and Schott (1999), where surface cooling is confined within a specified region (usually a disc). Such a structured setting allows the analysis of the interaction of convective plumes with a non-convective environment. However, such configurations are usually run at coarser resolution and oriented toward process studies of the geostrophic organization of convective plumes. Here, our focus is on parameterization, and we wish to start with simplified settings in order to capture the essence of deep convection dynamics; interactions with a prescribed background, non-convective environment are left for further work.

From this horizontally uniform and vertically constant stratified initial condition, the model is stepped forward in time on a grid of 100×100×100 points with an isotropic resolution of 10 m, and exposed to a constant (in time and space) surface heat flux of *Q*<sub>*T*</sub> = −500 W m<sup>−2</sup> (Fig. 1). This leads to a cooling of the upper ocean layers, which ultimately become unstable through static instabilities as a result of a negative buoyancy frequency (*N*<sup>2</sup> = −(*g*/*ρ*) ∂<sub>*z*</sub>*b*) < 0, thus undergoing convection. The current settings are run with no Coriolis forcing, i.e. *f* = 0 s<sup>−1</sup>; inclusion of Coriolis effects will be the subject of further work. The model is initialized with stochastic perturbations of the upper-ocean temperature, decaying with depth (following Souza et al. 2020), to trigger the formation of convective plumes:

$$T(\mathbf{x}, \mathbf{y}, z)|\_{t=0} = T(z) + \sum\_{(m,n)=0}^{10} \left( e^{2\pi(\mathbf{k}\cdot\mathbf{x} + \phi\_{n,m})} \right) \mathcal{N}(0, 1) \* \sqrt{\sigma^2} \* e^{40z/Nz} \tag{20}$$

with *T*(*z*) = *T*|<sub>*z*=0</sub> − *αz*, *T*|<sub>*z*=0</sub> = 3 K, and *α* a constant defined as *α* = 1.9 × 10<sup>−6</sup> / (*g* (*α*<sub>*T*</sub>/*ρ*<sub>0</sub>)), where *α*<sub>*T*</sub> = 0.2048 K<sup>−1</sup> is the thermal coefficient, *g* = 9.81 m s<sup>−2</sup> is gravity and *ρ*<sub>0</sub> = 1024 kg m<sup>−3</sup> is the reference density. The second term on the RHS of (20) is the stochastic perturbation, defined as the sum of plane waves with random phase *φ*<sub>*m,n*</sub>

**Table 1** Summary of the experiments and their numerical details. NBQ stands for the non-hydrostatic, non-Boussinesq CROCO kernel of Auclair et al. (2018); NH, Hydro and Q-NH stand for Non-Hydrostatic, Hydrostatic and Quasi-Nonhydrostatic; WENO5 and C4 for the 5th-order and the 4th-order centred advection schemes; KPP for the K-Profile Parameterization of Large et al. (1994)


and of amplitude 𝒩 √(*σ*<sup>2</sup>), where 𝒩 is a Gaussian white noise distribution and *σ*<sup>2</sup> = 10<sup>−8</sup> K<sup>2</sup> represents the variance of the stochastic perturbations. The random phases are drawn from a uniform distribution over the range [0, 1].

This configuration has been integrated forward in time to produce several numerical experiments in order to assess the performance of our implementation. These include a pure Non-Hydrostatic (NH) experiment, which makes use of the non-hydrostatic, non-Boussinesq capabilities of CROCO (NBQ, Auclair et al. 2018), and a pure hydrostatic (Hydro) reference experiment. We then compare the solution produced by our Quasi-Nonhydrostatic (Q-NH) experiment, which includes the stochastic pressure correction, with the solutions produced by Hydro and NH.

For both Q-NH and Hydro, the KPP (Large et al. 1994) closure scheme is used to represent vertical sub-grid scale mixing. As stated in the introduction, this scheme mimics vertical fluxes through a dissipative down-gradient operator. In the case of static instability associated with convective events, the dissipation coefficient is set to *K*<sub>*θ*</sub> = 0.1 m<sup>2</sup> s<sup>−1</sup> in CROCO. Sensitivity tests (not shown) using the TKE (Gaspar et al. 1990) closure scheme instead revealed that the choice of the closure scheme has little effect on the solution produced by our Hydro experiment.

Horizontal dissipation of vertical velocity in Q-NH is set to *ν*<sub>*w*</sub> = 1 m<sup>2</sup> s<sup>−1</sup>, which corresponds to high values of dissipation estimated through a Smagorinsky-like approach, *ν*<sub>*Smago*</sub> = *α*<sup>2</sup> *Δ*<sub>*xy*</sub><sup>2</sup> √((∂<sub>*x*</sub>*w*)<sup>2</sup> + (∂<sub>*y*</sub>*w*)<sup>2</sup>), with *Δ*<sub>*xy*</sub> = 10 m the horizontal resolution of our configuration, and *α* = 0.2. Table 1 summarizes the different experiments, along with some numerical details.
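To give a sense of scale for this choice, the Smagorinsky-like estimate can be evaluated directly. The sketch below assumes the square-root form *ν* = *α*<sup>2</sup> *Δ*<sub>*xy*</sub><sup>2</sup> √((∂<sub>*x*</sub>*w*)<sup>2</sup> + (∂<sub>*y*</sub>*w*)<sup>2</sup>), which is the dimensionally consistent reading of the formula; the function names are hypothetical.

```python
import math


def nu_smagorinsky(dwdx, dwdy, delta_xy=10.0, alpha=0.2):
    """Smagorinsky-like horizontal viscosity (m^2 s^-1) acting on vertical velocity.

    dwdx, dwdy : horizontal gradients of w (s^-1)
    delta_xy   : horizontal resolution (m); alpha : non-dimensional constant
    """
    return (alpha * delta_xy) ** 2 * math.hypot(dwdx, dwdy)


def gradient_for_viscosity(nu, delta_xy=10.0, alpha=0.2):
    """Magnitude of the horizontal gradient of w (s^-1) that yields viscosity nu."""
    return nu / (alpha * delta_xy) ** 2


# Gradient magnitude needed to reach the value nu_w = 1 m^2 s^-1 used in Q-NH.
grad_needed = gradient_for_viscosity(1.0)
```

With *α* = 0.2 and *Δ*<sub>*xy*</sub> = 10 m, the prefactor is 4 m<sup>2</sup>, so *ν*<sub>*w*</sub> = 1 m<sup>2</sup> s<sup>−1</sup> corresponds to a horizontal gradient of *w* of about 0.25 s<sup>−1</sup>, i.e. a change of 2.5 m s<sup>−1</sup> across one grid cell.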

Finally, note that all three experiments are conducted at the same isotropic resolution of 10 m. Evaluating the performance of our Q-NH model for climate-scale regimes (i.e. with horizontal resolution much coarser than vertical resolution) will be the subject of further work.

## **4 Results**

Figure 2 shows snapshots of the vertical velocities as simulated by NH, Hydro and Q-NH after 3 days of simulation. The NH experiment clearly produces weaker and larger-scale structures than the two other experiments. It is noticeable, however, that the amplitude of *w* in Q-NH is reduced compared to Hydro, and that the convective plumes have larger-scale structures. In the Hydro experiment, plumes are localized near the grid scale and exhibit vertical velocities almost an order of magnitude larger than in the NH reference. The reduced vertical velocities in Q-NH and the broadening of the associated spatial scale of the plumes can be interpreted as a result of the entrainment/detrainment mechanism, which is here simply represented as a purely horizontal viscous stress on vertical velocities.

**Fig. 2** Horizontal (top) and vertical (bottom) sections of vertical velocities for the Non-Hydrostatic (**NH**, left), the Hydrostatic (**Hydro**, center) and the Quasi-Nonhydrostatic (**Q-NH**, right) run. Snapshots are shown after 3 days of simulation with all other components (forcing, dissipation, stratification, resolution) held constant

Aside from vertical velocities, it is also instructive to analyse the consequences for the temperature profile, an indication of the capability of convection to produce deep water masses. Figure 3 shows the horizontally averaged vertical temperature profile, along with the deepening rate of the Mixed Layer Depth (MLD), for the different experiments. A first remarkable result is the similarity between the temperature profiles produced by all the experiments within the MLD (i.e. *z <* −300 m). Additional tests (not shown) indeed reveal a very weak sensitivity to the numerical implementation (i.e. non-hydrostatic, or hydrostatic with different vertical mixing schemes), as well as to the level of dissipation. The most noticeable differences between the experiments appear at the base of the mixed layer. In particular, the consequence of the overly strong vertical velocities in Hydro is to produce water masses that are too deep, with too strong a penetrative convection (i.e. the envelope at the base of the mixed layer where water masses are warmer than their initial state). Although vertical velocities in Q-NH remain significantly larger than those produced by NH, the effect of horizontal viscous forces on the vertical velocities is to significantly damp the penetration of convective plumes below the mixed layer, inducing a strong reduction of water mass formation. The reduction of penetrative convection induces significant biases in the deepening rate of the MLD (Fig. 3, right panel).

**Fig. 3** Horizontally averaged temperature profiles (left) and deepening rate of the mixed layer depth (MLD, right) for the different runs at the end of the 3-day long simulations. MLD deepening rates are compared to the analytical estimates of Marshall and Schott (1999) and Souza et al. (2020)

In Q-NH, the MLD sits between the theoretical prediction of Marshall and Schott (1999) and that of Souza et al. (2020); the former does not consider penetrative convection in its scaling while the latter does. That the MLD is too shallow in Q-NH, compared to NH, is likely a consequence of too strong a dissipation imposed on the system. We note, however, that we have only considered the dissipative, rectification contribution (i.e. smooth-in-time) of the stochastic transport, as a result of the strict separation assumption between martingale and smooth-in-time components. Further work is required to evaluate how the Brownian part of the stochastic transport impacts the deepening of the MLD.
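For orientation, the order of magnitude of the non-penetrative estimate can be reproduced with a few lines of arithmetic. The sketch below uses the classical scaling *h* = √(2*B*<sub>0</sub>*t*)/*N*, which is the form usually associated with Marshall and Schott (1999). Several inputs are assumptions rather than values stated in the chapter: the heat capacity `CP`, the reading of 1.9 × 10<sup>−6</sup> s<sup>−2</sup> as the initial *N*<sup>2</sup>, and the interpretation of the thermal coefficient in kg m<sup>−3</sup> K<sup>−1</sup>.

```python
import math

G = 9.81          # gravity, m s^-2 (from the text)
RHO0 = 1024.0     # reference density, kg m^-3 (from the text)
ALPHA_T = 0.2048  # thermal coefficient (from the text), read here as kg m^-3 K^-1
N2 = 1.9e-6       # ASSUMED initial buoyancy frequency squared, s^-2
Q_T = 500.0       # magnitude of the surface cooling, W m^-2 (from the text)
CP = 3985.0       # ASSUMED seawater heat capacity, J kg^-1 K^-1 (not in the chapter)


def buoyancy_flux(q=Q_T, cp=CP):
    """Surface buoyancy flux B0 (m^2 s^-3) for a heat loss of magnitude q."""
    return G * (ALPHA_T / RHO0) * q / (RHO0 * cp)


def mld_nonpenetrative(t_seconds, b0, n2=N2):
    """Mixed layer depth (m) from h = sqrt(2 * B0 * t) / N, no penetrative convection."""
    return math.sqrt(2.0 * b0 * t_seconds / n2)


# Depth reached after the 3 days of simulation considered in the chapter.
h_3days = mld_nonpenetrative(3 * 86400.0, buoyancy_flux())
```

Under these assumptions the estimate is a few hundred metres after 3 days, the same order as the mixed layer depth discussed above; it is not meant to reproduce the curves of Fig. 3.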

## **5 Conclusion and Perspectives**

In this study, we detailed the first steps toward a full stochastic parameterization of deep ocean convection, along with their implementation in the general circulation model CROCO. Our preliminary results, which rely on an approximation of the horizontal noise structure as homogeneous and isotropic, led us to recover part of the derivation provided by Klingbeil and Burchard (2013) in a deterministic case. Our results are encouraging, and we are now in a position to extend the current analysis to a fully consistent stochastic framework.

The first step in this direction will be to implement the stochastic pressure noise contribution (i.e. Eq. (9)), which comes paired with the idealized Laplacian horizontal viscosity acting on vertical velocities in the context of a hydrostatic simulation of deep ocean convection. Following previous work of Pierre Dérian and Etienne Mémin (*'Hyper-viscosity' noise for transport under location uncertainty*), we will consider a simplified expression of the stochastic transport of vertical velocity to construct our stochastic pressure noise. This approach is meant to obtain the stochastic transport associated with Laplacian or hyper-viscosity dissipation as usually implemented in OGCMs. With these considerations, it is possible to express the stochastic transport of (9) as:

$$
\sigma \mathrm{d} \mathbf{B}\_t \cdot \nabla \ w = \sum\_k \gamma\_k \lambda\_k e\_k(\mathbf{x}) \tag{21}
$$

with *e*<sub>*k*</sub> a basis, defined here as Daubechies wavelets, *γ*<sub>*k*</sub> independent normally distributed variables, and *λ*<sub>*k*</sub> the wavelet coefficients defined as:

$$
\lambda\_k = \left\langle \sqrt{2\epsilon dt} |\boldsymbol{\nu}^{1/2} \nabla \boldsymbol{w}| ; \boldsymbol{e}\_k \right\rangle\_{\mathcal{L}^2} \,, \tag{22}
$$

with *ε* a scaling factor controlling the ratio of the variance created by the noise to the energy dissipation. Accounting for the stochastic pressure would also require consideration of the Brownian components of the stochastic transport in the horizontal momentum advection. These steps are part of further work to achieve a full, consistent implementation of LU transport in CROCO.
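The construction in (21) and (22) can be illustrated in one dimension. The sketch below is only an illustration under several assumptions: it uses the Haar wavelet (the simplest member of the Daubechies family) instead of a higher-order Daubechies basis, a one-dimensional field whose length is a power of two, and hypothetical function names. Because the basis is orthonormal, setting all *γ*<sub>*k*</sub> to one returns the amplitude field itself, which gives a simple consistency check.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def haar_forward(values):
    """Orthonormal Haar wavelet coefficients of a list whose length is a power of 2."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        det = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:n] = avg + det  # coarse part first, details after
        n = half
    return out


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        rec = []
        for a, d in zip(out[:n], out[n:2 * n]):
            rec.append((a + d) / SQRT2)
            rec.append((a - d) / SQRT2)
        out[:2 * n] = rec
        n *= 2
    return out


def noise_transport(grad_w_abs, nu, eps, dt, gammas=None, seed=0):
    """Sample of Eq. (21): sum_k gamma_k * lambda_k * e_k, with lambda_k from Eq. (22)."""
    amplitude = [math.sqrt(2.0 * eps * dt) * math.sqrt(nu) * g for g in grad_w_abs]
    lam = haar_forward(amplitude)  # projections onto the wavelet basis
    if gammas is None:
        rng = random.Random(seed)
        gammas = [rng.gauss(0.0, 1.0) for _ in lam]  # independent N(0, 1) draws
    return haar_inverse([g * l for g, l in zip(gammas, lam)])


# Made-up gradient magnitudes on 8 points, one random realization.
grad = [0.1, 0.2, 0.05, 0.0, 0.3, 0.1, 0.2, 0.4]
sample = noise_transport(grad, nu=1.0, eps=0.5, dt=1.0)
```

A production implementation would more likely rely on a dedicated wavelet library and on the three-dimensional model grid; the point here is only the sequence project, randomize, reconstruct.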

**Acknowledgments** This work is supported by the ERC project 856408-STUOD.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Comparison of Stochastic Parametrization Schemes Using Data Assimilation on Triad Models**

**Alexander Lobbe, Dan Crisan, Darryl Holm, Etienne Mémin, Oana Lang, and Bertrand Chapron**

**Abstract** In recent years, stochastic parametrizations have been ubiquitous in modelling uncertainty in fluid dynamics models. One source of model uncertainty comes from the coarse graining of the fine-scale data and is in common usage in computational simulations at coarser scales. In this paper, we look at two such stochastic parametrizations: the Stochastic Advection by Lie Transport (SALT) parametrization introduced by Holm (Proc A 471(2176):20140963, 19, 2015) and the Location Uncertainty (LU) parametrization introduced by Mémin (Geophys Astrophys Fluid Dyn 108(2):119–146, 2014). Whilst both parametrizations are available for full-scale models, we study their reduced order versions obtained by projecting them on a complex vector Fourier mode triad of eigenfunctions of the curl. Remarkably, these two parametrizations lead to the same reduced order model, which we term the *helicity-preserving stochastic triad* (HST). This reduced order model is then compared with an alternative model which preserves the energy of the system, and which is termed the *energy preserving stochastic triad* (EST). These low-dimensional models are ideal benchmark models for testing new Data Assimilation algorithms: they are easy to implement, exhibit diverse behaviours depending on the choice of the coefficients and come with natural physical properties such as the conservation of energy and helicity.

A. Lobbe (✉) · D. Crisan · D. Holm · O. Lang

Imperial College London, Mathematics, London, UK e-mail: alex.lobbe@imperial.ac.uk

E. Mémin

Campus Universitaire de Beaulieu, Inria - Institut National de Recherche en Sciences et Technologies du Numérique, Rennes, France

B. Chapron

Ifremer – Institut Français de Recherche pour l'Exploitation de la Mer, Plouzané, France

**Supplementary Information** The online version contains supplementary material available at https://github.com/alobbe/stochastic-triads.

B. Chapron et al. (eds.), *Stochastic Transport in Upper Ocean Dynamics II*, Mathematics of Planet Earth 11, https://doi.org/10.1007/978-3-031-40094-0\_7

## **1 Introduction**

The introduction of stochasticity in fluid dynamics has recently been the subject of intense research effort. This approach involves using random processes to model, for example, unresolved scales, or to take into account neglected physical effects. A stochastic formulation for the fluid flow introduces a probabilistic basis for modelling unresolved scales. This is different from the deterministic causal modelling which is difficult to achieve in practice due, for instance, to unknown initial conditions. In addition, some phenomena such as energy backscattering are directly accessible as stochastic processes. Another usage of stochastic modelling is to generate ensembles of realizations of the model. This facilitates the analysis of model uncertainty quantification for different low-resolution computational simulations and their usage to approximate the true state of the fluid, instead of using a single high-resolution numerical simulation.

Some stochastic schemes have been proposed in the literature by considering a variety of ad-hoc perturbations. However, a principled approach is desirable. The formulation of stochastic dynamical systems based on physical principles has recently been proposed in various settings. For a review and classification of approaches to stochastic parameterisation based on physical principles, see [4]. The present work treats two additional new approaches. The first one, called stochastic advection by Lie transport (SALT) relies on the variational principle for fluid dynamics [16]. The second one, called modelling under location uncertainty (LU) is derived from Newton's principle [23]. Both frameworks introduce stochasticity into the Lagrangian specification of the flow field, rather than directly into the Eulerian frame.

In the deterministic case, it is known that three-dimensional fluid flows may trigger a cascade of dynamics across multiple length and time scales. This multiscale behavior poses considerable challenges for computational simulation using the standard Navier-Stokes equations (see e.g. [7], [24], [5]). When modelling turbulence numerically, specialised discretisation methods are needed to decompose the underlying partial differential equations into a very large number of ordinary differential equations. Alternative approaches have been introduced where the Navier-Stokes dynamics in Fourier space is mimicked using a finite number of variables, say *u*<sub>1</sub>, *u*<sub>2</sub>, ..., *u*<sub>*N*</sub>. The Fourier space is divided into *N* shells, and each shell *s*<sub>*i*</sub> comprises the set of wave vectors *s* with magnitude |*s*| ∈ (*s*<sub>0</sub>2<sup>*i*</sup>, *s*<sub>0</sub>2<sup>*i*+1</sup>). Each *u*<sub>*i*</sub> satisfies an ODE and represents the magnitude of the velocity field on a length scale of *s*<sub>*i*</sub><sup>−1</sup> ([15], [7]). The quadratic nonlinearity in the Navier-Stokes equations produces triads of interacting vector Fourier modes within each shell. Shell models involving multiple triads have had considerable success in modelling energy and helicity cascades, as well as modelling intermittency in chaotic dynamical systems [6, 10, 9]. Simplified shell models with only a few triads date back to the 1970s and have provided major insight into fluid modal interaction. Even the dynamical system representation of Euler's fluid equations on a single triad has been quite insightful, see e.g. [28, 29].
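The shell decomposition described above amounts to binning wave vectors by the base-2 logarithm of their magnitude. A minimal sketch follows; the function name, the default *s*<sub>0</sub> = 1 and the half-open convention at shell boundaries are assumptions made for the example.

```python
import math


def shell_index(s, s0=1.0):
    """Index i such that s0 * 2**i <= |s| < s0 * 2**(i + 1).

    s is a wave vector given as a tuple of components; |s| must be non-zero.
    """
    magnitude = math.sqrt(sum(c * c for c in s))
    return math.floor(math.log2(magnitude / s0))


def group_by_shell(wave_vectors, s0=1.0):
    """Map each shell index to the list of wave vectors it contains."""
    shells = {}
    for s in wave_vectors:
        shells.setdefault(shell_index(s, s0), []).append(s)
    return shells


# A few integer wave vectors and their shells.
shells = group_by_shell([(1, 1, 1), (3, 0, 0), (0, 5, 0), (2, 2, 1)])
```

Each shell spans one octave in wavenumber, which is why a modest number of shells covers a very wide range of scales.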

More recently, the problem of correctly parameterizing effects of small-scale physical processes together with the need for probabilistic ensemble forecasting and uncertainty quantification has led to modern stochastic approaches in the study of turbulence using reduced order *shell models*. In this work we will explore reduced order models for SALT and LU models obtained by projecting onto helical basis functions [6, 28, 29]. These helical basis functions, defined as eigenfunctions of the curl operator, enable one to construct reduced order stochastic models of fluid flow with a simplified nonlinear interaction. As we will see, under projection onto the basis of helical triad modes, both LU and SALT result in the *same* reduced order model and this projected model conserves helicity, but it does not conserve energy. Because of this coincidence in projecting the SALT and LU models onto the helical basis, a second reduced order scheme with a strong energy conservation property inspired by [17] and known as the *energy preserving stochastic triad* (EST) model will be proposed for comparison.

While the EST model is not of transport type, it will provide comparison between two different classes of stochastic dynamical systems. The two classes treated here are (1) the *helicity preserving stochastic triad* (HST) (comprising both LU and SALT on the helical basis) and (2) the *energy preserving stochastic triad* (EST) of [17] projected onto the helical basis. The solution behaviour of the HST model will be compared to that for the EST model for several data assimilation objectives formulated on the helical triad modes. For classical deterministic models one obtains a system of ordinary differential equations. However, for stochastic dynamics a set of stochastic differential equations (SDEs) is obtained [8, 14, 25].

The goal of the data assimilation procedure in this context is twofold: firstly, it is used to calibrate the uncertainty of the model (the amplitude of the noise). Secondly, once the calibration is complete, the particle filtering methodology can be used to reduce the uncertainty. We want the distribution of the fluctuations to be properly approximated. In the absence of stochasticity, all particles would go in the same direction and the initial spread would rapidly disappear because of the hyperbolic character of the model. In the absence of a reasonable spread, the particle filter methodology will eventually collapse. For this reason, we need to introduce stochasticity into the system that correctly characterises the fluctuation dynamics. In particular, we want to find the type of noise amplitude and the stochastic parameters for which the distribution of the output samples is reasonably *uniform*.

**Structure of the Paper** In Sect. 2 we introduce the triad models for incompressible flows modelled by the Euler equation in its deterministic and stochastic form. To this end, we introduce the stochastic parametrisation paradigms. Building upon these models for the 3D Euler equation, we then present reduced order triad models derived from the original equations. The derivation follows the classical approach for triad models from the literature, that has been successfully employed in the deterministic case. Our full derivations, complete also for the stochastically parametrised models, can be found in Appendix 2. The Data Assimilation experiments are carried out in Sect. 3. We first briefly explain the standard particle filter methodology, and then in Sect. 3.1 we present the findings of our numerical studies. In particular, Sect. 3.1 presents the results of the main numerical studies in this work. These are:


In Sect. 4 we describe our conclusions on this topic. We conclude the paper with a number of appendices: in Appendix 1 one can find a list of notations and standard identities, in Appendix 2 we present a detailed derivation of shell models (deterministic and stochastic), in Appendix 3 we introduce some supplementary numerics related to the noise amplitude calibration.

**Code Availability** The code corresponding to the numerical experiments in this paper is archived in [22]. The GitHub repository is located at https://github.com/alobbe/stochastic-triads.

## **2 Reduced Order Models for Incompressible Fluids**

## *2.1 Reduced Order Models for the 3D Euler Equation*

The 3D Euler equations model incompressible inviscid fluid dynamics. These equations may be written by using the Leray operator $\mathcal{P}$ to project onto the divergence-free part of its operand as

$$\begin{aligned} \frac{\partial \mathbf{v}}{\partial t} &= \mathcal{P} \left( \mathbf{v} \times \mathbf{curl} \mathbf{v} \right) \\ &= \mathcal{P} \left( \frac{\delta E}{\delta \mathbf{v}} \times \frac{\delta C}{\delta \mathbf{v}} \right) \end{aligned} \tag{1}$$
  $\text{with conserved Energy } E(\mathbf{v}) = \int\_{\mathbb{R}^3} \frac{1}{2} \mathcal{P} \mathbf{v} \cdot \mathbf{v} d^3 \mathbf{x}$   $\text{and conserved helicity } C(\mathbf{v}) = \frac{1}{2} \int\_{\mathbb{R}^3} \mathbf{v} \cdot \mathbf{curl} \, \mathbf{v} \, d^3 \mathbf{x}$ 

where $\delta/\delta\mathbf{v}$ represents the variational derivative with respect to the fluid velocity $\mathbf{v}$.

Following [9, 10] we use a Galerkin expansion in orthogonal *vector* modes that are eigenfunctions of the curl operator. Assume the fluid is contained in a periodic box $\mathcal{D} \subset \mathbb{R}^3$ of side length $L > 0$. Then the velocity $\mathbf{v}$ and vorticity $\boldsymbol{\omega} := \nabla \times \mathbf{v}$ may be expanded in circularly polarised or helical modes $\mathbf{h}\_{\pm}(\mathbf{k}) \exp(i\mathbf{k}\cdot\mathbf{x})$, with the wave vectors $\mathbf{k} \in \mathcal{K} := (2\pi/L)\mathbb{Z}^3$. The modes shall be orthogonal, i.e.

$$\int\_{\mathcal{D}} \mathbf{h}\_{s\_p}(\mathbf{p}) \exp(i \mathbf{p} \cdot \mathbf{x}) \cdot \left[\mathbf{h}\_{s\_q}(\mathbf{q}) \exp(i \mathbf{q} \cdot \mathbf{x})\right]^{\*} \,\mathrm{d}\mathbf{x} = C \delta\_{\mathbf{p}, \mathbf{q}} \delta\_{s\_p, s\_q} \; ; \; C > 0 \text{ const.} \tag{2}$$

The complex vector amplitudes $\mathbf{h}\_{\pm}(\mathbf{k})$ should satisfy $\mathbf{k}\cdot\mathbf{h}\_{\pm}(\mathbf{k}) = 0$ and $i\mathbf{k}\times\mathbf{h}\_{\pm}(\mathbf{k}) = \pm|\mathbf{k}|\mathbf{h}\_{\pm}(\mathbf{k})$. A convenient choice of basis for the $\mathbf{h}\_{\pm}(\mathbf{k})$ is then given by

$$\mathbf{h}\_{\pm}(\mathbf{k}) := \boldsymbol{\nu} \times \boldsymbol{\kappa} \pm i\boldsymbol{\nu}, \quad \text{with } \boldsymbol{\kappa} := \mathbf{k}/k, \ \boldsymbol{\nu} := \mathbf{k} \times \boldsymbol{\Gamma}/|\mathbf{k} \times \boldsymbol{\Gamma}|, \ \boldsymbol{\Gamma} := \text{const},\tag{3}$$

for which $|\mathbf{h}\_{\pm}(\mathbf{k})|^2 := \mathbf{h}\_{\pm}(\mathbf{k}) \cdot \mathbf{h}\_{\pm}(\mathbf{k})^{\*} = 2$ and $\mathbf{h}\_{\pm}(\mathbf{k}) \cdot \mathbf{h}\_{\mp}(\mathbf{k})^{\*} = 0$.
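The properties of the basis (3) can be checked numerically. The following is a minimal sketch, assuming the constant vector $\boldsymbol{\Gamma} = [1, 1, 1]$ as an illustrative choice; the function name `helical_basis` and the test wave vector are our own and are not taken from the accompanying code.

```python
import numpy as np

def helical_basis(k, gamma=(1.0, 1.0, 1.0)):
    """Return (h_plus, h_minus) for wave vector k, following Eq. (3).

    gamma is the constant vector Gamma; [1, 1, 1] is an assumption here.
    """
    k = np.asarray(k, dtype=float)
    kappa = k / np.linalg.norm(k)          # unit vector along k
    nu = np.cross(k, gamma)
    nu = nu / np.linalg.norm(nu)           # requires k not parallel to gamma
    h_plus = np.cross(nu, kappa) + 1j * nu
    h_minus = np.cross(nu, kappa) - 1j * nu
    return h_plus, h_minus

k = np.array([0.0, -1.0, 1.0])             # illustrative wave vector
h_plus, h_minus = helical_basis(k)
k_norm = np.linalg.norm(k)

# Divergence-free condition: k . h = 0
div_ok = bool(np.isclose(k @ h_plus, 0) and np.isclose(k @ h_minus, 0))
# Curl eigenfunction condition: i k x h_s = s |k| h_s
curl_ok = bool(np.allclose(1j * np.cross(k, h_plus), k_norm * h_plus)
               and np.allclose(1j * np.cross(k, h_minus), -k_norm * h_minus))
# Normalisation and orthogonality: h_s . h_s^* = 2 and h_+ . h_-^* = 0
norm_ok = bool(np.isclose(h_plus @ h_plus.conj(), 2)
               and np.isclose(h_plus @ h_minus.conj(), 0))
print(div_ok, curl_ok, norm_ok)
```

Note that the dot product used here is the bilinear one, with the complex conjugate written explicitly, which matches the notation $\mathbf{h} \cdot \mathbf{h}^{\*}$ in the text.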

At this point, one notices the key feature of the helical modes $\mathbf{h}\_{\pm}(\mathbf{k}) \exp(i\mathbf{k} \cdot \mathbf{x})$ that greatly simplifies the analysis of modal expansions of the 3D Euler and related equations, such as 3D Navier-Stokes: the helical modes are divergence-free eigenfunctions of the curl operator. Specifically,

$$\nabla \cdot \mathbf{h}\_s(\mathbf{k}) e^{i\mathbf{k} \cdot \mathbf{x}} = i\mathbf{k} \cdot \mathbf{h}\_s(\mathbf{k}) e^{i\mathbf{k} \cdot \mathbf{x}} = 0 \tag{4}$$

and

$$\nabla \times \mathbf{h}\_{s}(\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{x}} = i\mathbf{k} \times \mathbf{h}\_{s}(\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{x}} = s|\mathbf{k}|\mathbf{h}\_{s}(\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{x}}.\tag{5}$$

See [9, 10, 28, 29] for more information about how this Galerkin decomposition into divergence-free eigenfunctions of the curl is used as a standard tool in the analysis of 3D solution behaviour of the deterministic Euler and Navier-Stokes fluid equations. In particular, the helical mode expansions in Eq. (6) are the source of the popular *shell models*, which are finite-dimensional expansions of turbulent fluid dynamics. Thus, this expansion provides a useful framework for studying low-dimensional stochastic models of 3D Navier-Stokes turbulence.

In terms of the basis of helical modes $\mathbf{h}\_{\pm}(\mathbf{k}) \exp(i\mathbf{k} \cdot \mathbf{x})$, the divergence-free fluid velocity $\mathbf{v}(\mathbf{x},t)$ and vorticity $\boldsymbol{\omega}(\mathbf{x},t)$ are expressed in [28, 29] in terms of complex vector amplitudes $\mathbf{u}(\mathbf{k},t), \boldsymbol{\varpi}(\mathbf{k},t) \in \mathbb{C}^3$, respectively,

$$\begin{split} \mathbf{v}(\mathbf{x},t) := \sum\_{\mathbf{p}} \mathbf{u}(\mathbf{p},t)e^{i\mathbf{p}\cdot\mathbf{x}} := \sum\_{\mathbf{p}} \sum\_{s\_{p}=\pm} a\_{s\_{p}}(\mathbf{p},t) \mathbf{h}\_{s\_{p}}(\mathbf{p}) e^{i\mathbf{p}\cdot\mathbf{x}}, \\ \boldsymbol{\omega}(\mathbf{x},t) := \sum\_{\mathbf{q}} \boldsymbol{\varpi}(\mathbf{q},t) e^{i\mathbf{q}\cdot\mathbf{x}} := \sum\_{\mathbf{q}} \sum\_{s\_{q}=\pm} s\_{q}|\mathbf{q}| \, a\_{s\_{q}}(\mathbf{q},t) \mathbf{h}\_{s\_{q}}(\mathbf{q}) e^{i\mathbf{q}\cdot\mathbf{x}}. \end{split} \tag{6}$$

Here, the choice

$$\mathbf{u}(\mathbf{k},t) := a\_+(\mathbf{k},t)\mathbf{h}\_+(\mathbf{k}) + a\_-(\mathbf{k},t)\mathbf{h}\_-(\mathbf{k}) = \sum\_{s\_k=\pm} a\_{s\_k}(\mathbf{k},t)\mathbf{h}\_{s\_k}(\mathbf{k}),\tag{7}$$

with $a\_{s}^{\*}(\mathbf{k}) = a\_{s}(-\mathbf{k})$ [29], was made, so that (5) implies

$$\varpi(\mathbf{k},t) := |\mathbf{k}| \left( a\_+(\mathbf{k},t)\mathbf{h}\_+(\mathbf{k}) - a\_-(\mathbf{k},t)\mathbf{h}\_-(\mathbf{k}) \right) = |\mathbf{k}| \sum\_{s\_k=\pm} s\_k \, a\_{s\_k}(\mathbf{k},t)\mathbf{h}\_{s\_k}(\mathbf{k}).\tag{8}$$

The conservation laws for the Euler fluid kinetic energy and helicity—expressed as integrals over the spatially periodic box D—can be evaluated in Fourier space via Parseval's theorem, as follows,

$$\begin{aligned} \frac{1}{2}\int\_{\mathcal{D}} |\mathbf{v}(\mathbf{x},t)|^2 \, d^3x &= \frac{1}{2}\sum\_{\mathbf{k}} \mathbf{u}(\mathbf{k},t) \cdot \mathbf{u}^\*(\mathbf{k},t) = \sum\_{\mathbf{k}} \sum\_{s\_k=\pm} a\_{s\_k}(\mathbf{k},t)\, a\_{s\_k}^\*(\mathbf{k},t), \\ \int\_{\mathcal{D}} \mathbf{v}(\mathbf{x},t) \cdot \mathrm{curl}\,\mathbf{v}(\mathbf{x},t) \, d^3x &= \sum\_{\mathbf{k}} \mathbf{u}(\mathbf{k},t) \cdot \boldsymbol{\varpi}^\*(\mathbf{k},t) = \sum\_{\mathbf{k}} \sum\_{s\_k=\pm} k s\_k \, a\_{s\_k}(\mathbf{k},t)\, a\_{s\_k}^\*(\mathbf{k},t)\, \mathbf{h}\_{s\_k}(\mathbf{k}) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}) \\ &= 2 \sum\_{\mathbf{k}} \sum\_{s\_k=\pm} k s\_k \, a\_{s\_k}(\mathbf{k},t)\, a\_{s\_k}^\*(\mathbf{k},t). \end{aligned} \tag{9}$$
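The per-mode content of the Parseval identities can be verified numerically for a single wave vector. The sketch below assumes $\boldsymbol{\Gamma} = [1, 1, 1]$ and uses arbitrary illustrative amplitudes $a\_{\pm}$; the helper `helical_basis` is a hypothetical name of our own.

```python
import numpy as np

def helical_basis(k, gamma=(1.0, 1.0, 1.0)):
    """Helical vectors h_+ and h_- of Eq. (3); gamma = [1, 1, 1] is assumed."""
    k = np.asarray(k, dtype=float)
    kappa = k / np.linalg.norm(k)
    nu = np.cross(k, gamma)
    nu = nu / np.linalg.norm(nu)
    return np.cross(nu, kappa) + 1j * nu, np.cross(nu, kappa) - 1j * nu

k = np.array([-1.0, 1.0, -1.0])
k_norm = np.linalg.norm(k)
h_p, h_m = helical_basis(k)
a_p, a_m = 0.3 - 0.4j, -0.7 + 0.2j          # illustrative amplitudes

u = a_p * h_p + a_m * h_m                   # velocity amplitude, Eq. (7)
w = k_norm * (a_p * h_p - a_m * h_m)        # vorticity amplitude, Eq. (8)

# Energy contribution: (1/2) u . u^* = |a_+|^2 + |a_-|^2
energy_lhs = 0.5 * (u @ u.conj())
energy_rhs = abs(a_p) ** 2 + abs(a_m) ** 2
# Helicity contribution: u . w^* = 2 |k| (|a_+|^2 - |a_-|^2)
helicity_lhs = u @ w.conj()
helicity_rhs = 2 * k_norm * (abs(a_p) ** 2 - abs(a_m) ** 2)
print(np.isclose(energy_lhs, energy_rhs), np.isclose(helicity_lhs, helicity_rhs))
```

Both identities rely only on $\mathbf{h}\_{s} \cdot \mathbf{h}\_{s}^{\*} = 2$ and $\mathbf{h}\_{\pm} \cdot \mathbf{h}\_{\mp}^{\*} = 0$.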

Expanding the terms of the Euler equation in curl form (57), we obtain the Euler equations for the coefficients $a\_{s\_k}(\mathbf{k},t)$. For all $\mathbf{k} \in \mathcal{K}$, $s\_k \in \{+, -\}$ we have

$$\partial\_t a\_{s\_k}(\mathbf{k}, t) = -\frac{1}{4} \sum\_{\mathbf{p} + \mathbf{q} + \mathbf{k} = 0} \sum\_{s\_p, s\_q} (s\_p |\mathbf{p}| - s\_q |\mathbf{q}|) a\_{s\_p}^\*(\mathbf{p}, t) a\_{s\_q}^\*(\mathbf{q}, t) \mathbf{h}\_{s\_p}^\*(\mathbf{p}) \times \mathbf{h}\_{s\_q}^\*(\mathbf{q}) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}). \tag{10}$$

For the explicit derivation of Eq. (10) see section "Deterministic Euler" in Appendix 2.

The elementary interactions in Fourier space take place between triads of wave vectors such that $\mathbf{k} + \mathbf{p} + \mathbf{q} = 0$, as exemplified in Eq. (10) above. There are two degrees of freedom per wave vector, $(a\_{+}, a\_{-})$, so eight different types of interaction are allowed according to the value of the triplet $(s\_k, s\_p, s\_q) = (\pm 1, \pm 1, \pm 1)$. Consider a fixed triple of wave vectors $\mathbf{k}, \mathbf{p}, \mathbf{q} \in \mathcal{K}$ such that $\mathbf{k} + \mathbf{p} + \mathbf{q} = 0$ and a fixed triple $s\_k, s\_p, s\_q \in \{+, -\}$. This gives rise to three coefficients $a\_{s\_k}(\mathbf{k}, t), a\_{s\_p}(\mathbf{p}, t), a\_{s\_q}(\mathbf{q},t)$, which we compactly summarise into the complex vector


$$\mathbf{a} = (a\_{s\_k}, a\_{s\_p}, a\_{s\_q}) \in \mathbb{C}^3\text{ .}$$

The dynamics of **a** is determined by the three equations obtained from (10)

$$\frac{\mathrm{d}a\_{s\_k}}{\mathrm{d}t} = g(s\_p|\mathbf{p}|-s\_q|\mathbf{q}|)a\_{s\_p}^\* a\_{s\_q}^\* + R,\tag{11}$$

$$\frac{\mathrm{d}a\_{s\_p}}{\mathrm{d}t} = g(s\_q|\mathbf{q}|-s\_k|\mathbf{k}|)a\_{s\_q}^\* a\_{s\_k}^\* + R,\tag{12}$$

$$\frac{\mathrm{d}a\_{s\_q}}{\mathrm{d}t} = g(s\_k|\mathbf{k}| - s\_p|\mathbf{p}|)a\_{s\_k}^\* a\_{s\_p}^\* + R,\tag{13}$$

where $R$ stands for the remaining terms of the sum in (10). We pick out a single summand, dropping $R$, and cycle through $k, p, q$ to obtain

$$\frac{\mathrm{d}a\_{s\_k}}{\mathrm{d}t} = g(s\_p|\mathbf{p}|-s\_q|\mathbf{q}|)a\_{s\_p}^\* a\_{s\_q}^\*,\tag{14}$$

$$\frac{\mathrm{d}a\_{s\_p}}{\mathrm{d}t} = g(s\_q|\mathbf{q}|-s\_k|\mathbf{k}|)a\_{s\_q}^\* a\_{s\_k}^\*,\tag{15}$$

$$\frac{\mathrm{d}a\_{s\_q}}{\mathrm{d}t} = g(s\_k|\mathbf{k}| - s\_p|\mathbf{p}|)a\_{s\_k}^\* a\_{s\_p}^\*,\tag{16}$$

with the constant complex scalar

$$g := -\frac{1}{4} \mathbf{h}\_{s\_p}^\*(\mathbf{p}) \times \mathbf{h}\_{s\_q}^\*(\mathbf{q}) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}).\tag{17}$$
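The coupling constant in (17) is straightforward to evaluate once the helical vectors (3) are available. The sketch below is illustrative only: it assumes $\boldsymbol{\Gamma} = [1, 1, 1]$, uses the triad $\mathbf{k} = [1,0,0]$, $\mathbf{p} = [0,-1,1]$, $\mathbf{q} = [-1,1,-1]$ with parities $(+,-,-)$ as an example, and the function names are hypothetical. It also checks that rescaling all three wave vectors by a common positive factor leaves the constant unchanged.

```python
import numpy as np

def helical_mode(k, s, gamma=(1.0, 1.0, 1.0)):
    """Helical vector h_s(k) of Eq. (3) for parity s = +1 or -1 (gamma assumed)."""
    k = np.asarray(k, dtype=float)
    kappa = k / np.linalg.norm(k)
    nu = np.cross(k, gamma)
    nu = nu / np.linalg.norm(nu)
    return np.cross(nu, kappa) + s * 1j * nu

def coupling_constant(k, p, q, s_k, s_p, s_q):
    """Evaluate g = -(1/4) h*_{s_p}(p) x h*_{s_q}(q) . h*_{s_k}(k), Eq. (17)."""
    h_k = helical_mode(k, s_k).conj()
    h_p = helical_mode(p, s_p).conj()
    h_q = helical_mode(q, s_q).conj()
    return -0.25 * (np.cross(h_p, h_q) @ h_k)

k = np.array([1.0, 0.0, 0.0])
p = np.array([0.0, -1.0, 1.0])
q = np.array([-1.0, 1.0, -1.0])
assert np.allclose(k + p + q, 0.0)          # triad condition k + p + q = 0

g = coupling_constant(k, p, q, +1, -1, -1)
g_scaled = coupling_constant(3.5 * k, 3.5 * p, 3.5 * q, +1, -1, -1)
print(g, np.isclose(g, g_scaled))
```

The scale invariance follows because only the normalised vectors $\boldsymbol{\kappa}$ and $\boldsymbol{\nu}$ enter (3).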

The equations corresponding to the single triad interaction of type *(sk, sp, sq )* with **k** + **p** + **q** = 0 thus have the complex vector form also derived in [28],

$$\frac{d\mathbf{a}}{dt} = g\,\mathbf{a}^\* \times \mathbb{D}\mathbf{a}^\* = g\,(\mathbf{a} \times \mathbb{D}\mathbf{a})^\*,\tag{18}$$

with the constant diagonal matrix

$$\mathbb{D} := \text{diag}\left(s\_k |\mathbf{k}|, s\_p |\mathbf{p}|, s\_q |\mathbf{q}|\right). \tag{19}$$

The factor $g$ defined in (17) above can be calculated from (3). It depends on the shape and the orientation of the wave-vector triad, but not on its scale, since the real and imaginary parts of the complex helical vector amplitudes $\mathbf{h}\_{s\_k}$ are unit vectors. Moreover, $\mathbb{D}\mathbf{a}$ can be seen to represent the $(s\_k, s\_p, s\_q)$ components of the vorticity vector amplitude through Eq. (8) above. Two conservation laws for real-valued triad energy and helicity follow immediately from Eq. (18), as

$$\frac{d}{dt}(\mathbf{a}\cdot\mathbf{a}^\*) = 0 \quad \text{and} \quad \frac{d}{dt}(\mathbf{a}\cdot\mathbb{D}\mathbf{a}^\*) = 0\,. \tag{20}$$

The dynamical system in Eq. (18) is similar to rigid body dynamics, but with complex angular momentum $\mathbf{a}$, complex angular velocity $\mathbb{D}\mathbf{a}$ and real moment of inertia $\mathbb{I} = \mathbb{D}^{-1}$ relating the two complex quantities.<sup>1</sup>
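The two conservation laws in (20) can be checked directly on the right-hand side of (18). The following sketch uses a placeholder value for $g$ and an arbitrary state $\mathbf{a}$, both of which are assumptions for illustration; the diagonal of $\mathbb{D}$ corresponds to a triad with $|\mathbf{k}| = 1$, $|\mathbf{p}| = \sqrt{2}$, $|\mathbf{q}| = \sqrt{3}$ and parities $(+,-,-)$.

```python
import numpy as np

def triad_rhs(a, D, g):
    """Right-hand side of Eq. (18): da/dt = g (a x D a)^*.

    D holds the diagonal entries (s_k|k|, s_p|p|, s_q|q|) as a 1D array.
    """
    return g * np.conj(np.cross(a, D * a))

D = np.array([1.0, -np.sqrt(2.0), -np.sqrt(3.0)])
g = 0.2 - 0.1j                               # placeholder, not computed from (17)
a = np.array([0.5 + 0.1j, -0.3 + 0.7j, 0.2 - 0.4j])

a_dot = triad_rhs(a, D, g)

# d/dt (a . a^*) = 2 Re(a_dot . a^*) and d/dt (a . D a^*) = 2 Re(a_dot . D a^*)
energy_rate = 2.0 * np.real(a_dot @ a.conj())
helicity_rate = 2.0 * np.real(a_dot @ (D * a.conj()))
print(energy_rate, helicity_rate)
```

Both rates vanish up to rounding because the cross product $\mathbf{a}^\* \times \mathbb{D}\mathbf{a}^\*$ is orthogonal to each of its two factors.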

## *2.2 Stochastic Parametrizations for the 3D Euler Equation*

In the following we introduce a reduced order model for the stochastic parametrizations introduced through the Stochastic Advection by Lie Transport paradigm as well as the Location Uncertainty paradigm. We explain below the rationale of these parametrizations:

#### **2.2.1 Modelling Under the Stochastic Advection by Lie Transport Principle**

The SALT equations were derived in [16] using a Stratonovich stochastic version of Hamilton's variational principle [18] in combination with Kraichnan's scalar turbulence model based on Stratonovich stochastic Lagrangian paths [20]. The application of Hamilton's principle with an imposed stochastic Lie transport constraint implied an Euler-Poincaré equation for the fluid motion [18]. The 3D SALT Euler equations for divergence-free fluid velocity **v***(***x***,t)* are given by

$$\mathrm{d}\mathbf{v} + (\mathrm{d}\mathbf{x}\_t \cdot \nabla)\mathbf{v} + v\_j \nabla \mathrm{d}x\_t^j = -\nabla \mathrm{d}p \,,$$

$$\text{with} \quad \mathrm{d}\mathbf{x}\_t = \mathbf{v}\,dt + \sum\_i \boldsymbol{\xi}\_i(\mathbf{x}) \circ \mathrm{d}W\_t^i \,. \tag{21}$$

As discussed in [16], this motion equation yields a Kelvin-Noether circulation theorem for the stochastic system

$$\mathrm{d}\oint\_{c(\mathbf{x}\_t)} \mathbf{v} \cdot d\mathbf{x} = -\oint\_{c(\mathbf{x}\_t)} \nabla \mathrm{d}p \cdot d\mathbf{x} = 0. \tag{22}$$

This stochastic Kelvin circulation theorem has the same form as that for the deterministic system, except that each line element of the material loop in Kelvin's theorem follows the Stratonovich stochastic Lagrangian path, $\mathbf{x}\_t$.

The real vectors $\boldsymbol{\xi}\_i$ comprise the time-independent noise amplitudes which are to be determined from data assimilation. The $W^i$ are independent (one-dimensional) standard Brownian motions and $\circ$ denotes stochastic integration in the Stratonovich sense.<sup>2</sup> The curl form of the SALT Euler motion equation in (21) is obtained from (56) and given by

$$\mathrm{d}\mathbf{v} - \mathrm{d}\mathbf{x}\_t \times \mathrm{curl}\,\mathbf{v} + \nabla(\mathbf{v} \cdot \mathrm{d}\mathbf{x}\_t) = -\nabla \mathrm{d}p. \tag{23}$$

<sup>1</sup> Rigid body dynamics with complex angular momentum has also been discussed previously in [3].

<sup>2</sup> An exposition of Brownian motion, stochastic calculus and the Stratonovich integral is to be found, for example, in the monograph [19].

The motion equation (23) and its curl, which yields the SALT vorticity equation, imply a formula for the evolution of the helicity of the flow, $\Lambda$, defined as

$$
\Lambda := \int\_{\mathcal{D}} \mathbf{v} \cdot \mathbf{curl} \mathbf{v} \, d^3 \mathbf{x} \,. \tag{24}
$$

Upon applying the divergence theorem, one finds

$$\mathrm{d}\Lambda = -\int\_{\partial\mathcal{D}} \widehat{\mathbf{n}} \cdot \left( (\mathbf{v} \cdot \mathrm{curl}\,\mathbf{v}) \, \mathrm{d}\mathbf{x}\_t + \mathrm{curl}\,\mathbf{v} \, \mathrm{d}p \right) dS \,. \tag{25}$$

Thus, in a periodic 3D domain, or in an infinite 3D domain with asymptotically vanishing boundary conditions, the SALT motion equation in (21) or (23) preserves the helicity, $\Lambda$, defined in (24). However, a glance at the SALT motion equation in (23) shows that it does not preserve the kinetic energy: even with the usual fluid boundary conditions, $\mathrm{div}\,\mathbf{v} = 0$ implies

$$\frac{1}{2}\operatorname{d}\|\mathbf{v}\|\_{L^2}^2 = \int\_{\mathcal{D}} \mathbf{v} \cdot \operatorname{d}\mathbf{x}\_t \times \operatorname{curl} \mathbf{v} \, d^3 \mathbf{x} \neq 0 \,. \tag{26}$$

#### **2.2.2 Modeling Under the Location Uncertainty Principle**

The Location Uncertainty principle consists in decomposing the flow trajectory $\mathbf{x}\colon \Omega \times \mathbb{R}^{+} \to \Omega$ over a bounded domain, $\Omega \subset \mathbb{R}^3$,

$$\mathrm{d}\mathbf{x}\_{t} = \mathbf{v}\left(\mathbf{x}\_{t}, t\right)\,\mathrm{d}t + \sigma\left(\mathbf{x}\_{t}, t\right)\,\mathrm{d}\mathbf{W}\_{t} \tag{27}$$

in terms of $\mathbf{v}(\mathbf{x}\_t, t)$, a smooth-in-time component of the (Lagrangian) velocity, and a noise $\sigma(\mathbf{x}\_t, t)\,\mathrm{d}\mathbf{W}\_t$, which is here to be understood in the Itô sense and accounts for the unresolved processes. The Wiener process $\mathbf{W}\_t$ is an $H$-valued (cylindrical) Brownian motion, where $H$ is the Hilbert space of square integrable functions. The noise is then properly defined as the application of a Hilbert-Schmidt symmetric integral kernel $\sigma\_t f(\mathbf{x}) = \int\_{\mathcal{S}} \breve{\sigma}(\mathbf{x}, \mathbf{y}, t) f(\mathbf{y}) \,\mathrm{d}\mathbf{y}$ to the $H$-valued cylindrical Wiener process $\mathbf{W}$ as

$$(\sigma\_t \mathrm{d}\mathbf{W}\_t)^i \left(\mathbf{x}\right) = \int\_{\mathcal{S}} \breve{\sigma}\_{ik} \left(\mathbf{x}, \mathbf{y}, t\right) \mathrm{d}W\_t^k \left(\mathbf{y}\right) \,\mathrm{d}\mathbf{y}.\tag{28}$$

The role of the integrable kernel $\breve{\sigma}$ is to impose a spatial correlation on the small-scale component. It leads to the covariance tensor $Q$

$$\begin{aligned} Q\_{ij}(\mathbf{x}, \mathbf{y}, t, s) &= \mathbb{E}\left[ (\sigma\_t \mathrm{d} \mathbf{W}\_t \, (\mathbf{x}))^i \, (\sigma\_s \mathrm{d} \mathbf{W}\_s \, (\mathbf{y}))^j \right] \\ &= \delta \,(t - s) \, \mathrm{d}t \int\_{\mathcal{S}} \breve{\sigma}\_{ik} \, (\mathbf{x}, \mathbf{z}, t) \, \breve{\sigma}\_{kj} \, (\mathbf{z}, \mathbf{y}, s) \, \mathrm{d}\mathbf{z}, \end{aligned}$$

of the centered Gaussian process $\sigma\_t \mathrm{d}\mathbf{W}\_t \sim \mathcal{N}(0, \mathbf{Q}\,\mathrm{d}t)$. The diagonal components of the covariance tensor per unit of time, referred to as the variance tensor $\mathbf{a}$, form a positive definite matrix defined by $\mathbf{a}(\mathbf{x}, t)\,\delta(t - t')\,\mathrm{d}t = \mathbf{Q}(\mathbf{x}, \mathbf{x}, t, t')$, which quantifies the strength of the noise and has the dimension of a viscosity, $\mathrm{m}^2\,\mathrm{s}^{-1}$. Since $Q$ is a compact self-adjoint positive definite operator on $H$, it admits eigenfunctions $\boldsymbol{\xi}\_n(\cdot, t)$ with (strictly) positive eigenvalues $\lambda\_n(t)$ satisfying $\sum\_{n \in \mathbb{N}} \lambda\_n(t) < +\infty$. As a consequence, the noise and the variance tensor $\mathbf{a}$ can be expressed through the spectral representation

$$\sigma\_t \mathrm{d}\mathbf{W}\_t(\mathbf{x}) = \sum\_{n \in \mathbb{N}} \lambda\_n^{1/2}(t) \,\boldsymbol{\xi}\_n(\mathbf{x}, t) \,\mathrm{d}\beta\_n, \tag{29}$$

$$\mathbf{a}\left(\mathbf{x},t\right) = \sum\_{n \in \mathbb{N}} \lambda\_n\left(t\right) \boldsymbol{\xi}\_n\left(\mathbf{x},t\right) \boldsymbol{\xi}\_n^\dagger\left(\mathbf{x},t\right). \tag{30}$$
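The relation between the spectral representation of the noise and the variance tensor can be illustrated at a single spatial point with a truncated expansion. The sketch below is a toy example under stated assumptions: the number of modes, the eigenvalues `lam` and the values `xi` of the eigenfunctions at the chosen point are made up, and the $\mathrm{d}\beta\_n$ are taken as independent Gaussian increments with variance $\mathrm{d}t$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_modes, dt, n_samples = 4, 0.01, 100_000

lam = 1.0 / (1.0 + np.arange(n_modes)) ** 2   # hypothetical eigenvalues lambda_n
xi = rng.standard_normal((n_modes, 3))        # hypothetical values xi_n(x) at one point

# Variance tensor from the truncated version of Eq. (30)
a = np.einsum("n,ni,nj->ij", lam, xi, xi)

# Noise increments from the truncated version of Eq. (29)
dbeta = np.sqrt(dt) * rng.standard_normal((n_samples, n_modes))
noise = (dbeta * np.sqrt(lam)) @ xi           # shape (n_samples, 3)

# Empirical covariance per unit time should approximate a
a_empirical = noise.T @ noise / (n_samples * dt)
rel_err = np.linalg.norm(a_empirical - a) / np.linalg.norm(a)
print(rel_err)
```

The relative error decays like the inverse square root of the number of samples, so only approximate agreement should be expected.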

The rate of change of a scalar $q$ within a volume $V\_t$ transported by the flow is given by the stochastic Reynolds transport theorem, introduced in [23]

$$\mathrm{d} \int\_{V\_t} q \left( \mathbf{x}, t \right) \, \mathrm{d} \mathbf{x} = \int\_{V\_t} \left\{ \mathbf{D}\_t q + q \nabla \cdot \left[ \mathbf{v}^\star \, \mathrm{d}t + \sigma\_t \, \mathrm{d} \mathbf{W}\_t \right] \right\} (\mathbf{x}, t) \, \mathrm{d} \mathbf{x}, \tag{31}$$

with the transport operator

$$\mathbf{D}\_{t}q = \mathrm{d}\_{t}q + \left[\mathbf{v}^{\star}\,\mathrm{d}t + \sigma\_{t}\,\mathrm{d}\mathbf{W}\_{t}\right] \cdot \nabla q - \frac{1}{2}\nabla \cdot (\mathbf{a}\nabla q)\,\mathrm{d}t.\tag{32}$$

In this formula, the first term on the right-hand side is the increment in time at a fixed location of the process $q$, that is $\mathrm{d}\_t q = q(\mathbf{x}\_t, t + \mathrm{d}t) - q(\mathbf{x}\_t, t)$, playing the role of a derivative in time for a non-differentiable process. The effective velocity $\mathbf{v}^{\star}$ is defined as

$$\mathbf{v}^{\star} = \mathbf{v} - \frac{1}{2}\nabla \cdot \mathbf{a} + \sigma\_t^{\ast} \left(\nabla \cdot \sigma\_t\right),\tag{33}$$

where the velocity component $\mathbf{v}^s = \frac{1}{2}\nabla \cdot \mathbf{a}$ results from the noise inhomogeneities. For incompressible homogeneous noise, as considered in this work, $\mathbf{v}^{\star} = \mathbf{v}$. Besides, the diffusion term exactly balances the energy brought by the noise. With the Stratonovich convention and a homogeneous noise, the transport operator takes a simplified form similar to the material derivative:

$$\mathbf{D}\_t q = \mathrm{d}\_t q + (\mathbf{v} \, \mathrm{d}t + \sigma\_t \, \circ \, \mathrm{d}\mathbf{W}\_t) \cdot \nabla q. \tag{34}$$

For a divergence-free homogeneous noise, the Euler equation in the LU framework can then be defined as:

$$\mathrm{d}\_t \mathbf{v} + (\mathbf{v} \,\mathrm{d}t + \sigma\_t \,\circ \,\mathrm{d}\mathbf{W}\_t) \cdot \nabla \mathbf{v} = -\nabla \mathrm{d}p\_t, \quad \nabla \cdot \mathbf{v} = 0,\tag{35}$$

where $\mathrm{d}p\_t$ denotes the pressure, composed of a finite variation term and a martingale pressure term. With the Leray projection $\mathbb{P}$, this pressure term can be removed and we obtain the inertial form of the Euler equation:

$$\mathrm{d}\_t \mathbf{v} + \mathbb{P}\left( (\mathrm{d} \mathbf{x}\_t \cdot \nabla) \mathbf{v} \right) = 0, \qquad \nabla \cdot \mathbf{v} = 0. \tag{36}$$

## *2.3 Triad Model Comparison*

The reduced order model for a single triad interaction of the full-scale 3D SALT Euler and 3D LU Euler equations is obtained by projecting the continuous stochastic Euler models onto the helical modes, in the same fashion as for the deterministic equation (18). To this end, we introduce an additional Stratonovich stochastic term into the transport velocity in (7) as

$$\mathrm{d}\mathbf{x}\_{t}(\mathbf{k},t) := \left(a\_{+}(\mathbf{k},t)\mathbf{h}\_{+} + a\_{-}(\mathbf{k},t)\mathbf{h}\_{-}\right)\mathrm{d}t + \sum\_{i} \left(b\_{+}^{i}(\mathbf{k})\mathbf{h}\_{+} + b\_{-}^{i}(\mathbf{k})\mathbf{h}\_{-}\right) \circ \mathrm{d}W\_{t}^{i},\tag{37}$$

where the $\mathbf{k}$-dependent complex vector $\mathbf{b}(\mathbf{k}) := (b\_{s\_k}, b\_{s\_p}, b\_{s\_q})^T \in \mathbb{C}^3$ represents the time-independent noise amplitude which is to be determined from data assimilation, similar to the continuous stochastic models (21). Enumerating the equation for a single triad then yields, after rearranging using exchange symmetry in $(\mathbf{k}, \mathbf{p}, \mathbf{q})$, the matrix equation

$$\mathrm{d} \begin{bmatrix} a\_{s\_k} \\ a\_{s\_p} \\ a\_{s\_q} \end{bmatrix} = g \begin{bmatrix} 0 & -qs\_qa\_{s\_q} & ps\_pa\_{s\_p} \\ qs\_qa\_{s\_q} & 0 & -ks\_ka\_{s\_k} \\ -ps\_pa\_{s\_p} & ks\_ka\_{s\_k} & 0 \end{bmatrix}^\* \begin{bmatrix} a\_{s\_k}\,\mathrm{d}t + b\_{s\_k}\circ \mathrm{d}W\_t \\ a\_{s\_p}\,\mathrm{d}t + b\_{s\_p}\circ \mathrm{d}W\_t \\ a\_{s\_q}\,\mathrm{d}t + b\_{s\_q}\circ \mathrm{d}W\_t \end{bmatrix}^\*. \tag{38}$$

Upon applying the previous steps for the deterministic case to the stochastic velocity in (37), the single triad interaction dynamics for the SALT case would emerge as, cf. Eq. (18),

$$\mathrm{d}\mathbf{a} = g\left(\mathbf{a}(\mathbf{k}, t)\,\mathrm{d}t + \mathbf{b}(\mathbf{k}) \circ \mathrm{d}W\_t\right)^\* \times \mathbb{D}\mathbf{a}^\*\,. \tag{39}$$

The details of this computation can be found in section "LU Euler" in Appendix 2 for the 3D LU Euler model and in section "SALT Euler" in Appendix 2 for the 3D SALT Euler model.

Remarkably, the stochastic triad equation (39) still preserves the triad helicity $\mathbf{a} \cdot \mathbb{D}\mathbf{a}^\*$. This is readily checked: the right-hand side of (39) is a cross product with $\mathbb{D}\mathbf{a}^\*$, so its dot product with $\mathbb{D}\mathbf{a}^\*$ vanishes. Hence we name this model the *helicity preserving stochastic triad* (HST) model. Note that in both equations we use a single source of noise (one Brownian motion drives the entire system).

Let us have a look at the diffusion coefficients.

$$\mathbf{a}^\* \times \mathbb{D}\mathbf{b} = \begin{bmatrix} a\_p^\* s\_q q b\_q - a\_q^\* s\_p p b\_p \\ a\_q^\* s\_k k b\_k - a\_k^\* s\_q q b\_q \\ a\_k^\* s\_p p b\_p - a\_p^\* s\_k k b\_k \end{bmatrix}, \quad \mathbf{b} \times \mathbb{D}\mathbf{a}^\* = \begin{bmatrix} b\_p s\_q q a\_q^\* - b\_q s\_p p a\_p^\* \\ b\_q s\_k k a\_k^\* - b\_k s\_q q a\_q^\* \\ b\_k s\_p p a\_p^\* - b\_p s\_k k a\_k^\* \end{bmatrix}. \tag{40}$$

Taking the difference

$$\mathbf{a}^\* \times \mathbb{D}\mathbf{b} - \mathbf{b} \times \mathbb{D}\mathbf{a}^\* = \begin{bmatrix} a\_p^\*(s\_q q + s\_p p) b\_q - a\_q^\*(s\_p p + s\_q q) b\_p \\ a\_q^\*(s\_k k + s\_q q) b\_k - a\_k^\*(s\_q q + s\_k k) b\_q \\ a\_k^\*(s\_p p + s\_k k) b\_p - a\_p^\*(s\_k k + s\_p p) b\_k \end{bmatrix}. \tag{41}$$

Writing $\rho := \operatorname{Tr} \mathbb{D}$ we get

$$\mathbf{a}^\* \times \mathbb{D}\mathbf{b} - \mathbf{b} \times \mathbb{D}\mathbf{a}^\* = \begin{bmatrix} a\_p^\*(\rho - s\_k k) b\_q - a\_q^\*(\rho - s\_k k) b\_p \\ a\_q^\*(\rho - s\_p p) b\_k - a\_k^\*(\rho - s\_p p) b\_q \\ a\_k^\*(\rho - s\_q q) b\_p - a\_p^\*(\rho - s\_q q) b\_k \end{bmatrix}. \tag{42}$$

So that the difference term becomes

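$$\mathbf{a}^\* \times \mathbb{D}\mathbf{b} - \mathbf{b} \times \mathbb{D}\mathbf{a}^\* = (\rho \operatorname{Id} - \mathbb{D})(\mathbf{a}^\* \times \mathbf{b}). \tag{43}$$

For real $\mathbf{b}$ the right-hand side equals $(\rho \operatorname{Id} - \mathbb{D})(\mathbf{a} \times \mathbf{b})^\*$.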
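The component-wise computation in (40) to (42) can be confirmed numerically. The sketch below uses random complex vectors as stand-ins for $\mathbf{a}$ and $\mathbf{b}$ and an illustrative diagonal for $\mathbb{D}$; all values are assumptions. It checks the difference against $(\rho \operatorname{Id} - \mathbb{D})(\mathbf{a}^\* \times \mathbf{b})$, and separately checks that for real $\mathbf{b}$ this coincides with $(\rho \operatorname{Id} - \mathbb{D})(\mathbf{a} \times \mathbf{b})^\*$.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(3) + 1j * rng.standard_normal(3)
b = rng.standard_normal(3) + 1j * rng.standard_normal(3)
D = np.array([1.0, -np.sqrt(2.0), -np.sqrt(3.0)])   # illustrative diagonal of D
rho = D.sum()                                       # rho = Tr D

# General complex b: difference of the two diffusion coefficients
lhs = np.cross(a.conj(), D * b) - np.cross(b, D * a.conj())
rhs = (rho - D) * np.cross(a.conj(), b)             # (rho Id - D) acts entrywise

# Real b: the same expression written with an overall conjugate
b_real = rng.standard_normal(3)
lhs_real = np.cross(a.conj(), D * b_real) - np.cross(b_real, D * a.conj())
rhs_real = (rho - D) * np.conj(np.cross(a, b_real))

print(np.allclose(lhs, rhs), np.allclose(lhs_real, rhs_real))
```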

Since the projections of the LU and SALT models onto a single triad are indistinguishable, we introduce a different model that conserves energy on a single triad to enable a comparison between energy conserving and helicity conserving models.

**Energy-Preserving Stochastic Triad (EST) Model** We present below a modified version of the HST triad equation (39) that introduces stochasticity into the vorticity instead of into the transport velocity and thereby conserves the energy. This is inspired by the full-scale model introduced in [17]. The reduced model is as follows

$$\mathrm{d}\mathbf{a} = -\,g\,\mathbf{a}^\* \times \mathbb{D} \Big(\mathbf{a}(\mathbf{k}, t) \,\mathrm{d}t + \mathbf{b}(\mathbf{k}) \circ \mathrm{d}W\_t\Big)^\*.\tag{44}$$

We call this model the energy preserving stochastic triad (EST). Written in matrix form, Eq. (44) becomes

$$\mathrm{d}\begin{bmatrix}a\_{s\_k} \\ a\_{s\_p} \\ a\_{s\_q} \end{bmatrix} = g \begin{bmatrix} 0 & -a\_{s\_q} & a\_{s\_p} \\ a\_{s\_q} & 0 & -a\_{s\_k} \\ -a\_{s\_p} & a\_{s\_k} & 0 \end{bmatrix}^\* \begin{bmatrix} ks\_k(a\_{s\_k}\,\mathrm{d}t + b\_{s\_k}(\mathbf{k}) \circ \mathrm{d}W\_t) \\ ps\_p(a\_{s\_p}\,\mathrm{d}t + b\_{s\_p}(\mathbf{p}) \circ \mathrm{d}W\_t) \\ qs\_q(a\_{s\_q}\,\mathrm{d}t + b\_{s\_q}(\mathbf{q}) \circ \mathrm{d}W\_t) \end{bmatrix}^\*. \tag{45}$$

The exchange symmetry between the two models HST and EST in the placement of the noise in Eqs. (39) and (44) is apparent already in the exchange symmetry between velocity and vorticity in Euler's fluid equations (1).

**Deviation from the Conservation Laws** We can write the equations for the deviation from the conservation laws, which is present in both models. The HST model deviates from energy conservation by

$$\mathrm{d}\_{t} E\_{\rm HST} = g\,\mathbf{b} \cdot (\mathbb{D} \mathbf{a}^\* \times \mathbf{a}^\*) \circ \mathrm{d} W\_{t} \tag{46}$$

whereas the EST model deviates from helicity conservation by

$$\mathrm{d}\_{t}H\_{\rm EST} = g\,\mathbb{D}\mathbf{b} \cdot (\mathbb{D}\mathbf{a}^\* \times \mathbf{a}^\*) \circ \mathrm{d}W\_{t}.\tag{47}$$

This is seen as follows: to get the energy we dot the HST equation with $\mathbf{a}^\*$, and to get the helicity we dot the EST equation with $\mathbb{D}\mathbf{a}^\*$, using the standard identities

$$\mathbf{a}^\* \cdot (\mathbf{b} \times \mathbb{D}\mathbf{a}^\*) = \mathbf{b} \cdot (\mathbb{D}\mathbf{a}^\* \times \mathbf{a}^\*) \tag{48}$$

$$\mathbb{D}\mathbf{a}^\* \cdot (\mathbf{a}^\* \times \mathbb{D}\mathbf{b}) = \mathbb{D}\mathbf{b} \cdot (\mathbb{D}\mathbf{a}^\* \times \mathbf{a}^\*).\tag{49}$$

Therefore, $\mathbf{b}$ respects the right scaling and no further scale adjustments between the HST and EST noise amplitudes need to be performed in order to compare the models.
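The conservation properties of the two noise terms, together with the identities (48) and (49), can be checked numerically. In the sketch below the HST noise direction is read off from (39) as $g\,\mathbf{b}^\* \times \mathbb{D}\mathbf{a}^\*$ and the EST noise direction from (44) as $-g\,\mathbf{a}^\* \times \mathbb{D}\mathbf{b}^\*$. The values of $g$, $\mathbf{a}$ and $\mathbf{b}$ are random placeholders, not calibrated quantities.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal(3) + 1j * rng.standard_normal(3)
b = rng.standard_normal(3) + 1j * rng.standard_normal(3)
D = np.array([1.0, -np.sqrt(2.0), -np.sqrt(3.0)])   # illustrative diagonal of D
g = 0.2 - 0.1j                                      # placeholder coupling constant
ac, Dac = a.conj(), D * a.conj()

hst_noise = g * np.cross(b.conj(), Dac)             # noise term of Eq. (39)
est_noise = -g * np.cross(ac, D * b.conj())         # noise term of Eq. (44)

# Conserved directions: HST noise is orthogonal to D a^*, EST noise to a^*
hst_helicity_rate = hst_noise @ Dac
est_energy_rate = est_noise @ ac
# Non-conserved directions (generically nonzero)
hst_energy_rate = hst_noise @ ac
est_helicity_rate = est_noise @ Dac

# Triple-product identities (48) and (49)
id48 = np.isclose(ac @ np.cross(b, Dac), b @ np.cross(Dac, ac))
id49 = np.isclose(Dac @ np.cross(ac, D * b), (D * b) @ np.cross(Dac, ac))
print(abs(hst_helicity_rate), abs(est_energy_rate), id48, id49)
```

The quantities `hst_energy_rate` and `est_helicity_rate` are computed only to show that they do not vanish for generic inputs.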

## **3 Data Assimilation Comparison**

In this section, we perform a comparative study of the two reduced order models (HST and EST) introduced above by using data assimilation tools. The particular methodology that we make use of is that of particle filters. We will first briefly explain the particle filtering methodology in a generic framework:

Let $X$ and $Z$ be two processes defined on a given probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The process $X$ is usually called the *signal process* or the *truth* and $Z$ is the *observation process*. In this paper, $X$ is the pathwise solution of a (deterministic) shell model. The pair of processes $(X, Z)$ forms the basis of the nonlinear filtering problem, which consists in finding the best approximation of the posterior distribution of the signal $X\_t$ given the observations $Z\_{t\_1}, Z\_{t\_2}, \ldots, Z\_{t\_n}$.<sup>3</sup> The posterior distribution of the signal at time $t$ is denoted by $\pi\_t$. We let $d\_X$ be the dimension of the state space and $d\_Z$ be the dimension of the observation space. This mixed continuous-discrete time framework can be embedded into a fully discrete framework, whereby one is interested in computing the conditional probability law of the signal at the time corresponding to the observation time. In other words, one wants to compute the conditional distribution $\pi\_{t\_n}$ of $X(t\_n)$ given the data $Z(t\_1), Z(t\_2), \ldots, Z(t\_n)$. The process $X$ is assumed to be a Markov process, and we will denote by $\mathcal{K}\_n$ its transition kernel, that is

$$\mathcal{K}\_{n} : \mathbb{R}^{d\_X} \times \mathcal{B}(\mathbb{R}^{d\_X}) \to [0, 1], \; \mathcal{K}\_{n}(x, B) = \mathbb{P}(X\_{t\_{n}} \in B \,|\, X\_{t\_{n-1}} = x) \tag{50}$$

for any Borel measurable set $B \in \mathcal{B}(\mathbb{R}^{d\_X})$ and $x \in \mathbb{R}^{d\_X}$. The process $Z$ models noisy measurements of the truth, using the so-called *observation operator* $\mathcal{H} : \mathbb{R}^{d\_X} \to \mathbb{R}^{d\_Z}$:

$$Z\_n = \mathcal{H}(X\_{t\_n}) + V\_n \tag{51}$$

where $\{V\_n\}\_{n \geq 0}$ are independent identically distributed random variables that represent the measurement noise and $\mathcal{H}$ is a Borel-measurable function. In this paper we will assume that $\{V\_n\}\_{n \geq 0}$ have standard normal distributions, but the same methodology can be applied to more general distributions. Observations are incorporated into the system at *assimilation times*. The following recursion formula holds (see [2])

$$
\pi\_n = g\_n \star \pi\_{n-1} \mathcal{K}\_n \tag{52}
$$

where '$\star$' denotes the projective product (see e.g. Definition 10.4 in [2]).
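On a finite state space the recursion (52) reduces to a matrix-vector product followed by a pointwise multiplication and a normalisation. The following toy sketch illustrates this; the three-state chain, the transition matrix `K`, the observation value and the identity observation operator are all made-up assumptions.

```python
import numpy as np

def filter_step(pi_prev, K, likelihood):
    """One step of pi_n = g_n * (pi_{n-1} K_n) on a finite state space.

    The projective product multiplies the predicted law by the likelihood
    and renormalises the result to a probability vector.
    """
    predicted = pi_prev @ K                  # prediction with the transition kernel
    unnormalised = likelihood * predicted    # weighting by the likelihood
    return unnormalised / unnormalised.sum()

states = np.array([-1.0, 0.0, 1.0])          # hypothetical state values
K = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])              # hypothetical transition matrix
pi_prev = np.full(3, 1.0 / 3.0)

z = 0.8                                      # hypothetical observation
likelihood = np.exp(-0.5 * (z - states) ** 2)  # standard normal noise, H = identity

pi_new = filter_step(pi_prev, K, likelihood)
print(pi_new)
```

The posterior mass concentrates on the state closest to the observation, here the state with value 1.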

In the following, we compare approximations of the posterior distribution of the signal using *particle filters*. These are sequential Monte Carlo methods which generate approximations of the posterior distribution *πt* using sets of *particles*. That is, they generate approximations that are (random) measures of the form

$$
\pi\_n \approx \sum\_{\ell} \mathbf{w}\_n^{\ell} \delta(\mathbf{x}\_n^{\ell}),
$$

where $\delta$ is the Dirac delta distribution, $\mathbf{w}\_n^{1}, \mathbf{w}\_n^{2}, \ldots$ are the *weights* of the particles and $\mathbf{x}\_n^{1}, \mathbf{x}\_n^{2}, \ldots$ are their corresponding positions. Particle filters are used to make inferences about the signal process by using Bayes' theorem, the time-evolution induced by the signal $X$, and the observation process $Z$.

<sup>3</sup> For a mathematical introduction on the subject, see e.g. [2].

In a standard particle filter, the particles evolve between assimilation times according to the law of the signal. As we explain below, at each assimilation time the observation is incorporated into the system through the *likelihood function*:

$$g\_t^{z\_t} \colon \mathbb{R}^{d\_X} \to \mathbb{R}\_+, \ g\_t^{z\_t}(x) = g\_t(z\_t - \mathcal{H}(x)) \quad \text{such that} \quad \mathbb{P}(Z\_t \in dz\_t \,|\, X\_t = x) = g\_t^{z\_t}(x)\, dz\_t \tag{53}$$

and all particles are weighted depending on the likelihood of their corresponding position, given the observation. More precisely, the particle $\ell$ is given the weight $w\_n^{\ell} = g\_n^{Z\_n}(x\_n^{\ell})$. Heuristically, the particle weight measures how close the particle trajectory is to the signal trajectory. A selection procedure is then applied to the set of weighted particles. Particles with higher conditional likelihood (guided by the observation) have higher weights and will be multiplied, while those which have small likelihoods will be eliminated. For the basic particle filter, this is done by sampling with replacement from the population of particles, with corresponding probabilities proportional to their weights.
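The weighting and selection step described above can be sketched as follows. This is a minimal illustration of the basic (bootstrap) particle filter, assuming standard normal measurement noise; the ensemble, the observation `z` and the observation operator (modal energies $|a\_i|^2$) are hypothetical, and the function name is our own.

```python
import numpy as np

def assimilation_step(particles, z, obs_operator, rng):
    """Weight particles by a standard normal likelihood, then resample.

    Returns the resampled ensemble, the normalised weights and the
    negative log-likelihood (cost) of each original particle.
    """
    innovation = z - obs_operator(particles)        # shape (n_particles, d_Z)
    cost = 0.5 * np.sum(innovation ** 2, axis=1)
    weights = np.exp(-(cost - cost.min()))          # shift for numerical stability
    weights /= weights.sum()
    # Sampling with replacement, probabilities proportional to the weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], weights, cost

rng = np.random.default_rng(0)
n_particles = 200
particles = (rng.standard_normal((n_particles, 3))
             + 1j * rng.standard_normal((n_particles, 3))) / np.sqrt(2.0)

def modal_energies(a):
    return np.abs(a) ** 2                           # hypothetical observation operator

z = np.array([0.4, 0.3, 0.3])                       # hypothetical observation
resampled, weights, cost = assimilation_step(particles, z, modal_energies, rng)
print(weights.max(), len(np.unique(resampled, axis=0)))
```

Because the weights decrease with the cost, the weighted mean cost never exceeds the unweighted mean cost, which is the sense in which selection favours particles close to the observation.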

A Monte Carlo implementation of the transition kernel of the signal may not always yield good approximations. In many situations one replaces the original transition kernel with likelihood-informed importance proposals, leading to much better approximations. One situation where this is necessary is when the original process is actually deterministic (aside from the initial position, which is assumed to be random). This is the case in our paper.

To overcome the collapse of the particle filter when using deterministic transition kernels, one can use a Markov Chain Monte Carlo procedure that leaves the deterministic dynamics invariant. This procedure can be costly and might not always introduce enough spread into the sample. In this paper, we propose a different approach, which we illustrate numerically in Sect. 3.1.2 below. In particular, we propose two different transition kernels based on the physical conservation properties:


## *3.1 Numerical Studies*

#### **3.1.1 Numerical Implementation**

The models are discretised using the stochastic SSPRK3 scheme which is documented, for example, in [11]. In our specific case, for instance, the HST model is discretised as

$$\begin{aligned} \mathbf{q}\_1^n &= \mathbf{a}\_n + g(\mathbf{a}\_n^\* \times \mathbb{D}\mathbf{a}\_n^\*)\Delta t + g(\mathbf{b} \times \mathbb{D}\mathbf{a}\_n^\*)\Delta W \\ \mathbf{q}\_2^n &= (3/4)\mathbf{a}\_n + (1/4)(\mathbf{q}\_1^n + g((\mathbf{q}\_1^n)^\* \times \mathbb{D}(\mathbf{q}\_1^n)^\*)\Delta t + g(\mathbf{b} \times \mathbb{D}(\mathbf{q}\_1^n)^\*)\Delta W) \\ \mathbf{a}\_{n+1} &= (1/3)\mathbf{a}\_n + (2/3)(\mathbf{q}\_2^n + g((\mathbf{q}\_2^n)^\* \times \mathbb{D}(\mathbf{q}\_2^n)^\*)\Delta t + g(\mathbf{b} \times \mathbb{D}(\mathbf{q}\_2^n)^\*)\Delta W) \end{aligned}$$

where $\Delta t$ denotes the timestep and $\Delta W$ the increment of the driving Brownian motion. Further, $\mathbf{a}\_n$ is the approximate complex vector amplitude at time $t = n\Delta t$. The EST model is discretised completely analogously. For the numerical simulations we chose the following triad throughout. We set

$$\mathbf{k} = [1, 0, 0], \quad \mathbf{p} = [0, -1, 1], \quad \mathbf{q} = [-1, 1, -1], \tag{54}$$

with parities $s\_k = 1$, $s\_p = -1$, $s\_q = -1$ and the initial value $\mathbf{a}\_0 = \frac{1}{\sqrt{3}}[1, 1, 1]$. We set the parameter = [1, 1, 1] and used a time step size of $\Delta t = 0.0005$.
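The three-stage update above can be sketched generically as below. This is an illustrative sketch under stated assumptions, not the code used for the experiments: `ssprk3_step`, `drift` and `diffusion` are hypothetical names, and the two right-hand sides are simple placeholders standing in for the triad terms $g(\mathbf{a}^\* \times \mathbb{D}\mathbf{a}^\*)$ and $g(\mathbf{b} \times \mathbb{D}\mathbf{a}^\*)$, which are not reproduced here. The initial value and time step are those quoted in the text.

```python
import math
import random


def ssprk3_step(a, drift, diffusion, dt, dW):
    """One stochastic SSPRK3 step for da = drift(a) dt + diffusion(a) dW."""
    def euler(u):
        # Forward Euler stage using the same increments dt and dW in every stage.
        fu, gu = drift(u), diffusion(u)
        return [ui + fi * dt + gi * dW for ui, fi, gi in zip(u, fu, gu)]

    q1 = euler(a)
    q2 = [0.75 * ai + 0.25 * ei for ai, ei in zip(a, euler(q1))]
    return [ai / 3.0 + 2.0 / 3.0 * ei for ai, ei in zip(a, euler(q2))]


# Placeholder right-hand sides (assumptions for illustration only).
def drift(u):
    return [1j * ui for ui in u]


def diffusion(u):
    return [0.1j * ui for ui in u]


rng = random.Random(1)
dt = 0.0005
a = [1 / math.sqrt(3)] * 3
for _ in range(1000):
    dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
    a = ssprk3_step(a, drift, diffusion, dt, dW)
```

Passing the right-hand sides as callables keeps the time stepper independent of the model, so the same routine could in principle serve both stochastic triad models.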

#### **3.1.2 Data Assimilation for the Deterministic Model**

We illustrate the failure of the particle filter with deterministic transition kernel in Fig. 1. In this case, the particle filtering is performed for an ensemble of $n = 25$ particles evolving according to the deterministic triad dynamics. The initial ensemble is spread around the initial value $\mathbf{a}\_0$ of the signal according to a Gaussian distribution with standard deviation $1/\sqrt{600}$ and, in particular, does not contain the true initial point. Data assimilation is performed every 10 time units and the observations are taken from the modal energies of the truth with an observation error $\eta$ distributed as $\eta \sim \mathcal{N}(\mathbf{0}, \mathbf{C})$ with covariance $\mathbf{C} = \operatorname{diag}(0.005^2, 0.05^2, 0.05^2)$. We observe that both the bias and the RMSE keep increasing with time to values much larger than the observation error. Moreover, the number of *distinct* particles decreases rapidly: after 30 steps, a single particle remains, and it is not the true particle since the true particle was not part of the initial cloud. The particle filter does not work.
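The loss of distinct particles can be reproduced in a toy setting. Under a deterministic transition kernel, particles that coincide after resampling never separate again, so the number of distinct positions can only decrease. The sketch below is an assumption-laden illustration: it tracks particle labels instead of positions, and draws hypothetical random weights in place of the likelihood weights of the experiment.

```python
import random


def count_distinct_after_resampling(n_particles, n_steps, rng):
    """Track distinct particle labels under repeated resampling with no added noise."""
    labels = list(range(n_particles))
    history = [len(set(labels))]
    for _ in range(n_steps):
        # Hypothetical weights; in the experiment they come from the likelihood.
        weights = [rng.random() for _ in labels]
        labels = rng.choices(labels, weights=weights, k=n_particles)
        history.append(len(set(labels)))
    return history


history = count_distinct_after_resampling(
    n_particles=25, n_steps=30, rng=random.Random(0)
)
```

The list `history` is non-increasing by construction, which mirrors the behaviour reported for the number of unique particle positions.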

#### **3.1.3 Reduced Order Model Realisations**

The deterministic model in Fig. 1a exhibits continually oscillating triad amplitudes. Plotted are the modal energies. Writing $\mathbf{a} = [a\_k, a\_p, a\_q]$ we call the real value $a\_k a\_k^\*$ the *energy of mode k*. Similarly for the two other modes.
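As a small worked example of this definition, the modal energies of the initial value from Sect. 3.1.1 can be computed as follows. The helper name `modal_energies` is hypothetical.

```python
import math


def modal_energies(a):
    """Return the real numbers a_j * conj(a_j), one per mode."""
    return [(aj * aj.conjugate()).real for aj in a]


# Initial value a_0 = [1, 1, 1] / sqrt(3) from Sect. 3.1.1.
a0 = [complex(1 / math.sqrt(3))] * 3
energies = modal_energies(a0)  # each entry equals 1/3 up to rounding
```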

We simulate the model realisations for different noise scenarios. We first simulate the effect of noise in each single mode. Let the noise amplitude vector be $\mathbf{b} = [b\_k, b\_p, b\_q]$. Then we simulate the two models for $\mathbf{b} = [0.1, 0, 0]$, $\mathbf{b} = [0, 0.1, 0]$, and $\mathbf{b} = [0, 0, 0.1]$. The trajectories of the modal energies for $n = 20$ realisations of the driving noise for each scenario are shown in Fig. 2. We also simulate the case of full noise for the noise amplitude vector $\mathbf{b} = [0.1, 0.05, 0.01]$, which was calibrated to the data assimilation objective using the procedure explained in section "Calibration of the Noise Amplitude" in Appendix 3. The ensemble of $n = 20$ realisations of the driving noise in the full noise case is depicted in Fig. 2d and h. In all cases, it can be observed that the mean energy amplitudes of the modes are damped in both stochastic models. Furthermore, we can experimentally verify the conservation of triad energy for the EST model and the conservation of triad helicity for the HST model.

**Fig. 1** Deterministic Triad Model. The horizontal axis shows time. (**a**) Evolution of the modal energies (colored lines) as well as the total energy (dashed line) and helicity (dash-dotted line). The deterministic model exhibits continually oscillating modal energies. The simulation also confirms the conservation of energy and helicity. (**b**–**e**) Data assimilation for the deterministic model using particle filter. (**b**) Evolution of the energy of mode **p** of the signal (grey line) and evolution of the energy of mode **p** for the particle ensemble (blue lines). Noisy observations (black stars) are made and assimilated every 10 time units. (**c**) The number of unique particle positions in the filtering ensemble. (**d**) The bias of the particle ensemble with respect to the observations. (**e**) The RMSE of the particle ensemble with respect to the observations

**Fig. 2** Model realisations for both stochastic triad models. Plotted are the modal energies (colored lines), total energy (black line) and helicity (grey line). The respective thick lines are the ensemble means, and the thin lines represent the different stochastic realisations. (**a** + **e**) The noise coefficient $\mathbf{b} = [0.1, 0, 0]$. (**b** + **f**) The noise coefficient $\mathbf{b} = [0, 0.1, 0]$. (**c** + **g**) The noise coefficient $\mathbf{b} = [0, 0, 0.1]$. (**d** + **h**) The noise coefficient $\mathbf{b} = [0.1, 0.05, 0.01]$

**Fig. 3** Evolution of statistical moments of the modal energies for both stochastic models in the full noise case. The statistics are computed pointwise in time from an ensemble of 1000 realisations up to a final time of 150. (**a** + **e**) Ensemble mean. (**b** + **f**) Ensemble standard deviation. (**c** + **g**) Ensemble skew. (**d** + **h**) Ensemble kurtosis

#### **3.1.4 Model Statistics**

Figure 3 shows various statistics of the generated ensembles of *n* = 1000 particles for the HST and EST triad models in the full noise case introduced above. We plot the ensemble mean, standard deviation, skew, and kurtosis.
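Pointwise-in-time ensemble statistics of this kind can be computed as in the sketch below. The conventions are assumptions, since the text does not state them: population (biased) moments are used and the kurtosis is not reduced by 3. The function name `ensemble_moments` and the sample data are hypothetical.

```python
import math


def ensemble_moments(samples):
    """Mean, standard deviation, skewness and kurtosis of one time slice."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    std = math.sqrt(m2)  # must be non-zero for the standardised moments below
    return mean, std, m3 / std ** 3, m4 / m2 ** 2


# energies[t][i]: hypothetical modal energy of realisation i at time index t.
energies = [[0.30, 0.34, 0.36, 0.40], [0.20, 0.25, 0.35, 0.60]]
stats_over_time = [ensemble_moments(time_slice) for time_slice in energies]
```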

The effect of large noise coefficients is exemplified in Fig. 4. We observe that the HST model explodes whereas the EST model is more tolerant to large noise coefficients and, even in the extreme case, does not become unstable in the mean. The ensemble means are computed from $n = 500$ realisations, using the noise coefficient $\mathbf{b} = [0, 1, 0]$. Moreover, to stress the EST model, we also ran the same experiment with a noise coefficient of $\mathbf{b} = [0, 10, 0]$ for the EST model alone.

The ensemble mean for a large number of particles, $n = 20{,}000$, is shown in Fig. 5. We can observe that, compared to Fig. 3a and e, the oscillations after time 40 are reduced for a very large number of particles. Hence, we believe that the system stabilizes in the mean to stationary modal energies as the limiting effect of the noise.

**Fig. 4** The effect of large noise coefficients on the mean. Evolution of the mean modal energies (colored lines), mean total energy (black), and mean helicity (grey) for the HST (**a**) and EST (**b**) model with noise coefficient $\mathbf{b} = [0, 1, 0]$. (**c**) Evolution of the mean modal energies, mean total energy, and mean helicity for the EST model with the strong noise coefficient $\mathbf{b} = [0, 10, 0]$

**Fig. 5** Evolution of mean modal energies for a very large number of realisations for the EST (**a**) and HST (**b**) models. The mean is computed from 20,000 particles in the full noise case

#### **3.1.5 Data Assimilation**

Using the two stochastic models in the full noise case described above, we perform the data assimilation tests using the following framework:

The signal process (the *truth*) is given by the deterministic triad model. The observations are the modal energies of the deterministic model, observed every 10 time units, and perturbed by noise of the form

$$
\eta \sim \mathcal{N}(\mathbf{0}, \mathbf{C}), \tag{55}
$$

where the covariance matrix $\mathbf{C} \in \mathbb{R}^{3 \times 3}$ is chosen to be the diagonal matrix $\mathbf{C} = \operatorname{diag}(0.005^2, 0.05^2, 0.05^2)$.

We use the sequential importance resampling (SIR) particle filter to assimilate the periodically observed signal process under the influence of observation noise. The particles evolve according to the stochastic triad models. Figures 6 and 7 show the results of filtering the ensemble of *n* = 100 particles of the EST and HST triad models, respectively. The ensembles are assessed in terms of the bias and RMSE statistics. We analyse the comparison details below:

**Fig. 6** Filtering experiment for EST model using SIR particle filter. (**a**–**c**) Ensemble evolution for 100 particles in mode k (**a**, red), p (**b**, blue), and q (**c**, green). The signal (grey) is the deterministic model and the observations (black stars) are noisy and taken and assimilated every 10 time units. (**d**–**f**) Bias of the filtering ensemble. (**g**–**h**) RMSE of the filtering ensemble

#### Mode k

This is the least energetic of all the modes (which is why we observe it with the smallest measurement noise). The cloud of particles is well placed around the truth even with the small sample. The bias remains small for both the HST and the EST versions and it reduces significantly when observations are assimilated more frequently (see Fig. 9 in Appendix 3) as well as when we use a large number (500) of particles (see Fig. 10 in Appendix 3). The RMSE remains small in all cases and decreases (though not substantially) when the DA step is small.

#### Modes p and q

These are the two energetic modes of the system. We used here a measurement noise that is one order of magnitude larger. Despite this, the results remain equally good. The cloud of particles provides a good envelope for the truth at all times. This validates the choice of the stochasticity: the uncertainty is properly modelled.

**Fig. 7** Filtering experiment for HST model using SIR particle filter. (**a**–**c**) Ensemble evolution for 100 particles in mode k (**a**, red), p (**b**, blue), and q (**c**, green). The signal (grey) is the deterministic model and the observations (black stars) are noisy and taken and assimilated every 10 time units. (**d**–**f**) Bias of the filtering ensemble. (**g**–**h**) RMSE of the filtering ensemble

For both models the bias can become very large, reaching 30% of the size of the oscillations for the HST model and 25% of the size of the oscillations for the EST model. As expected, it is drastically reduced when observations are assimilated more frequently. The RMSE for mode p is also large but substantially smaller for mode q. The addition of more intermediate DA steps or more particles has a less pronounced effect for the q mode.
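The two assessment statistics can be sketched as follows. The text does not spell out the formulas, so the definitions below are assumptions: the bias is taken as the ensemble mean minus a reference value, and the RMSE as the root of the mean squared deviation of the members from that reference. The function names and sample values are hypothetical.

```python
import math


def ensemble_bias(ensemble, reference):
    """Ensemble mean minus the reference value (assumed definition)."""
    return sum(ensemble) / len(ensemble) - reference


def ensemble_rmse(ensemble, reference):
    """Root mean squared deviation of the members from the reference."""
    return math.sqrt(sum((x - reference) ** 2 for x in ensemble) / len(ensemble))


# Hypothetical modal energies of three particles and one reference value.
members = [0.31, 0.35, 0.42]
bias = ensemble_bias(members, reference=0.33)
rmse = ensemble_rmse(members, reference=0.33)
```

With these definitions the RMSE is never smaller than the absolute bias, so a large bias always shows up in the RMSE as well.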

**Remark 2** We record the Effective Sample Size (ESS) for a typical run (for both EST and HST) in Fig. 11. As usual, the ESS is computed just before the application of the resampling procedure. The ESS is seen to decay dramatically from 100 down to single-digit numbers at most assimilation times.
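For reference, a commonly used estimate of the ESS is the squared sum of the weights divided by the sum of the squared weights. The text does not state which formula was used, so the sketch below is an assumption, and `effective_sample_size` is a hypothetical name.

```python
def effective_sample_size(weights):
    """ESS = (sum w)^2 / sum w^2; unnormalised weights are accepted."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


uniform = effective_sample_size([1.0] * 100)            # all particles contribute equally
degenerate = effective_sample_size([1.0, 0.0, 0.0, 0.0])  # one particle carries all weight
```

Under this definition the ESS ranges from 1 (complete degeneracy) to the ensemble size (uniform weights).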

**Remark 3** We record the results over 10 independent runs of the filtering experiment for the EST model with 500 ensemble members in Fig. 12. More precisely, in graphs 12a, b and c each, we plot the mean across the 10 independent filtering runs together with the evolution of the signal and the individual ensemble means for each mode. The mean bias as well as the envelope obtained from the independent runs are shown in graphs 12d, e and f. The same is shown for the RMSE in graphs 12g, h and i. Compared with a single run of the same experiment reported in Fig. 10 we observe the approximations are now near perfect (the statistical error has been drastically reduced).

## **4 Conclusions**

The introduction of stochasticity into the deterministic triad models leads to two new stochastic models. Stochasticity is introduced in a principled way (rather than ad hoc). It starts with a full-scale fluid dynamic model which is randomly perturbed. At the full scale, the stochastic parametrisation models the small-scale effects in fluid dynamics modelling. In particular, it efficiently captures the high-frequency small-scale dynamics and correctly correlates it with the slow, large-scale fluid motion. In addition, it is constrained to conserve either the helicity or the kinetic energy of the system. This inspires two different stochastic triad models of Euler type which we compare using data assimilation procedures based on particle filtering. The methodology we employ can be used as a benchmark when analysing new types of stochastic parametrisations: ours is the first study that assesses the efficiency of stochastic parametrisations from a data assimilation perspective.

The introduction of stochasticity ensures that the correct spread (one that preserves the physical properties of the system) is introduced in the ensemble of particles. In its absence, the particle filter degenerates quite rapidly: after a few DA steps, a single particle survives the culling procedure, and it does not offer a good approximation of the truth. A purely deterministic transition kernel does not work, generating a rapid degeneracy of the particle filter.

The two stochastic systems (one which preserves helicity and the other one which preserves energy) are analysed using a standard particle filter. There is no need for additional procedures (such as tempering, nudging, or jittering). They perform equally well from the viewpoint of DA: both the RMSE and the bias are drastically reduced and stabilised when the noise is carefully calibrated. The two different stochastic kernels require different noise calibrations in order to perform well in similar data assimilation scenarios. This is to be expected, given that the underlying stochastic parametrisations preserve different physical quantities.

## **Appendix 1: Notation and Basic Identities**

## *Notation*

In this work we use the following notation. We write $\mathbb{Z}$, $\mathbb{R}$ and $\mathbb{C}$ for the sets of integers, real numbers and complex numbers, respectively. Boldface letters denote three-dimensional complex vectors. For two complex vectors **a** and **b** with components $a\_j$ and $b\_j$, their dot product is denoted by

$$\mathbf{a} \cdot \mathbf{b} = \sum\_{j=1}^{3} a\_j b\_j = a\_j b\_j \in \mathbb{C}.$$

This paper uses the Einstein convention of summing over repeated indices. The norm of the complex vector **a** is defined as

$$|\mathbf{a}| = \sqrt{\mathbf{a} \cdot \mathbf{a}^\*} = \sqrt{a\_j a\_j^\*} \ge 0,$$

with the superscript symbol ∗ denoting complex conjugation. Further, the cross product of two vectors **a** and **b** is given by

$$\mathbf{a} \times \mathbf{b} = (a\_2b\_3 - a\_3b\_2, a\_3b\_1 - a\_1b\_3, a\_1b\_2 - a\_2b\_1) \in \mathbb{C}^3.$$

The gradient of a scalar field $\phi : \mathcal{D} \subseteq \mathbb{R}^3 \to \mathbb{C}$ at a point $\mathbf{x} \in \mathcal{D}$ is denoted by

$$\nabla \phi(\mathbf{x}) = \begin{pmatrix} \partial\_1 \phi(\mathbf{x}), \,\partial\_2 \phi(\mathbf{x}), \,\partial\_3 \phi(\mathbf{x}) \end{pmatrix} \in \mathbb{C}^3.$$

The divergence of a vector field $\boldsymbol{\psi} : \mathcal{D} \subseteq \mathbb{R}^3 \to \mathbb{C}^3$ with components $\psi\_j$ at a point $\mathbf{x} \in \mathcal{D}$ is defined as

$$\nabla \cdot \boldsymbol{\psi}(\mathbf{x}) = \partial\_1 \psi\_1(\mathbf{x}) + \partial\_2 \psi\_2(\mathbf{x}) + \partial\_3 \psi\_3(\mathbf{x}) = \partial\_j \psi\_j(\mathbf{x}) \in \mathbb{C}$$

and the curl of $\boldsymbol{\psi}$ at **x** is given by

$$\nabla \times \boldsymbol{\psi}(\mathbf{x}) = \left(\partial\_2 \psi\_3(\mathbf{x}) - \partial\_3 \psi\_2(\mathbf{x}), \partial\_3 \psi\_1(\mathbf{x}) - \partial\_1 \psi\_3(\mathbf{x}), \partial\_1 \psi\_2(\mathbf{x}) - \partial\_2 \psi\_1(\mathbf{x})\right) \in \mathbb{C}^3.$$

## *Vector Identities*

For vectors **a**, **b**, **c** and **d**, we have the following algebraic vector identities

$$\begin{aligned} \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) &= \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}) = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b}), \\ \mathbf{a} \times (\mathbf{b} \times \mathbf{c}) &= (\mathbf{a} \cdot \mathbf{c})\mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\mathbf{c}, \\ (\mathbf{a} \times \mathbf{b}) \cdot (\mathbf{c} \times \mathbf{d}) &= (\mathbf{a} \cdot \mathbf{c})(\mathbf{b} \cdot \mathbf{d}) - (\mathbf{b} \cdot \mathbf{c})(\mathbf{a} \cdot \mathbf{d}), \\ \mathbf{a} \times \mathbf{a} &= \mathbf{0}. \end{aligned}$$
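These identities also hold for complex vectors when the dot product is the bilinear one from the Notation above (no complex conjugation). The following sketch checks three of them numerically on random complex vectors; the helper names are hypothetical and the check is an illustration, not a proof.

```python
import random


def dot(a, b):
    """Bilinear dot product a_j b_j, without complex conjugation."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product with the component formula from the Notation."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def random_complex_vector(rng):
    return tuple(complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3))


rng = random.Random(0)
a, b, c, d = (random_complex_vector(rng) for _ in range(4))

# Cyclic invariance of the scalar triple product.
triple_gap = abs(dot(a, cross(b, c)) - dot(b, cross(c, a)))

# Vector triple product: a x (b x c) = (a.c) b - (a.b) c.
lhs = cross(a, cross(b, c))
rhs = tuple(dot(a, c) * bj - dot(a, b) * cj for bj, cj in zip(b, c))
triple_product_gap = max(abs(x - y) for x, y in zip(lhs, rhs))

# (a x b).(c x d) = (a.c)(b.d) - (b.c)(a.d).
quadruple_gap = abs(dot(cross(a, b), cross(c, d))
                    - (dot(a, c) * dot(b, d) - dot(b, c) * dot(a, d)))
```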

Moreover, we have the vector calculus identity

$$(\mathbf{a} \cdot \nabla)\mathbf{b} + b\_j \nabla a\_j = -\mathbf{a} \times (\nabla \times \mathbf{b}) + \nabla(\mathbf{a} \cdot \mathbf{b}).\tag{56}$$

# **Appendix 2: Derivation of Triad Models**

## *Deterministic Euler*

We compute the projection of the terms corresponding to the deterministic Euler vorticity equation in curl form:

$$
\partial\_t \mathbf{v} - \mathbf{v} \times (\nabla \times \mathbf{v}) + \frac{1}{2} \nabla |\mathbf{v}|^2 = -\nabla p \tag{57}
$$

onto the helical basis. For the time-derivative we get

$$\begin{aligned} \int\_{\mathcal{D}} \partial\_{t} \mathbf{v}(\mathbf{x}, t) \cdot \mathbf{h}\_{s\_{k}}^{\*} (\mathbf{k}) e^{-i \mathbf{k} \cdot \mathbf{x}} \, \mathrm{d}\mathbf{x} &= \sum\_{\mathbf{p}} \sum\_{s\_{p}} \partial\_{t} a\_{s\_{p}}(\mathbf{p}, t) \mathbf{h}\_{s\_{p}}(\mathbf{p}) \cdot \mathbf{h}\_{s\_{k}}^{\*} (\mathbf{k}) \int\_{\mathcal{D}} e^{i(\mathbf{p} - \mathbf{k}) \cdot \mathbf{x}} \, \mathrm{d}\mathbf{x} \\ &= L^{3} \sum\_{s\_{p}} \partial\_{t} a\_{s\_{p}}(\mathbf{k}, t) \mathbf{h}\_{s\_{p}}(\mathbf{k}) \cdot \mathbf{h}\_{s\_{k}}^{\*} (\mathbf{k}) \\ &= L^{3} \partial\_{t} a\_{s\_{k}}(\mathbf{k}, t) \mathbf{h}\_{s\_{k}}(\mathbf{k}) \cdot \mathbf{h}\_{s\_{k}}^{\*} (\mathbf{k}) \\ &= 2L^{3} \partial\_{t} a\_{s\_{k}}(\mathbf{k}, t). \end{aligned}$$
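The last two steps use that helical basis vectors of opposite parity are orthogonal and that $\mathbf{h}\_{s\_k}(\mathbf{k}) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}) = 2$. The sketch below checks both properties numerically. It assumes a Waleffe-type construction $\mathbf{h}\_s(\mathbf{k}) = \boldsymbol{\nu} \times \boldsymbol{\kappa} + i s \boldsymbol{\nu}$ with $\boldsymbol{\kappa} = \mathbf{k}/|\mathbf{k}|$ and a unit vector $\boldsymbol{\nu}$ orthogonal to $\mathbf{k}$; the definition used in this chapter lies outside this section, so this form and the helper names are assumptions.

```python
import math


def dot(a, b):
    """Bilinear dot product a_j b_j, without complex conjugation."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def conj(v):
    return tuple(x.conjugate() for x in v)


def helical_vector(k, s, z=(0.0, 0.0, 1.0)):
    """Assumed construction h_s(k) = nu x kappa + i s nu; z must not be parallel to k."""
    norm_k = math.sqrt(sum(x * x for x in k))
    kappa = tuple(x / norm_k for x in k)
    nu = cross(z, kappa)
    norm_nu = math.sqrt(sum(x * x for x in nu))
    nu = tuple(x / norm_nu for x in nu)
    return tuple(x + 1j * s * y for x, y in zip(cross(nu, kappa), nu))


k = (1.0, 0.0, 0.0)
h_plus, h_minus = helical_vector(k, +1), helical_vector(k, -1)
same_parity = dot(h_plus, conj(h_plus))    # equals 2
mixed_parity = dot(h_plus, conj(h_minus))  # equals 0
```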

The vorticity term gives

$$\begin{split} &\int\_{\mathcal{D}} \left( \mathbf{v}(\mathbf{x},t) \times \boldsymbol{\omega}(\mathbf{x},t) \right) \cdot \mathbf{h}\_{s\_{k}}^{\*} (\mathbf{k}) e^{-i\mathbf{k}\cdot\mathbf{x}} \, \mathrm{d}\mathbf{x} \\ &= \sum\_{\mathbf{p},\mathbf{q}} \sum\_{s\_{p},s\_{q}} a\_{s\_{p}}(\mathbf{p},t) s\_{q} |\mathbf{q}| a\_{s\_{q}}(\mathbf{q},t) \mathbf{h}\_{s\_{p}}(\mathbf{p}) \times \mathbf{h}\_{s\_{q}}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*} (\mathbf{k}) \int\_{\mathcal{D}} e^{i(\mathbf{p}+\mathbf{q}-\mathbf{k}) \cdot \mathbf{x}} \, \mathrm{d}\mathbf{x} \\ &= L^{3} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} a\_{s\_{p}}^{\*}(\mathbf{p},t) s\_{q} |\mathbf{q}| a\_{s\_{q}}^{\*}(\mathbf{q},t) \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \times \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}) . \end{split} \tag{58}$$

Note that we can write (58) in a form which is symmetric in **p** and **q** since, renaming the indices,

$$\begin{split} &\sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} a\_{s\_{p}}^{\*}(\mathbf{p},t) s\_{q} |\mathbf{q}| a\_{s\_{q}}^{\*}(\mathbf{q},t) \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \times \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}) \\ &= \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} a\_{s\_{q}}^{\*}(\mathbf{q},t) s\_{p} |\mathbf{p}| a\_{s\_{p}}^{\*}(\mathbf{p},t) \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \times \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}) \\ &= - \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} a\_{s\_{q}}^{\*}(\mathbf{q},t) s\_{p} |\mathbf{p}| a\_{s\_{p}}^{\*}(\mathbf{p},t) \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \times \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}). \end{split}$$

Therefore,

$$\int\_{\mathcal{D}} (\mathbf{v}(\mathbf{x},t) \times \boldsymbol{\omega}(\mathbf{x},t)) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}) e^{-i\mathbf{k}\cdot\mathbf{x}} \, \mathrm{d}\mathbf{x} \tag{59}$$

$$= \frac{L^3}{2} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_p, s\_q} (s\_q|\mathbf{q}|-s\_p|\mathbf{p}|) a\_{s\_p}^\*(\mathbf{p},t) a\_{s\_q}^\*(\mathbf{q},t) \mathbf{h}\_{s\_p}^\*(\mathbf{p}) \times \mathbf{h}\_{s\_q}^\*(\mathbf{q}) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}). \tag{60}$$

Moreover, the gradient terms in (57) vanish upon expansion into helical modes. Thus, the Euler equation (57) in helical basis becomes

$$\partial\_t a\_{s\_k}(\mathbf{k}, t) = -\frac{1}{4} \sum\_{\mathbf{p} + \mathbf{q} + \mathbf{k} = 0} \sum\_{s\_p, s\_q} (s\_p|\mathbf{p}| - s\_q|\mathbf{q}|) a\_{s\_p}^\*(\mathbf{p}, t) a\_{s\_q}^\*(\mathbf{q}, t) \mathbf{h}\_{s\_p}^\*(\mathbf{p}) \times \mathbf{h}\_{s\_q}^\*(\mathbf{q}) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}).$$

# *SALT Euler*

We expand the 3D SALT Navier-Stokes equation (23) using (6) and (37). Assume $b\_s(-\mathbf{p}) = b\_s^\*(\mathbf{p})$.

$$\begin{aligned} \int\_{\mathcal{D}} \mathrm{d}\mathbf{v}(\mathbf{x},t) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}) e^{-i\mathbf{k}\cdot\mathbf{x}} \,\mathrm{d}\mathbf{x} &= \sum\_{\mathbf{p}} \sum\_{s\_p} \mathrm{d}a\_{s\_p}(\mathbf{p},t) \mathbf{h}\_{s\_p}(\mathbf{p}) \cdot \mathbf{h}\_{s\_k}^\*(\mathbf{k}) \int\_{\mathcal{D}} e^{i(\mathbf{p}-\mathbf{k})\cdot\mathbf{x}} \,\mathrm{d}\mathbf{x} \\ &= 2L^3 \mathrm{d}a\_{s\_k}(\mathbf{k},t) \end{aligned}$$

and

$$\begin{split} &\int\_{\mathcal{D}} \mathrm{d}\mathbf{x}\_{t}(\mathbf{x},t) \times \boldsymbol{\omega}(\mathbf{x},t) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}) e^{-i\mathbf{k}\cdot\mathbf{x}} \,\mathrm{d}\mathbf{x} \\ &= \sum\_{\mathbf{p},\mathbf{q}} \sum\_{s\_{p},s\_{q}} \Big[ a\_{s\_{p}}(\mathbf{p},t) \mathrm{d}t + b\_{s\_{p}}(\mathbf{p}) \circ \mathrm{d}W\_{t} \Big] s\_{q}|\mathbf{q}| a\_{s\_{q}}(\mathbf{q},t) \mathbf{h}\_{s\_{p}}(\mathbf{p}) \\ & \qquad \times \mathbf{h}\_{s\_{q}}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}) \int\_{\mathcal{D}} e^{i(\mathbf{p}+\mathbf{q}-\mathbf{k})\cdot\mathbf{x}} \,\mathrm{d}\mathbf{x} \\ &= L^{3} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} \Big[ a\_{s\_{p}}^{\*}(\mathbf{p},t) \mathrm{d}t + b\_{s\_{p}}^{\*}(\mathbf{p}) \circ \mathrm{d}W\_{t} \Big] s\_{q}|\mathbf{q}| a\_{s\_{q}}^{\*}(\mathbf{q},t) \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \\ & \qquad \times \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}). \end{split}$$

Renaming the indices, we can write

$$\begin{split} &L^{3} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} \left[ a\_{s\_{p}}^{\*}(\mathbf{p},t) \mathrm{d}t + b\_{s\_{p}}^{\*}(\mathbf{p}) \circ \mathrm{d}W\_{t} \right] s\_{q} |\mathbf{q}| a\_{s\_{q}}^{\*}(\mathbf{q},t) \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \times \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}) \\ &= L^{3} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} \left[ a\_{s\_{q}}^{\*}(\mathbf{q},t) \mathrm{d}t + b\_{s\_{q}}^{\*}(\mathbf{q}) \circ \mathrm{d}W\_{t} \right] s\_{p} |\mathbf{p}| a\_{s\_{p}}^{\*}(\mathbf{p},t) \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \times \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}) \\ &= -L^{3} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} \left[ a\_{s\_{q}}^{\*}(\mathbf{q},t) \mathrm{d}t + b\_{s\_{q}}^{\*}(\mathbf{q}) \circ \mathrm{d}W\_{t} \right] s\_{p} |\mathbf{p}| a\_{s\_{p}}^{\*}(\mathbf{p},t) \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \times \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}). \end{split}$$

Thus, we arrive at

$$\begin{split} &\int\_{\mathcal{D}} \mathrm{d}\mathbf{x}\_{t}(\mathbf{x},t) \times \boldsymbol{\omega}(\mathbf{x},t) \cdot \mathbf{h}^{\*}\_{s\_{k}}(\mathbf{k}) e^{-i\mathbf{k}\cdot\mathbf{x}} \, \mathrm{d}\mathbf{x} \\ &= \frac{L^{3}}{2} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} \Big[ \big(s\_{q}|\mathbf{q}|b^{\*}\_{s\_{p}}(\mathbf{p}) \circ \mathrm{d}W\_{t}\, a^{\*}\_{s\_{q}}(\mathbf{q},t) - s\_{p}|\mathbf{p}|b^{\*}\_{s\_{q}}(\mathbf{q}) \circ \mathrm{d}W\_{t}\, a^{\*}\_{s\_{p}}(\mathbf{p},t)\big) \\ &\quad + \big(a^{\*}\_{s\_{p}}(\mathbf{p},t)\, \mathrm{d}t\, s\_{q}|\mathbf{q}|a^{\*}\_{s\_{q}}(\mathbf{q},t) - a^{\*}\_{s\_{q}}(\mathbf{q},t)\, \mathrm{d}t\, s\_{p}|\mathbf{p}|a^{\*}\_{s\_{p}}(\mathbf{p},t)\big) \Big] \mathbf{h}^{\*}\_{s\_{p}}(\mathbf{p}) \times \mathbf{h}^{\*}\_{s\_{q}}(\mathbf{q}) \cdot \mathbf{h}^{\*}\_{s\_{k}}(\mathbf{k}). \end{split}$$

Therefore,

$$\begin{split} \mathrm{d}a\_{s\_{k}}(\mathbf{k},t) &= \frac{1}{4} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} \Big[ \big(s\_{q}|\mathbf{q}|b\_{s\_{p}}^{\*}(\mathbf{p}) \circ \mathrm{d}W\_{t}\, a\_{s\_{q}}^{\*}(\mathbf{q},t) - s\_{p}|\mathbf{p}|b\_{s\_{q}}^{\*}(\mathbf{q}) \circ \mathrm{d}W\_{t}\, a\_{s\_{p}}^{\*}(\mathbf{p},t)\big) \\ &\quad + \big(a\_{s\_{p}}^{\*}(\mathbf{p},t)\, \mathrm{d}t\, s\_{q}|\mathbf{q}|a\_{s\_{q}}^{\*}(\mathbf{q},t) - a\_{s\_{q}}^{\*}(\mathbf{q},t)\, \mathrm{d}t\, s\_{p}|\mathbf{p}|a\_{s\_{p}}^{\*}(\mathbf{p},t)\big) \Big] \mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \times \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q}) \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}). \end{split}$$

# *LU Euler*

Written in terms of the SALT model, the LU model is

$$\mathrm{d}\mathbf{v} + \mathrm{d}\mathbf{x}\_t \cdot \nabla \mathbf{v} + \mathbf{v}^j \nabla \mathrm{d}\mathbf{x}\_t^j - \mathbf{v}^j \nabla (\boldsymbol{\xi} \circ \mathrm{d}W\_t)^j = -\nabla \mathrm{d}p.$$

Expanding the additional term gives

$$\nabla(\boldsymbol{\xi} \circ \mathrm{d}W\_{t})^{j} = \nabla\sum\_{\mathbf{q}}(b\_{\pm}(\mathbf{q})\mathbf{h}\_{\pm}(\mathbf{q})e^{i\mathbf{q}\cdot\mathbf{x}} \circ \mathrm{d}W\_{t})^{j} = \sum\_{\mathbf{q}}i\mathbf{q}\,b\_{\pm}(\mathbf{q})\mathbf{h}\_{\pm}^{j}(\mathbf{q})e^{i\mathbf{q}\cdot\mathbf{x}} \circ \mathrm{d}W\_{t}.$$

Thus we get

$$\mathbf{v}^{j}\nabla(\boldsymbol{\xi}\circ\mathrm{d}W\_{t})^{j} = \sum\_{\mathbf{p},\mathbf{q}}\sum\_{s\_{p},s\_{q}}a\_{s\_{p}}(\mathbf{p},t)\mathbf{h}\_{s\_{p}}^{j}(\mathbf{p})\,i\mathbf{q}\,b\_{s\_{q}}(\mathbf{q})\mathbf{h}\_{s\_{q}}^{j}(\mathbf{q})e^{i(\mathbf{p}+\mathbf{q})\cdot\mathbf{x}}\circ\mathrm{d}W\_{t}.$$

Now projecting and renaming, we have

$$\begin{split} &\int\_{\mathcal{D}} \mathbf{v}^{j} \nabla(\boldsymbol{\xi} \circ \mathrm{d}W\_{t})^{j} \cdot a\_{s\_{k}}^{\*}(\mathbf{k},t) \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k}) e^{-i\mathbf{k}\cdot\mathbf{x}} \, \mathrm{d}\mathbf{x} \\ &= \sum\_{\mathbf{p},\mathbf{q}} \sum\_{s\_{p},s\_{q}} a\_{s\_{k}}^{\*}(\mathbf{k},t) a\_{s\_{p}}(\mathbf{p},t) [b\_{s\_{q}}(\mathbf{q}) \circ \mathrm{d}W\_{t}] (i\mathbf{q} \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k})) (\mathbf{h}\_{s\_{p}}(\mathbf{p}) \cdot \mathbf{h}\_{s\_{q}}(\mathbf{q})) \int\_{\mathcal{D}} e^{i(\mathbf{p}+\mathbf{q}-\mathbf{k})\cdot\mathbf{x}} \, \mathrm{d}\mathbf{x} \\ &= L^{3} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} a\_{s\_{k}}^{\*}(\mathbf{k},t) a\_{s\_{p}}^{\*}(\mathbf{p},t) [b\_{s\_{q}}^{\*}(\mathbf{q}) \circ \mathrm{d}W\_{t}] (-i\mathbf{q} \cdot \mathbf{h}\_{s\_{k}}^{\*}(\mathbf{k})) (\mathbf{h}\_{s\_{p}}^{\*}(\mathbf{p}) \cdot \mathbf{h}\_{s\_{q}}^{\*}(\mathbf{q})) \\ &= L^{3} \sum\_{\mathbf{p}+\mathbf{q}+\mathbf{k}=0} \sum\_{s\_{p},s\_{q}} (f\_{k}^{pq})^{\*} a\_{s\_{k}}^{\*}(\mathbf{k},t) \big( a\_{s\_{p}}^{\*}(\mathbf{p},t) [b\_{s\_{q}}^{\*}(\mathbf{q}) \circ \mathrm{d}W\_{t}] + a\_{s\_{q}}^{\*}(\mathbf{q},t) [b\_{s\_{p}}^{\*}(\mathbf{p}) \circ \mathrm{d}W\_{t}] \big) \end{split}$$

with

$$f\_k^{pq} = (-i(\mathbf{p} + \mathbf{q}) \cdot \mathbf{h}\_{s\_k}(\mathbf{k})) (\mathbf{h}\_{s\_p}(\mathbf{p}) \cdot \mathbf{h}\_{s\_q}(\mathbf{q})) .$$

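Note that, due to the triad condition $\mathbf{p} + \mathbf{q} + \mathbf{k} = 0$ and the orthogonality $\mathbf{k} \cdot \mathbf{h}\_{s\_k}(\mathbf{k}) = 0$ of each helical basis vector to its wave vector,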

$$f\_k^{pq} = f\_p^{kq} = f\_q^{kp} = 0$$

so that the difference term between SALT and LU vanishes in the helical projection and the two projected models coincide.
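The vanishing of these coefficients can be checked numerically for the triad (54) used in the simulations. The sketch below again assumes a Waleffe-type helical basis $\mathbf{h}\_s(\mathbf{k}) = \boldsymbol{\nu} \times \boldsymbol{\kappa} + i s \boldsymbol{\nu}$, which is an assumption since the chapter's own definition lies outside this section; `helical_vector` and `f_coeff` are hypothetical names.

```python
import math


def dot(a, b):
    """Bilinear dot product a_j b_j, without complex conjugation."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def helical_vector(k, s, z=(0.0, 0.0, 1.0)):
    """Assumed construction h_s(k) = nu x kappa + i s nu; z must not be parallel to k."""
    norm_k = math.sqrt(sum(x * x for x in k))
    kappa = tuple(x / norm_k for x in k)
    nu = cross(z, kappa)
    norm_nu = math.sqrt(sum(x * x for x in nu))
    nu = tuple(x / norm_nu for x in nu)
    return tuple(x + 1j * s * y for x, y in zip(cross(nu, kappa), nu))


def f_coeff(k, sk, p, sp, q, sq):
    """f_k^{pq} = (-i (p + q) . h_{s_k}(k)) (h_{s_p}(p) . h_{s_q}(q))."""
    p_plus_q = tuple(x + y for x, y in zip(p, q))
    hk, hp, hq = helical_vector(k, sk), helical_vector(p, sp), helical_vector(q, sq)
    return -1j * dot(p_plus_q, hk) * dot(hp, hq)


# Triad (54) with the parities from Sect. 3.1.1.
k, p, q = (1.0, 0.0, 0.0), (0.0, -1.0, 1.0), (-1.0, 1.0, -1.0)
sk, sp, sq = 1, -1, -1
values = [f_coeff(k, sk, p, sp, q, sq),
          f_coeff(p, sp, k, sk, q, sq),
          f_coeff(q, sq, k, sk, p, sp)]
```

All three entries of `values` vanish up to rounding, because in each case the sum of the two other wave vectors is minus the wave vector of the projected mode.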

## **Appendix 3: Supplementary Numerics**

## *Calibration of the Noise Amplitude*

To calibrate the noise amplitude for the data assimilation experiments, we rely on two forecast verification metrics: the rank histogram (or Talagrand histogram) and the continuous ranked probability score (CRPS). We evaluated these for both models on 64 different noise amplitude vectors. The metrics were recorded by running the data assimilation/particle filtering experiment described in the main text for the noise amplitude vectors $\mathbf{b} = [b\_k, b\_p, b\_q]$ resulting from all possible combinations of $b\_k \in \lbrace 0.05, 0.1, 0.2, 0.5 \rbrace$, $b\_p \in \lbrace 0.025, 0.05, 0.1, 0.2 \rbrace$ and $b\_q \in \lbrace 0.01, 0.02, 0.04, 0.1 \rbrace$. We present the mean CRPS scores for the top 5 tested noise amplitude vectors in Table 1. The mean is taken across both models. Based on this calibration, we chose the case $\mathbf{b} = [0.10, 0.05, 0.01]$ for the data assimilation experiment: the other candidates have inferior rank histograms, so this choice achieves a good balance between the visual judgment of rank histograms and the CRPS score (Fig. 8).

**Table 1** CRPS scores (lower is better) for the five best tested noise amplitude vectors **b** in terms of overall mean CRPS score (first column). The mean CRPS score over the three modal energies is shown for the EST (second column) and HST (third column) models. The overall mean CRPS score for the respective noise amplitude is shown in the *Mean* column (fourth column). Finally, we provide references to the figures showing the associated rank histograms (last column)

**Fig. 8** Rank histograms for the 5 best noise amplitude vectors in terms of CRPS score (see Table 1) for the EST and HST models. Each individual graph shows the rank histogram of an ensemble of 15 particles run up to a final time of 1400, with data assimilation performed every 10 time units. The top histogram in each subfigure represents the ensemble for mode **k**, the middle histogram represents the ensemble for mode **p** and the bottom one for mode **q**
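The ingredients of this calibration can be sketched as follows. The estimator actually used for the CRPS is not stated in the text; the code assumes the standard empirical ensemble formula, and the function names are hypothetical. The candidate grid reproduces the 64 combinations listed above.

```python
from itertools import product


def crps_ensemble(ensemble, observation):
    """Empirical CRPS: mean |x_i - y| - 0.5 * mean |x_i - x_j| (lower is better)."""
    n = len(ensemble)
    distance_to_obs = sum(abs(x - observation) for x in ensemble) / n
    internal_spread = sum(abs(x - y) for x in ensemble for y in ensemble) / (n * n)
    return distance_to_obs - 0.5 * internal_spread


def rank_of_observation(ensemble, observation):
    """Number of members below the observation; one count for the rank histogram."""
    return sum(1 for x in ensemble if x < observation)


b_k = [0.05, 0.1, 0.2, 0.5]
b_p = [0.025, 0.05, 0.1, 0.2]
b_q = [0.01, 0.02, 0.04, 0.1]
candidates = list(product(b_k, b_p, b_q))  # all tested noise amplitude vectors
```

A rank histogram is then obtained by collecting `rank_of_observation` over all assimilation times; a roughly flat histogram indicates that the observation is statistically indistinguishable from the ensemble members.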

## *Data Assimilation Verification*

See Figs. 9, 10, 11, and 12.

**Fig. 9** Filtering experiment for EST model using SIR particle filter with a small data assimilation interval of 5 time units. (**a**–**c**) Ensemble evolution for 100 particles in mode k (**a**, red), p (**b**, blue), and q (**c**, green). The signal (grey) is the deterministic model and the observations (black stars) are noisy and taken and assimilated every 5 time units. (**d**–**f**) Bias of the filtering ensemble. (**g**–**h**) RMSE of the filtering ensemble

**Fig. 10** Filtering experiment for EST model using SIR particle filter with a large particle ensemble of 500 members. (**a**–**c**) Ensemble evolution for 500 particles in mode k (**a**, red), p (**b**, blue), and q (**c**, green). The signal (grey) is the deterministic model and the observations (black stars) are noisy and taken and assimilated every 5 time units. (**d**–**f**) Bias of the filtering ensemble. (**g**–**h**) RMSE of the filtering ensemble

**Fig. 11** Typical ESS for the filtering experiments. See Remark 2 in the main text. (**a**) ESS for EST experiment. (**b**) ESS for HST experiment

**Fig. 12** Statistics of the mean ensemble over 10 independent runs of the particle filter with 500 ensemble members for the EST model. See Remark 3 in the main text

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **An Explicit Method to Determine Casimirs in 2D Geophysical Flows**

**Erwin Luesink and Bernard Geurts**

**Abstract** Conserved quantities in geophysical flows play an important role in the characterisation of geophysical dynamics and aid the development of structure-preserving numerical methods. A significant family of conserved quantities is formed by the Casimirs, i.e., integral conservation laws that are in the kernel of the underlying Poisson bracket. The Casimirs hence determine the geometric structure of the geophysical fluid equations; among them, the enstrophy is well known. Often Casimirs are proposed on heuristic grounds and later verified to be part of the kernel of the Poisson bracket. In this work, we explicitly construct the Casimirs of 2D geophysical flow dynamics by rewriting the Poisson bracket in vorticity-divergence coordinates.

# **1 Introduction**

Models of geophysical flows involve a fluid with a free interface under the influence of gravity on a rotating domain, often forced by variations in temperature, salinity and density. In this paper, we consider the thermal rotating shallow water equations, which form a two-dimensional model that includes all of the features above.

E. Luesink (✉): Multiscale Modelling and Simulation, Department of Applied Mathematics, Faculty EEMCS, University of Twente, Enschede, The Netherlands, e-mail: e.luesink@utwente.nl

B. Geurts: Multiscale Modelling and Simulation, Department of Applied Mathematics, Faculty EEMCS, University of Twente, Enschede, The Netherlands; Multiscale Physics, Center for Computational Energy Research, Department of Applied Physics, Eindhoven University of Technology, Eindhoven, The Netherlands

This work was performed in the project SPRESTO (structure-preserving regularization and stochastic forcing for nonlinear hyperbolic PDEs), supported by a NWO TOP 1 grant.

The variations of temperature and salinity are collected in a single scalar field that we call the buoyancy. The thermal rotating shallow water equations can be derived in several different ways from the rotating Euler equations with stratification. The best known method is to assume that the domain is shallow, allowing the replacement of the prognostic equation for the vertical velocity by a hydrostatic pressure relation from which the vertical velocity can be inferred. The result is the system of inviscid primitive equations. Upon assuming columnar motion, e.g., motivated by the Taylor-Proudman theorem, meaning that the horizontal velocity field does not depend on the vertical coordinate, one can vertically average the primitive equations to obtain the thermal rotating shallow water equations.

The focus of this contribution is on the preservation of key structures that characterise the governing equations and their dynamics. In two-dimensional geophysical fluid dynamics (GFD) models, the conservation of Casimirs provides a major framework for modelling and for the construction of numerical methods. We present an explicit constructive method with which the Casimirs of a general class of GFD models can be computed. This provides an alternative to earlier work by Cotter and Holm [2013] that yields explicit Casimirs for 2D systems.

This entire sequence of assumptions and approximations can be performed at the level of the Lagrangian formulation of the models, as shown in Holm and Luesink [2021]. Doing so is helpful because the geometric structure associated with such a variational formulation implies various important dynamical quantities and conservation laws. For two-dimensional geophysical fluids, we find that the key dynamical quantities are the potential vorticity and the potential buoyancy, where the latter is the ratio of the buoyancy to the depth. Both quantities play a key role in cyclogenesis, as shown in Holm et al. [2021]. In the present work, we will show that the potential vorticity and potential buoyancy can also be used to give an alternative formulation of the fluid equations. This formulation is particularly useful for the identification of Casimirs, which are conserved integral quantities. The Casimirs in turn help interpret mechanisms in geophysical turbulence. For instance, in the incompressible setting without buoyancy effects in two dimensions, the enstrophy is one of the Casimirs and plays a central role in the double cascade predicted by Kraichnan [1967]. From a geometric point of view, the Casimirs are the functionals that form the kernel of the Poisson bracket. Usually, the form of the Casimirs is assumed or guessed and then verified by checking whether the Poisson bracket indeed vanishes. We will use two subsequent changes of variables to provide a constructive derivation for the form of the Casimirs.

The organization of this paper is as follows. In Sect. 2 we present the main equations relevant to 2D geophysical flow and sketch the central importance of retaining the symmetries and conserved properties of the models. Among these properties, Casimirs are a major structure for the GFD models in 2D. An explicit method to determine these Casimirs is presented in Sect. 3. Concluding remarks are collected in Sect. 4.

## **2 Geophysical Flows**

In Holm and Luesink [2021] it is shown that starting from the Lagrangian for the rotating, stratified Euler equations in three dimensions, one can derive the Lagrangian for the thermal rotating shallow water equations by assuming hydrostatic pressure and applying vertical averaging. The benefit of performing this sequence of approximations at the level of the Lagrangian is that the geometric structure is not affected. This means that the thermal rotating shallow water equations can be formulated using a Lie-Poisson bracket. As was shown in Marsden and Weinstein [1983], there exist multiple equivalent formulations in terms of Lie-Poisson brackets of the two-dimensional Euler equations. Similar formulations can be derived for the compressible two-dimensional fluid models. This was first done in Holm and Long [1989]. The retained Lagrangian structure in 2D is basic to a possible systematic approach to determining the conserved quantities associated with the thermal rotating shallow water equations. This systematic approach will be elaborated on in this paper.

It is possible to derive the equations for geophysical fluid dynamics on arbitrary smooth manifolds starting from the Euler equations, as shown in Holm et al. [1998]. This level of generality requires tools of differential topology, such as the Lie derivative, the pullback and the pushforward, see for instance Abraham and Marsden [1978], Marsden and Ratiu [2013], Holm et al. [2009]. However, if the domain is a two-dimensional compact subset Ω embedded in ℝ<sup>3</sup> or ℝ<sup>2</sup> with Cartesian coordinates and appropriate boundary conditions, we can also formulate the equations of motion using vector proxies of exterior calculus. The computations hold in arbitrary orthonormal coordinate systems. To this end, we start from the dimensionless Lagrangian for the thermal rotating shallow water (TRSW) equations, which is given by

$$L_{TRSW}(\boldsymbol{u}, \eta, b) = \int_{\Omega} \left( \frac{1}{2} |\boldsymbol{u}|^2 + \frac{1}{\mathrm{Ro}} \boldsymbol{u} \cdot \boldsymbol{R} - \frac{1}{2\,\mathrm{Fr}^2} (1 + \mathfrak{s}b)(\eta - 2h) \right) \eta \, d\mu. \tag{1}$$

In this Lagrangian, *u* is the velocity field, *η* = *αζ* + *h* is the total depth, *ζ* is the free surface elevation, *h* is the bottom topography, *R* is the vector potential for the Coriolis parameter and *b* is the buoyancy variable. Furthermore, the dimensionless numbers are the aspect ratio *σ*, the Rossby number Ro, the Froude number Fr, the wave amplitude *α* and the stratification parameter s. The stratification parameter represents the importance of the buoyancy variable, which itself is of order one. The vector potential for the Coriolis parameter satisfies ∇<sup>⊥</sup> · *R* = *f*(*x*, *y*), where *f*(*x*, *y*) is the usual Coriolis parameter. The ⊥ operator corresponds to a Hodge star operator, which for Cartesian coordinates is defined by (*x*, *y*)<sup>⊥</sup> = (−*y*, *x*). The volume form is given by *dμ*, which in Cartesian coordinates is expressed as *dμ* = *dx dy*. The Lagrangian is a functional on X(Ω) × *V*<sup>∗</sup>(Ω), the product space of the space of vector fields X(Ω) and the space of advected quantities *V*<sup>∗</sup>(Ω). That is, *u* ∈ X(Ω) and *η*, *b* ∈ *V*<sup>∗</sup>(Ω). More information and details on these spaces can be found in Luesink [2021]. The thermal rotating shallow water equations associated with the Lagrangian (1) are given in the advective formulation by

$$\begin{aligned} \frac{\partial}{\partial t}\boldsymbol{u} + (\boldsymbol{u}\cdot\nabla)\boldsymbol{u} + \frac{1}{\mathrm{Ro}}f\boldsymbol{u}^{\perp} &= -\frac{\alpha}{\mathrm{Fr}^{2}}\nabla((1+\mathfrak{s}b)\zeta) + \frac{\mathfrak{s}}{2\,\mathrm{Fr}^{2}}(\alpha\zeta - h)\nabla b, \\ \frac{\partial}{\partial t}\eta + \nabla\cdot(\eta\boldsymbol{u}) &= 0, \\ \frac{\partial}{\partial t}b + (\boldsymbol{u}\cdot\nabla)b &= 0. \end{aligned} \tag{2}$$

Equations (2) describe a compressible fluid with thermal effects in a rotating frame in two dimensions. Note that the equations take the same form on the sphere. The bold symbols denote vector valued quantities, which assumes the existence of a basis. Since we are working with two dimensional manifolds, such a basis is available. We will now recall an important vector calculus identity that is central to the manipulations that we will perform. This identity follows from differential topology, as shown in Holm et al. [2009], where vector fields and differential forms in arbitrary coordinate systems can be defined. The vector-valued coefficients of these objects can be identified when the underlying space is Euclidean with the standard inner product. This means that *u* is simultaneously the coefficient of a vector field and a differential 1-form. This can lead to confusion, so in what follows *u* denotes a vector field in arbitrary coordinates and *v* denotes a differential 1-form. The Lie derivative can be defined as the derivative of the pullback

$$
\mathcal{L}_{\boldsymbol{u}} \boldsymbol{v} = \frac{d}{dt}\Big|_{t=0} \phi_{\boldsymbol{u}}^{*}(\boldsymbol{v}), \tag{3}
$$

where *φ* is the flow associated with the vector field *u*. The asterisk on *φ* means that *v* is pulled back along *φ*. Here, *φ* is the solution to the differential equation *dφ/dt* = *u(φ, t)* with arbitrary initial condition, i.e., representing a trajectory of a test-particle in the flow field *u*. The arbitrariness is what allows Lagrangian formulations of fluid dynamics to be related to Eulerian formulations. The Lie derivative can also be defined by means of Cartan's formula

$$
\mathcal{L}_{\boldsymbol{u}} \boldsymbol{v} = i_{\boldsymbol{u}} d\boldsymbol{v} + d(i_{\boldsymbol{u}} \boldsymbol{v}), \tag{4}
$$

where *i<sub>u</sub>v* denotes the interior product of *u* with *v* and *d* is the exterior derivative, see e.g. Abraham and Marsden [1978], Marsden and Ratiu [2013], Holm et al. [2009]. Setting the two definitions (3) and (4) of the Lie derivative equal to one another in ℝ<sup>2</sup> provides the following vector calculus identity

$$(\boldsymbol{u}\cdot\nabla)\boldsymbol{v} + (\nabla\boldsymbol{u})\cdot\boldsymbol{v} = (\nabla^{\perp}\cdot\boldsymbol{v})\boldsymbol{u}^{\perp} + \nabla(\boldsymbol{u}\cdot\boldsymbol{v}),\tag{5}$$

where the left hand side results from the pullback definition (3) and the right hand side follows from the Cartan formula (4).
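Identity (5) can be checked symbolically in Cartesian coordinates. The following is a minimal sketch using SymPy; the component names `u1`, `u2`, `v1`, `v2` and the helper functions are illustrative choices and do not appear in the references. It also checks the special case *v* = *u*, for which (5) gives (*u* · ∇)*u* = *ω* *u*<sup>⊥</sup> + ½∇|*u*|<sup>2</sup>.

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)

# Arbitrary smooth components of the vector field u and the 1-form v
# (u1, u2, v1, v2 are illustrative names, not notation from the text).
u = sp.Matrix([sp.Function("u1")(x, y), sp.Function("u2")(x, y)])
v = sp.Matrix([sp.Function("v1")(x, y), sp.Function("v2")(x, y)])


def grad(f):
    """Cartesian gradient of a scalar."""
    return sp.Matrix([sp.diff(f, x), sp.diff(f, y)])


def perp(w):
    """(w1, w2)^perp = (-w2, w1), as defined in the text."""
    return sp.Matrix([-w[1], w[0]])


def perp_div(w):
    """nabla^perp . w = -d_y w1 + d_x w2 (scalar curl)."""
    return -sp.diff(w[0], y) + sp.diff(w[1], x)


def advect(a, w):
    """(a . nabla) w, component by component."""
    return sp.Matrix([a.dot(grad(w[i])) for i in range(2)])


def grad_dot(a, w):
    """((nabla a) . w)_i = w_j d_i a_j."""
    return sp.Matrix(
        [sum(w[j] * sp.diff(a[j], xi) for j in range(2)) for xi in (x, y)]
    )


# Left and right hand sides of identity (5)
lhs = advect(u, v) + grad_dot(u, v)
rhs = perp_div(v) * perp(u) + grad(u.dot(v))
residual = (lhs - rhs).applyfunc(sp.expand)

# Special case v = u: (u . nabla) u = omega u^perp + (1/2) grad |u|^2
omega = perp_div(u)
special = (advect(u, u) - omega * perp(u) - grad(u.dot(u)) / 2).applyfunc(sp.expand)

print(residual.T, special.T)
```

Both residuals reduce to the zero vector for arbitrary smooth components, which is what the passage from the advective to the vector invariant form relies on.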

Using (5), an alternative formulation to (2) of the thermal rotating shallow water equations is given by

$$\begin{aligned} \frac{\partial}{\partial t} \boldsymbol{u} + \left(\omega + \frac{1}{\mathrm{Ro}} f\right) \boldsymbol{u}^\perp &= -\frac{\alpha}{\mathrm{Fr}^2} \nabla((1 + \mathfrak{s}b)\zeta) - \frac{1}{2} \nabla|\boldsymbol{u}|^2 + \frac{\mathfrak{s}}{2\,\mathrm{Fr}^2} (\alpha \zeta - h) \nabla b, \\ \frac{\partial}{\partial t} \eta + \nabla \cdot (\eta \boldsymbol{u}) &= 0, \\ \frac{\partial}{\partial t} b + (\boldsymbol{u} \cdot \nabla) b &= 0, \end{aligned} \tag{6}$$

where *ω* = ∇<sup>⊥</sup> · *u* is the scalar vorticity. This formulation is often called the vector invariant formulation of fluid dynamics. There are several situations where the vector invariant formulation has advantages over the advective formulation. The vector invariant form offers an easier basis for numerical discretisations because it hides the nonlinearity. Additionally, on the right hand side of the velocity equation in (6), one recognises the gradient of the Bernoulli function *B* = ½|*u*|<sup>2</sup> + (*α*/Fr<sup>2</sup>)(1 + s*b*)*ζ*, further explaining the physical mechanisms responsible for the flow of fluid.

Both (2) and (6) can be formulated as a Hamiltonian system with a Lie-Poisson bracket. This is a crucial property of these models as it prepares for a clear formulation of conserved quantities, i.e., the desired Casimirs. By Legendre transforming the Lagrangian (1), we obtain the Hamiltonian associated with the thermal rotating shallow water equations. The Legendre transform is given by

$$H(\boldsymbol{m}, \eta, b) = \langle \boldsymbol{m}, \boldsymbol{u} \rangle - L(\boldsymbol{u}, \eta, b), \tag{7}$$

where *m* = *δL/δu* is the momentum variable in terms of the functional derivative of the Lagrangian. The Legendre transform only relates the velocity and the momentum and does not affect the advected quantities. The Hamiltonian is therefore a functional on the space X<sup>∗</sup>(Ω) × *V*<sup>∗</sup>(Ω), where X<sup>∗</sup>(Ω) is the dual space of X(Ω) with respect to the *L*<sup>2</sup>-pairing on Ω. For the thermal rotating shallow water model, the momentum is given by *m* = *η*(*u* + *R*) and the Hamiltonian is

$$H_{TRSW}(\boldsymbol{m},\eta,b) = \int_{\Omega} \left( \frac{1}{2\eta^2} |\boldsymbol{m}|^2 + \frac{1}{2\,\mathrm{Fr}^2} (1 + \mathfrak{s}b)(\eta - 2h) \right) \eta \,d\mu. \tag{8}$$

The Poisson bracket that will allow us to formulate the equations of motion has the following general form, see e.g. Holm et al. [2009],

$$\{F, G\} = \int_{\Omega} \begin{pmatrix} \delta F / \delta \boldsymbol{m} \\ \delta F / \delta \eta \\ \delta F / \delta b \end{pmatrix} \cdot \mathbb{J}(\boldsymbol{m}, \eta, b) \begin{pmatrix} \delta G / \delta \boldsymbol{m} \\ \delta G / \delta \eta \\ \delta G / \delta b \end{pmatrix} d\mu \,, \tag{9}$$

where *F* and *G* are two arbitrary functionals on the space X<sup>∗</sup>(Ω) × *V*<sup>∗</sup>(Ω). The matrix J is operator valued and depends on the state. J can be explicitly derived using reduction theorems, following Marsden and Weinstein [1974], Holm et al. [1998]. We speak of Hamiltonian dynamics if the evolution of a functional *F* is expressed by

$$\frac{d}{dt}F = -\{F, H\},\tag{10}$$

where *H* is the Hamiltonian of the system. We obtain the equations of motion (2) by choosing *G* = *H* and using J(*m*, *η*, *b*) as in Holm et al. [2021] to find

$$
\frac{\partial}{\partial t} \begin{pmatrix} m_i \\ \eta \\ b \end{pmatrix} = \underbrace{\begin{pmatrix} m_j \partial_i + \partial_j m_i & \eta \partial_i & -b_{,i} \\ \partial_j \eta & 0 & 0 \\ b_{,j} & 0 & 0 \end{pmatrix}}_{=\mathbb{J}(\boldsymbol{m}, \eta, b)} \begin{pmatrix} \delta H_{TRSW}/\delta m_j \\ \delta H_{TRSW}/\delta \eta \\ \delta H_{TRSW}/\delta b \end{pmatrix} \tag{11}
$$

and reproduces Eqs. (2) after using the definition of the momentum, rewriting the momentum equation using the continuity equation and applying the Lie derivative identity (5) to the vector potential of the Coriolis parameter. We have used Einstein's convention of summing over repeated indices, and the notation *b*<sub>,*i*</sub> refers to the *i*-th component of the gradient of *b* (the comma denotes a spatial derivative). Note that the dependence on the state (*m*, *η*, *b*) is linear. This means that the Poisson bracket is in fact a Lie-Poisson bracket. The kernel of a Lie-Poisson bracket is nontrivial and is precisely the kernel of the linear operator J. The kernel is spanned by functionals known as Casimirs and the goal of the remainder of this work is to derive these Casimir functionals explicitly.

In Holm and Long [1989], the momentum bracket (11) is transformed by means of a change of variables to the rotational form. The change of variables is (*m*<sub>1</sub>, *m*<sub>2</sub>, *η*, *b*) → (*u*, *v*, *η*, *b*), where *u* = *m*<sub>1</sub>/*η* and *v* = *m*<sub>2</sub>/*η*, which is invertible as long as 0 < *η* < ∞. This change of variables is not the same as just formulating the Hamiltonian and the Lie-Poisson bracket in terms of velocity because of the role of the rotating frame. Performing the transformation of Holm and Long [1989], we obtain the following formulation

$$\frac{\partial}{\partial t} \begin{pmatrix} u \\ v \\ \eta \\ b \end{pmatrix} = \underbrace{\begin{pmatrix} 0 & -q & \partial_x & -\frac{b_x}{\eta} \\ q & 0 & \partial_y & -\frac{b_y}{\eta} \\ \partial_x & \partial_y & 0 & 0 \\ \frac{b_x}{\eta} & \frac{b_y}{\eta} & 0 & 0 \end{pmatrix}}_{=\mathbb{J}(u,v,\eta,b)} \begin{pmatrix} \delta H_{TRSW}/\delta u = \eta u \\ \delta H_{TRSW}/\delta v = \eta v \\ \delta H_{TRSW}/\delta \eta = B \\ \delta H_{TRSW}/\delta b = T \end{pmatrix},\tag{12}$$

where *q* = (1/*η*)(*ω* + *f*/Ro) is the potential vorticity, *B* = ½|*u*|<sup>2</sup> + (1/(2 Fr<sup>2</sup>))(1 + s*b*)(*η* − 2*h*) is the Bernoulli function and *T* = (s/(2 Fr<sup>2</sup>))(*η*<sup>2</sup> − 2*ηh*). This Hamiltonian formulation corresponds to (6) and extends the bracket of Holm and Long [1989] to include thermal variations. This bracket is linear and it is a simple exercise to show that it is skew-symmetric with respect to the *L*<sup>2</sup>-pairing. To prove that it satisfies the Jacobi identity, we repeat the argument of Holm and Long [1989] and state that the bracket in (12) is an invertible transformation of variables in the Lie-Poisson bracket (11).
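The skew-symmetry of the operator matrix in (12) can also be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it assumes a doubly periodic square domain, Fourier spectral derivatives and random test fields, and the names `apply_J` and `pairing` are hypothetical helpers. It evaluates ⟨*a*, J*c*⟩ + ⟨*c*, J*a*⟩ for the discrete *L*<sup>2</sup>-pairing, which should vanish up to round-off.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 32                     # grid points per direction (illustrative)
L = 2.0 * np.pi            # periodic box size (assumption)

k = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)
k[N // 2] = 0.0            # drop the Nyquist mode: derivative stays real and skew-adjoint
kx, ky = np.meshgrid(k, k, indexing="ij")


def dx(f):
    """Spectral x-derivative on the periodic grid."""
    return np.fft.ifft2(1j * kx * np.fft.fft2(f)).real


def dy(f):
    """Spectral y-derivative on the periodic grid."""
    return np.fft.ifft2(1j * ky * np.fft.fft2(f)).real


# Arbitrary state: potential vorticity q, positive depth eta, buoyancy b
q = rng.standard_normal((N, N))
eta = 1.0 + 0.1 * rng.random((N, N))
b = rng.standard_normal((N, N))
bx_over_eta, by_over_eta = dx(b) / eta, dy(b) / eta


def apply_J(z):
    """Apply the 4x4 operator matrix of (12) to z = (z1, z2, z3, z4)."""
    z1, z2, z3, z4 = z
    return (
        -q * z2 + dx(z3) - bx_over_eta * z4,
        q * z1 + dy(z3) - by_over_eta * z4,
        dx(z1) + dy(z2),
        bx_over_eta * z1 + by_over_eta * z2,
    )


def pairing(a, c):
    """Discrete L2 pairing (the constant cell area is omitted)."""
    return sum(np.sum(ai * ci) for ai, ci in zip(a, c))


a = tuple(rng.standard_normal((N, N)) for _ in range(4))
c = tuple(rng.standard_normal((N, N)) for _ in range(4))

scale = abs(pairing(a, apply_J(c)))
defect = pairing(a, apply_J(c)) + pairing(c, apply_J(a))
print(defect, scale)
```

The algebraic entries ±*q* and ±*b<sub>x</sub>*/*η*, ±*b<sub>y</sub>*/*η* cancel pointwise, while the derivative entries cancel after summation by parts, which the spectral derivative reproduces exactly on a periodic grid.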

We now wish to express the bracket (12) in another set of variables to achieve a formulation in independent scalar variables. This is accomplished using the transformation (*u*, *v*, *η*, *b*) → (*ω*, *D*, *η*, *ηb*), where *ω* = ∇<sup>⊥</sup> · *u* is the vorticity and *D* = ∇ · *u* is the divergence. If the domain Ω has no holes, then the vector Laplace equation has a unique solution. This is the Helmholtz theorem for Euclidean space. If the domain does have holes (or islands) then appropriate boundary conditions are required to eliminate harmonic functions in order to guarantee uniqueness of solutions. In either case, provided with appropriate boundary conditions when necessary, one can uniquely split up the vector field *u* into potentials via the Helmholtz decomposition

$$
\boldsymbol{u} = \nabla^{\perp} \psi + \nabla \chi,\tag{13}
$$

where *ψ* is the stream function and *χ* is the velocity potential. Uniqueness of solutions to the Laplace equation is required to reconstruct *u*, since the potentials satisfy Poisson's equations

$$\begin{aligned} \omega &= \nabla^{\perp} \cdot (\nabla^{\perp} \psi) &= \Delta \psi, \\ D &= \nabla \cdot (\nabla \chi) &= \Delta \chi. \end{aligned} \tag{14}$$
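The reconstruction of the velocity from vorticity and divergence can be sketched numerically. The example below assumes a doubly periodic square domain, where the Poisson equations (14) are solved with the FFT and uniqueness is fixed by setting the mean of each potential to zero; the test potentials `psi_true` and `chi_true` and all helper names are illustrative assumptions, not taken from the text.

```python
import numpy as np

N = 64
L = 2.0 * np.pi                       # periodic box size (assumption)
x = np.arange(N) * L / N
X, Y = np.meshgrid(x, x, indexing="ij")

k = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)
kx, ky = np.meshgrid(k, k, indexing="ij")
k2 = kx**2 + ky**2
k2_safe = np.where(k2 == 0.0, 1.0, k2)   # avoid division by zero for the mean mode


def ddx(f):
    return np.fft.ifft2(1j * kx * np.fft.fft2(f)).real


def ddy(f):
    return np.fft.ifft2(1j * ky * np.fft.fft2(f)).real


def inv_laplacian(f):
    """Solve Laplace(p) = f for the zero-mean periodic solution p."""
    f_hat = np.fft.fft2(f)
    p_hat = np.where(k2 == 0.0, 0.0, -f_hat / k2_safe)
    return np.fft.ifft2(p_hat).real


# Illustrative velocity with a rotational and a divergent part, as in (13)
psi_true = np.sin(X) * np.cos(2 * Y)
chi_true = np.cos(3 * X) * np.sin(Y)
u = -ddy(psi_true) + ddx(chi_true)    # first component of grad^perp psi + grad chi
v = ddx(psi_true) + ddy(chi_true)     # second component

omega = ddx(v) - ddy(u)               # vorticity, nabla^perp . u
D = ddx(u) + ddy(v)                   # divergence, nabla . u

# Invert the two Poisson equations (14) and rebuild the velocity
psi = inv_laplacian(omega)
chi = inv_laplacian(D)
u_rec = -ddy(psi) + ddx(chi)
v_rec = ddx(psi) + ddy(chi)

err = max(np.abs(u - u_rec).max(), np.abs(v - v_rec).max())
print(err)
```

On domains with boundaries or islands the inversion additionally requires the boundary conditions discussed above; the periodic case is chosen here only because it keeps the sketch short.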

This permits us to write the velocity field in terms of vorticity and divergence as follows

$$
\boldsymbol{u} = \nabla^{\perp} \Delta^{-1} \omega + \nabla \Delta^{-1} D,\tag{15}
$$

which means that changing variables in the Hamiltonian amounts to substituting (15) for the velocity. The first term in (15) is analogous to the two-dimensional version of the Biot-Savart law, but since the fluid is compressible it does not determine the full velocity field. The potential buoyancy variable *r* = *b*/*η* is introduced to remove the fraction terms ±*b<sub>x</sub>*/*η* and ±*b<sub>y</sub>*/*η* in the Poisson bracket (12). This is convenient from a notational point of view, but also means that all the variables that appear inside the J matrix in the Poisson bracket are "potential variables", i.e., potential vorticity and potential buoyancy. To change variables in the Cartesian coordinate case, we sandwich the Poisson bracket of Eq. (12) with the functional Jacobian derivative operator and its *L*<sup>2</sup>-adjoint as follows

$$
\begin{pmatrix}
-\partial_y & \partial_x & 0 & 0 \\
\partial_x & \partial_y & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
0 & -q & \partial_x & -\frac{b_x}{\eta} \\
q & 0 & \partial_y & -\frac{b_y}{\eta} \\
\partial_x & \partial_y & 0 & 0 \\
\frac{b_x}{\eta} & \frac{b_y}{\eta} & 0 & 0
\end{pmatrix}
\begin{pmatrix}
\partial_y & -\partial_x & 0 & 0 \\
-\partial_x & -\partial_y & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{16}
$$

Similar manipulations can be performed for different coordinate systems, provided that one takes an orthonormal coordinate basis. For the sphere this would be a latitude-longitude basis. Performing this change of coordinates in the bracket (12) leads to

$$\left\{F,G\right\}=-\int_{\Omega} \begin{pmatrix} \delta F/\delta \omega\\ \delta F/\delta D\\ \delta F/\delta \eta\\ \delta F/\delta b \end{pmatrix} \nabla \cdot \underbrace{\begin{pmatrix} q \times & -q & 0 & r \times\\ q & q \times & 1 & r\\ 0 & -1 & 0 & 0\\ r \times & -r & 0 & 0 \end{pmatrix}}_{=\mathbb{J}(\omega,D,\eta,b)} \nabla \begin{pmatrix} \delta G/\delta \omega\\ \delta G/\delta D\\ \delta G/\delta \eta\\ \delta G/\delta b \end{pmatrix} d\mu \,. \tag{17}$$

This bracket formulation has a number of interesting properties. First of all, the differential operators can be factored out of the matrix upon introducing the convention ×∇ = ∇<sup>⊥</sup>. Secondly, the matrix only features the potential vorticity *q* and the potential buoyancy *r*. The kernel of this bracket is the kernel of the matrix operator J(*ω*, *D*, *η*, *b*), which is skew-symmetric in both *q* and *r*. The most interesting property in our opinion is that this bracket governs a number of two-dimensional fluid models through submatrices of J.

The full four by four setting corresponds to the thermal rotating shallow water equations. If there is no underlying rotation, one simply adapts the definition of the potential vorticity *q* to obtain the thermal shallow water equations. When there are no buoyancy variations, the bracket can be restricted to the three by three case, which corresponds to the rotating shallow water equations. Again, upon adapting the potential vorticity variable *q*, one can obtain the shallow water equations in case rotation is absent. In the incompressible case where buoyancy variations still play a role, the divergence is zero. Then one can use the submatrix consisting of the (1, 1), (1, 4), (4, 1), (4, 4) elements, which corresponds to the thermal rotating Euler equations. If buoyancy variations do not play a role, the matrix is simply the (1, 1) element, which can describe the two-dimensional rotating Euler equations and the quasi-geostrophic (QG) equations. It is important to note that the thermal QG (TQG) model described in Warneford and Dellar [2013], Zeitlin [2018], Holm et al. [2021] does not fit into the Lie-Poisson bracket formulation because its (1, 1) position features *q* − *b* rather than just *q*, see Holm et al. [2021]. This is a result of the fact that the TQG model is derived as a perturbation around thermal geostrophic balance, rather than around geostrophic balance. To summarise, we list all models that can be described by (17) in order of complexity in Table 1.

**Table 1** The models that can be described by submatrices of the Poisson bracket (17)

The transition from incompressible models to compressible models is a steep increase in complexity, since the compressible models involve an additional two equations compared to the incompressible case. The divergence and the depth variable are always paired together, since changes in *D* imply changes in *η* and vice versa. This bracket is particularly useful in the explicit computation of Casimirs, which is shown in Sect. 3. It also has other uses. In ocean dynamics, gravity waves propagate at speeds that are orders of magnitude higher than the typical flow velocity. If the Rossby number, the Froude number, the wave amplitude and the stratification parameter satisfy O(Ro) = O(Fr) = O(*α*) = O(s), then one can derive the thermal geostrophic balance condition. This condition provides an algebraic relation for the balanced velocity field. The balanced velocity field is divergence free. Since the bracket (17) features the divergence variable explicitly, it is natural to perform an asymptotic expansion in a small parameter *ε*, where

$$\begin{aligned} \omega &= \omega_0 + \epsilon \omega_1 + o(\epsilon), \\ D &= \epsilon D_1 + o(\epsilon). \end{aligned} \tag{18}$$

The expansion (18) applied to the thermal rotating shallow water Lagrangian (1) and truncated at order *o(*1*)* yields the thermal extension of the L1 model of Salmon [1983]. This derivation is shown in detail in Holm et al. [2021].

## **3 Explicitly Determining the Casimirs**

The bracket (17) is particularly helpful in the explicit computation of the Casimirs since the variables are all independent scalars. We are looking for functionals *C(ω, D, η, b)* such that {*F,C*} = 0 for any functional *F*. The following computations follow a procedure of step-by-step elimination. Expanding the bracket and requiring {*F,C*} = 0 for any *F* yields the equation

$$\begin{split} 0 = \{F, C\} &= -\int_{\Omega} \frac{\delta F}{\delta \omega} \nabla \cdot \left( q \nabla^{\perp} \frac{\delta C}{\delta \omega} - q \nabla \frac{\delta C}{\delta D} + r \nabla^{\perp} \frac{\delta C}{\delta b} \right) \\ &\quad + \frac{\delta F}{\delta D} \nabla \cdot \left( q \nabla \frac{\delta C}{\delta \omega} + q \nabla^{\perp} \frac{\delta C}{\delta D} + \nabla \frac{\delta C}{\delta \eta} + r \nabla \frac{\delta C}{\delta b} \right) \\ &\quad - \frac{\delta F}{\delta \eta} \nabla \cdot \left( \nabla \frac{\delta C}{\delta D} \right) \\ &\quad + \frac{\delta F}{\delta b} \nabla \cdot \left( r \nabla^{\perp} \frac{\delta C}{\delta \omega} - r \nabla \frac{\delta C}{\delta D} \right) d\mu \, . \end{split} \tag{19}$$

Since *F* is an arbitrary functional, we can solve this functional equation per variational derivative of *F* and obtain the explicit form of the Casimirs by a process of elimination. In fact, the third line of (19), the one that features *δF/δη*, will be our starting point in the explicit computations. The third line implies that *C* may be at most linear in *D* with constant coefficients, since only then the variational derivative of *C* with respect to *D* is constant in space. Applying the gradient then leads to zero, meaning that the third term vanishes under the assumption of linear dependence of *C* on the divergence *D*. If the variational derivative of *C* with respect to *D* is not constant in space, then this term does not vanish, as *D* is not necessarily zero. So at this stage we know that

$$C(\omega, D, \eta, b) = \int_{\Omega} \gamma \, D + f(\omega, \eta, b) \, d\mu \,, \tag{20}$$

where *γ* ∈ ℝ is a constant and *f*(*ω*, *η*, *b*) is a function that is determined next. Since we have established that *C* must be linear in *D*, all terms that involve variational derivatives of *C* with respect to *D* vanish. Simplifying (19) accordingly, we obtain

$$\begin{split} 0 = \{F, C\} &= -\int_{\Omega} \frac{\delta F}{\delta \omega} \nabla \cdot \left( q \nabla^{\perp} \frac{\delta C}{\delta \omega} + r \nabla^{\perp} \frac{\delta C}{\delta b} \right) \\ &+ \frac{\delta F}{\delta D} \nabla \cdot \left( q \nabla \frac{\delta C}{\delta \omega} + \nabla \frac{\delta C}{\delta \eta} + r \nabla \frac{\delta C}{\delta b} \right) \\ &+ \frac{\delta F}{\delta b} \nabla \cdot \left( r \nabla^{\perp} \frac{\delta C}{\delta \omega} \right) d\mu \, . \end{split} \tag{21}$$

The third line of (21), the one that features *δF/δb*, vanishes if *C* depends linearly on *ω*. In this case we do not have to insist on constant coefficients. If the coefficient of *ω* is an arbitrary differentiable function Ψ(*r*) of the potential buoyancy, i.e., *δC/δω* = Ψ(*r*), we have

$$\nabla \cdot \left( r \nabla^{\perp} \frac{\delta C}{\delta \omega} \right) = \nabla \cdot \left( r \nabla^{\perp} \Psi(r) \right) = \Psi'(r) (\nabla r \cdot \nabla^{\perp} r) + r (\nabla \cdot \nabla^{\perp}) \Psi(r) = 0, \quad (22)$$

because ∇*r* and ∇<sup>⊥</sup>*r* are orthogonal and ∇ · ∇<sup>⊥</sup> = 0. Here Ψ′(*r*) denotes the derivative of Ψ with respect to its argument. Recall that the potential buoyancy is defined as *r* = *b*/*η*. At this stage, we know that *C* must have the form

$$C(\omega, D, \eta, b) = \int_{\Omega} \gamma D + \omega \Psi(r) + g(b, \eta) \, d\mu \,. \tag{23}$$

The next step is to determine *g(b, η)*. Knowing the explicit form of the variational derivative of *C* with respect to *ω*, we can simplify (21) further to obtain

$$\begin{split} 0 = \{F, C\} &= -\int_{\Omega} \frac{\delta F}{\delta \omega} \nabla \cdot \left( q \nabla^{\perp} \Psi(r) + r \nabla^{\perp} \frac{\delta C}{\delta b} \right) \\ &+ \frac{\delta F}{\delta D} \nabla \cdot \left( q \nabla \Psi(r) + \nabla \frac{\delta C}{\delta \eta} + r \nabla \frac{\delta C}{\delta b} \right) d\mu \,. \end{split} \tag{24}$$

We focus on the term that features *δF/δω*. We have an explicitly constructed term and the variational derivative of *C* with respect to *b*. By the same argument as in the previous step, we know ∇ · (*r*∇<sup>⊥</sup>(*δC/δb*)) vanishes if the variational derivative of *C* with respect to *b* is a differentiable function of *r*. So, let Φ(*r*) = Φ(*b*/*η*) be this differentiable function. Since a variational derivative of Φ(*b*/*η*) with respect to *b* produces a factor of 1/*η*, we introduce the term *η*Φ(*b*/*η*) into *C*. So at this stage, after the three steps, we have an expression for *C* in terms of two arbitrary differentiable functions Φ and Ψ

$$C(\omega, D, \eta, b) = \int_{\Omega} \eta \Phi(r) + \eta q \Psi(r) + \gamma D \, d\mu \tag{25}$$

where *γ* ∈ ℝ is a constant and *ηq* = *ω* + *f*/Ro. Going back to (24), we can verify whether *C*(*ω*, *D*, *η*, *b*) is indeed the family of Casimirs of the Poisson bracket (17). This means that the terms multiplying variational derivatives of *F* must vanish. So we substitute (25) into (24) and compute the term multiplied by *δF/δω*

$$\begin{split} &\nabla \cdot \left( q \nabla^{\perp} \frac{\delta C}{\delta \omega} - q \nabla \frac{\delta C}{\delta D} + r \nabla^{\perp} \frac{\delta C}{\delta b} \right) \\ &= \nabla \cdot \left( q \nabla^{\perp} \Psi(r) + r \nabla^{\perp} \left( \Phi'(r) + q \Psi'(r) \right) \right) \\ &= \Phi''(r) (\nabla r \cdot \nabla^{\perp} r) + r (\nabla \cdot \nabla^{\perp}) \Phi'(r) \\ &\quad + \Psi'(r) (\nabla q \cdot \nabla^{\perp} r) + \Psi'(r) (\nabla r \cdot \nabla^{\perp} q) \\ &\quad + r \Psi''(r) (\nabla r \cdot \nabla^{\perp} q) + r \Psi''(r) (\nabla q \cdot \nabla^{\perp} r) \\ &= 0. \end{split} \tag{26}$$

In this computation we have used orthogonality of ∇ and ∇<sup>⊥</sup> and of ∇*r* and ∇<sup>⊥</sup>*r*, as well as skew-symmetry, i.e., ∇*r* · ∇<sup>⊥</sup>*q* = −∇*q* · ∇<sup>⊥</sup>*r* (= −*r<sub>x</sub>q<sub>y</sub>* + *q<sub>x</sub>r<sub>y</sub>* for Cartesian coordinates). It remains to check whether the term multiplied by *δF/δD* also vanishes. In the following computation, we suppress the dependence of Φ and Ψ on *r* for notational convenience. We compute from the second line in (19)

$$\begin{split} \nabla \cdot \left( q \nabla \frac{\delta C}{\delta \omega} + q \nabla^{\perp} \frac{\delta C}{\delta D} + \nabla \frac{\delta C}{\delta \eta} + r \nabla \frac{\delta C}{\delta b} \right) \\ = \nabla \cdot \left( q \nabla \Psi + \nabla \left( \Phi - r \Phi' - r q \Psi' \right) + r \nabla (\Phi' + q \Psi') \right) \\ = \nabla \cdot \left( q \nabla \Psi - r \nabla (q \Psi') - q \Psi' \nabla r + r \nabla (q \Psi') \right) \\ = 0. \end{split} \tag{27}$$

The computation is a sequence of applications of the identity ∇*f*(*r*) = *f*′(*r*)∇*r*. In the first step we have applied this to all the terms that involve Φ. Performing the same manipulations on Ψ subsequently implies the result. Note that in (24) we required cancellations of terms orthogonal to the terms that are required to cancel in (27). The reason that the perpendicular gradient of the variational derivative of *C* with respect to *η* does not appear in (24) is that it is trivially zero due to orthogonality of ∇ and ∇<sup>⊥</sup>. Hence we can conclude that (25) is the complete description of the Casimirs for the bracket (17) and thus also for (11). We can repeat the computation for each of the models described in Table 1 to obtain the corresponding Casimirs. This is summarised in Table 2. Here one can see that in the presence of thermal effects enstrophy is no longer a Casimir.
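For reference, the variational derivatives that enter (26) and (27) can be written out explicitly. The following is a worked sketch, assuming *r* = *b/η*, that *ηq* = *ω* + (1/Ro)*f* does not depend on *η* or *b*, and that the constant multiplying *D* in (25) is *γ*:

$$\frac{\delta C}{\delta \omega} = \Psi(r), \qquad \frac{\delta C}{\delta D} = \gamma, \qquad \frac{\delta C}{\delta b} = \Phi'(r) + q \Psi'(r), \qquad \frac{\delta C}{\delta \eta} = \Phi(r) - r \Phi'(r) - r q \Psi'(r).$$

The last expression follows from ∂*r*/∂*η* = −*r/η*, so that differentiating *η*Φ(*r*) gives Φ − *r*Φ′ and differentiating (*ηq*)Ψ(*r*) at fixed *ηq* gives −*rq*Ψ′. Since *δC/δD* is constant, its gradient drops out of both computations.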


**Table 2** The Casimirs of the models that can be described by submatrices of the Poisson bracket (17)

# **4 Conclusion**

In geometric approaches to fluid dynamics one may often exploit constructive methods to infer conservation laws via Noether's theorem (Abraham and Marsden [1978], Marsden and Weinstein [1983], Marsden and Ratiu [2013], Holm et al. [2009]). Casimirs identify conservation laws that arise as the kernel of the Poisson bracket. We provided an explicit method of determining these Casimir functionals for two-dimensional fluid dynamics by means of two changes of variables for the thermal rotating shallow water equations. We formulated the equations in vector invariant form by using an important vector calculus identity. The Poisson bracket corresponding to this vector invariant form is convenient for further manipulation. By changing coordinates from velocity to vorticity and divergence, we derive a Poisson bracket that only involves (skew) gradients and divergences. By means of this formulation, it is a systematic computation to obtain the Casimirs. The computations were performed for arbitrary coordinate systems, which means that the above computations can easily be used for domains such as the sphere.

**Acknowledgments** The authors are grateful for many fruitful discussions with Paolo Cifani, Sagy Ephrati, Arnout Franken and Darryl Holm. We thank the anonymous reviewer for their valuable comments.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Correlated Structures in a Balanced Motion Interacting with an Internal Wave**

**Igor Maingonnat, Gilles Tissot, and Noé Lahaye**

**Abstract** Characterizing the loss of coherence of an internal tide propagating through mesoscale turbulence has been a major challenge in oceanography, particularly due to its implications for the interpretation of satellite data. In this paper, we study the correlations between a balanced motion and the incoherent part of a wave in an idealised configuration. We introduce a new modal decomposition technique, named broadband proper orthogonal decomposition (BBPOD), which consists in performing a proper orthogonal decomposition (POD) on complex-demodulated variables. After connecting BBPOD to the standard spectral POD (SPOD), we show that BBPOD, coupled with the extended POD technique, enables us to associate the principal components of the incoherent field with the slow flow structures responsible for this loss of coherence through triadic interactions with the incident wave.

**Keywords** Internal tide interactions · Spectral proper orthogonal decomposition · Broadband proper orthogonal decomposition

# **1 Introduction**

Internal tides, generated by interactions between the barotropic tide and topographic features such as ridges or continental slopes, are ubiquitous in the ocean, playing a crucial role in vertical mixing and energy transport. They propagate over large distances, encountering regions with energetic mesoscale turbulence, and they lose their fixed phase relationship with the astronomical forcing, a phenomenon known as incoherence. This loss of coherence, highly unpredictable, complicates for example our ability to disentangle internal tides and low-frequency turbulent signals from satellite data (Richman et al. 2012).

I. Maingonnat (✉) · G. Tissot · N. Lahaye

INRIA Rennes Bretagne Atlantique, IRMAR – UMR CNRS 6625, Rennes, France e-mail: igor.maingonnat@inria.fr

© The Author(s) 2024

B. Chapron et al. (eds.), *Stochastic Transport in Upper Ocean Dynamics II*, Mathematics of Planet Earth 11, https://doi.org/10.1007/978-3-031-40094-0\_9

To face these difficulties, there have been various studies to better understand and predict the impact of a balanced (turbulent) jet on the inertia-gravity wave field properties. For instance, Savva and Vanneste (2018) and Ward and Dewar (2010) studied the internal tide scattering. Ponte and Klein (2015) examined the incoherence time scales for different turbulent configurations, and Dunphy et al. (2017) quantified the interaction terms via vertical-mode projection of the linearized Boussinesq equations.

In the present paper we consider a data-driven approach to study, from idealised numerical simulations, the structures of the jet that are correlated with the incoherent contributions of the wave field. Extracting coherent structures from a data set can be performed in the spectral domain by spectral proper orthogonal decomposition (SPOD) (Towne et al. 2018). Some attempts to connect these reduced features with non-linear interactions have been proposed, for instance in Karban et al. (2022), by identifying the contribution of the non-linear term correlated with the SPOD mode through extended proper orthogonal decomposition (EPOD, Boree (2003)). Unfortunately, non-linear forcing is a quantity that is more difficult to interpret and associate with physical mechanisms than, for instance, a pressure or velocity field. We propose instead a new *broadband proper orthogonal decomposition (BBPOD)*, derived from the SPOD. By taking advantage of the strong time scale separation between the two dynamics and considering a small-amplitude wave, this formulation allows us to connect the non-linear interactions between the slow motions of the jet and the incoherent contributions of the wave through the EPOD method. The non-linear interactions are here understood as triadic interactions with the slow motion that lead to a broadband frequency structure of the incoherent wave field.

The plan is as follows. We will begin by describing the model used for our simulations in Sect. 2. In Sect. 3.1, the original SPOD method is reviewed and a connection to the proposed BBPOD method is made (Sect. 3.2). Sections 4 and 5 will summarize the study and bring some perspectives.

# **2 Model**

The propagation of internal tides through a nearly-balanced jet is examined in a one-layer rotating shallow water (RSW) model. The equations are non-dimensionalized as follows. The characteristic time-scale is the inertial time *T* = *f*<sup>−1</sup>, the inverse of the Coriolis frequency. The reference length scale *L* is chosen to be of the order of the jet thickness, taken equal to the first Rossby radius of deformation *R<sub>d</sub>*, such that the Burger number *B<sub>u</sub>* = *R<sub>d</sub>*<sup>2</sup>/*L*<sup>2</sup> is equal to one. A beta-plane approximation, with parameter *β*, accounts for the effect of rotation, and a radiative damping term with coefficient *α* is added to the continuity equation (following e.g. Brunet and Vautard 1996). Adequate artificial hyperviscosity, consistent with energy dissipation and conservation of angular momentum, is also used (see Ochoa et al. 2011).

Let Ω be a bounded subset of ℝ<sup>2</sup>. The equations without viscosity, defined on Ω × ℝ<sup>+</sup>, are

$$
\partial\_t h + B\_u \,\text{div}\, \mathbf{v} + R\_o((\mathbf{v} \cdot \nabla)h + h \,\text{div}\, \mathbf{v}) = -\alpha h \tag{1}
$$

$$
\partial\_t \mathbf{v} + (1 + \beta y) \mathbf{v}^\perp + R\_o (\mathbf{v} \cdot \nabla) \mathbf{v} = -\nabla h,\tag{2}
$$

where **v** = (*u*, *v*) is the horizontal velocity, **v**<sup>⊥</sup> = (−*v*, *u*)<sup>T</sup>, *h* is the sea surface height (SSH) anomaly and *R<sub>o</sub>* = *U*/(*fL*) is the Rossby number.

A numerical simulation of system (1) and (2), with a plane wave interacting with a zonal jet, has been performed. The equations have been discretized by a pseudospectral method with a Runge-Kutta time scheme using the code Dedalus (Burns et al. 2020). The domain is a doubly periodic rectangular domain of size [0, 20] × [−20, 20] discretized on a 128 × 512 grid. The simulation is initialized with an eastward zonal jet at geostrophic equilibrium with a small perturbation to trigger the instabilities. During the whole experiment, a northward propagating plane wave with frequency fixed at *ω* = 2 is generated in a nudging layer in the South of the domain, and an eastward wind forcing is applied to maintain the balanced current. A spin-up phase allows the jet to evolve toward a statistically stationary state. The experiment continues with snapshots stored every 0.1 wave period, such that a sufficiently long series **q** = (*u*, *v*, *h*)<sup>T</sup> of 12,000 snapshots is collected, representing nearly 4000 wave periods. Note that a sufficient sampling in time is required for extracting the wave field by filtering in Sect. 3.2. An example snapshot from one run of the simulation is shown in Fig. 1.

**Fig. 1** A snapshot taken from the RSW simulation. On the left the vorticity field and on the right the internal wave SSH anomaly *h<sub>ω</sub>*, extracted by bandpass filtering centered around the wave frequency *ω* = 2

This simple model can sustain waves interacting with a turbulent flow. Northward rotating shallow water waves are inertia gravity waves satisfying the dispersion relation

$$k\_y^2 = \frac{\omega^2 - (1 + \beta y)^2}{B\_u},\tag{3}$$

with *k<sub>y</sub>* denoting the spatial wavenumber in the meridional direction, which is a function of the tidal frequency *ω*.
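The dispersion relation (3) is easy to evaluate numerically. The snippet below is a minimal sketch: the function names are illustrative, and the value of *β* used in the example is a placeholder assumption, since it is not specified in this section.

```python
import math


def meridional_wavenumber_squared(omega, y, beta=0.0, burger=1.0):
    """Right-hand side of Eq. (3): k_y^2 = (omega^2 - (1 + beta*y)^2) / B_u."""
    f_local = 1.0 + beta * y  # non-dimensional local Coriolis parameter
    return (omega**2 - f_local**2) / burger


def meridional_wavenumber(omega, y, beta=0.0, burger=1.0):
    """Return k_y >= 0, or raise if the wave is evanescent at this latitude."""
    k2 = meridional_wavenumber_squared(omega, y, beta, burger)
    if k2 < 0.0:
        raise ValueError("omega is below the local inertial frequency")
    return math.sqrt(k2)


# Tidal frequency omega = 2 and B_u = 1 as in the text; beta = 0.01 is hypothetical.
k_center = meridional_wavenumber(2.0, y=0.0, beta=0.01)
k_north = meridional_wavenumber(2.0, y=10.0, beta=0.01)
```

With *β* > 0 the wavenumber decreases northward, so the meridional wavelength grows as the wave propagates toward higher *y*.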

# **3 Methods**

The methods to extract correlated structures in a flow field are described in this section. In the following we denote our state vector by **q** = (*u*, *v*, *h*)<sup>T</sup>.

## *3.1 Spectral Proper Orthogonal Decomposition*

Spectral proper orthogonal decomposition is an extension of POD methods (see Lumley 1967), and aims at extracting coherent structures in spectral space from numerical or experimental data. The data are assumed to be issued from a statistically stationary random process, which is verified thanks to the hypothesis of small-amplitude waves, such that the balanced flow is marginally impacted by the wave propagation. The modes obtained from this decomposition are space-time uncorrelated from each other.

We interpret our state vector as a zero-mean second order random process **q** (which can be obtained by subtracting the time averaged field beforehand) indexed over Ω × ℝ<sup>+</sup>, a subset of spatio-temporal variables. Consider **q̂**(*ω*) the Fourier transform of the process at angular frequency *ω*. We assume that each realization of **q̂**(*ω*) belongs to *L*<sup>2</sup><sub>**W**</sub>(Ω, ℂ<sup>3</sup>) = {**g** : Ω → ℂ<sup>3</sup>, ∫<sub>Ω</sub> **g**<sup>∗</sup>(*z*)**W**(*z*)**g**(*z*) d*z* < ∞}, the ensemble of square integrable functions in space (periodic at the boundary) relative to a positive definite weight matrix **W** : Ω → *M<sub>n</sub>*(ℝ). The superscript (·)<sup>∗</sup> denotes the transpose-conjugate operation. The matrix **W** is chosen such that the *L*<sup>2</sup>-norm approximates the energy of the RSW model (1) and (2):

$$\|\mathbf{q}\|\_{\mathbf{W}}^2 = \frac{1}{2} \int\_{\Omega} (1 + \lambda \bar{h})(u^2 + v^2) \,\mathrm{d}x \mathrm{d}y + \frac{1}{2B\_u} \int\_{\Omega} h^2 \,\mathrm{d}x \mathrm{d}y,\tag{4}$$

where *λ* refers to the deviation of the isopycnal, taken equal to *R<sub>o</sub>*/*B<sub>u</sub>*, and *h̄* corresponds to the temporal mean of the SSH anomaly. The term 1 + *λh̄* is assumed to remain strictly positive (which is necessary for ensuring positivity of the norm), since we do not consider zero or negative sea-surface height.
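On a uniform grid, the norm (4) amounts to a diagonal weight per variable and grid point. The following is a minimal sketch under that assumption (simple rectangle-rule quadrature, hypothetical function names, and a placeholder Rossby number, which is not given in this section):

```python
import numpy as np


def energy_weights(h_mean, rossby, burger, dx, dy):
    """Diagonal of W for q = (u, v, h)^T on a uniform grid, following Eq. (4)."""
    lam = rossby / burger
    layer = 1.0 + lam * h_mean
    if np.any(layer <= 0.0):
        raise ValueError("1 + lambda * h_mean must stay strictly positive")
    cell = dx * dy  # rectangle-rule quadrature weight
    w_velocity = 0.5 * layer * cell
    w_height = np.full_like(h_mean, 0.5 / burger * cell)
    return np.stack([w_velocity, w_velocity, w_height])


def energy_norm_squared(q, weights):
    """Discrete ||q||_W^2; |.|^2 is used so that complex fields are handled too."""
    return float(np.sum(weights * np.abs(q) ** 2))


# Toy check on a 4 x 8 grid with zero mean SSH anomaly (rossby = 0.1 is hypothetical).
h_mean = np.zeros((4, 8))
weights = energy_weights(h_mean, rossby=0.1, burger=1.0, dx=0.5, dy=0.5)
q = np.ones((3, 4, 8))
norm2 = energy_norm_squared(q, weights)
```

For the pseudospectral discretization used in the chapter, a different quadrature may be more appropriate; the rectangle rule here is only a convenient stand-in.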

Let us briefly recall the basic principle of SPOD. By stationarity the autocorrelation function **C**(*x*, *y*, *t*, *x*′, *y*′, *t*′) = 𝔼(**q**(*x*, *y*, *t*) ⊗ **q**<sup>∗</sup>(*x*′, *y*′, *t*′)) satisfies:

$$\mathbf{C}(x, y, t, x', y', t') = \mathbf{C}(x, y, x', y', t - t'), \tag{5}$$

where ⊗ denotes the dyadic product of two vectors in ℝ<sup>3</sup>. The objective of the SPOD method is therefore to find a deterministic function *ψ* satisfying the Fredholm equation:

$$\int\_{\Omega} \mathbf{S}(x, y, x', y', \omega) \mathbf{W}(x', y') \boldsymbol{\psi}(x', y', \omega) \, \mathrm{d}x' \mathrm{d}y' = \lambda(\omega) \boldsymbol{\psi}(x, y, \omega), \tag{6}$$

where **S**(*x*, *y*, *x*′, *y*′, *ω*) = ∫<sub>ℝ</sub> **C**(*x*, *y*, *x*′, *y*′, *τ*)*e*<sup>−*iωτ*</sup> d*τ* is the cross spectral density matrix (CSD).

The Karhunen-Loève theorem (Loève 1955) states that ∀*ω* ∈ ℝ, Eq. (6) has an infinite number of solutions (*λ<sub>j</sub>*(*ω*), *ψ<sub>j</sub>*(*ω*))<sub>*j*=1</sub><sup>∞</sup> such that (*ψ<sub>j</sub>*(*ω*))<sub>*j*=1</sub><sup>∞</sup> is an orthonormal basis of *L*<sup>2</sup><sub>**W**</sub>(Ω, ℂ<sup>3</sup>), in which we can expand the Fourier transform of the field into structures uncorrelated from each other:

$$\hat{\mathbf{q}}(x, y, \omega) = \sum\_{j=1}^{\infty} a\_j(\omega) \boldsymbol{\psi}\_j(x, y, \omega) \tag{7}$$

with 𝔼(*a<sub>j</sub>*(*ω*)*a<sub>j′</sub>*(*ω*′)) = *λ<sub>j</sub>*(*ω*)*δ<sub>j,j′</sub>*(*ω* − *ω*′) and ∀*j*, *λ<sub>j</sub>*(*ω*) ≥ 0. Moreover, a truncated expansion at order *n* will maximize the mean energy (defined by the norm ‖·‖<sup>2</sup><sub>**W**</sub>), compared to any other decomposition of the same order, and SPOD provides an optimal decomposition of the CSD:

$$\mathbb{E}(\|\hat{\boldsymbol{q}}(\omega)\|\_{\mathbf{W}}^2) = \sum\_{j=1}^{\infty} \lambda\_j(\omega) \tag{8}$$

$$\mathbf{S}(x, y, x', y', \omega) = \sum\_{j=1}^{\infty} \lambda\_j(\omega) \boldsymbol{\psi}\_j(x, y, \omega) \boldsymbol{\psi}\_j^\*(x', y', \omega). \tag{9}$$

Therefore, by solving the Fredholm equation at the tidal frequency, we are able to separate the fast from the slow component of the process by a spectral decomposition, expanded onto an energetically optimal orthonormal basis.
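On discrete data, Eq. (6) reduces to a weighted Hermitian eigenvalue problem. The following is a minimal sketch, assuming that independent realizations of the Fourier component **q̂**(*ω*) are stored as the columns of a matrix and that **W** is diagonal (one weight per degree of freedom). It uses the method of snapshots, a standard way of solving the problem when there are fewer realizations than grid points; the function name `weighted_pod` and the synthetic data are illustrative and not taken from the chapter.

```python
import numpy as np


def weighted_pod(q_hat, weights):
    """Discrete analogue of Eq. (6), solved by the method of snapshots.

    q_hat   : (n_dof, n_real) complex array, one realization per column.
    weights : (n_dof,) positive array, the diagonal of W.
    Returns the eigenvalues (descending) and the W-orthonormal modes (columns).
    """
    n_real = q_hat.shape[1]
    # Small Hermitian matrix Q^* W Q / n_real of size n_real x n_real.
    gram = q_hat.conj().T @ (weights[:, None] * q_hat) / n_real
    lam, theta = np.linalg.eigh(gram)
    order = np.argsort(lam)[::-1]
    lam, theta = lam[order], theta[:, order]
    keep = lam > 1e-12 * lam[0]  # discard numerically null directions
    lam, theta = lam[keep], theta[:, keep]
    modes = q_hat @ theta / np.sqrt(n_real * lam)
    return lam, modes


# Synthetic realizations standing in for q_hat(omega).
rng = np.random.default_rng(0)
n_dof, n_real = 60, 12
q_hat = rng.standard_normal((n_dof, n_real)) + 1j * rng.standard_normal((n_dof, n_real))
weights = rng.uniform(0.5, 1.5, n_dof)
lam, modes = weighted_pod(q_hat, weights)

# Estimated CSD, W-orthonormality of the modes, and mean energy as in Eq. (8).
csd = q_hat @ q_hat.conj().T / n_real
mode_gram = modes.conj().T @ (weights[:, None] * modes)
mean_energy = np.mean(np.sum(weights[:, None] * np.abs(q_hat) ** 2, axis=0))
```

In this discrete setting the mean energy equals the sum of the eigenvalues, mirroring Eq. (8), and `csd @ (W * modes)` returns the modes scaled by their eigenvalues, mirroring Eq. (6).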

It can be remarked that when there is a homogeneous direction, it is possible to solve the Fredholm equation in spectral space with respect to this direction. Each wavenumber can be computed independently, and each corresponding Fourier component is a solution of the original Fredholm problem. For a more complete description of SPOD, see Towne et al. (2018) and Schmidt and Colonius (2020). In our case, the domain is homogeneous in the *x* direction, but we still consider the 2D problem for the physical relevance of the extended BBPOD problem presented in Sect. 3.2.

## *3.2 Broadband Proper Orthogonal Decomposition*

The BBPOD consists in estimating by complex demodulation (see Godfrey 1965) the correlation tensor of a band-pass-filtered signal, and its eigenfunctions. The associated algorithm is presented in this section. It will be compared with Welch's method used in SPOD (see Welch 1967; Towne et al. 2018) and it will be shown that under some hypotheses both algorithms are equivalent. It can be noted that in Welch (1967), the connection with complex demodulation is briefly mentioned, and it is leveraged here with the computation of the eigenvectors and eigenvalues of the correlation tensor for a POD decomposition. This equivalence highlights that, due to windowing in Welch's method, the features extracted by SPOD possess a spectral component with a thick frequency band. For the broadband POD algorithm, this frequency band is explicit, and chosen through the definition of a filter. In the RSW model, the incoherent wave field possesses a broadband structure and broadband POD allows us to obtain a complete decomposition of this field.

#### **3.2.1 Complex Demodulation of the Wave Field**

Given a temporal series *x<sub>t</sub>*, the principle of complex demodulation consists in the following computation:

$$x\_d = \mathcal{L}(x\_t e^{-i\omega t}),\tag{10}$$

where *x<sub>d</sub>* is called the complex demodulated signal, ℒ is a low-pass filter and *ω* is the demodulation frequency. Complex demodulation makes it possible to extract the slowly varying amplitude envelope and phase deviations of the wave field. These variations are associated with wave incoherence.
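Equation (10) can be illustrated on a synthetic signal. The sketch below is not the chapter's implementation: it assumes a symmetric Hann-weighted moving average as the low-pass filter (Sect. 4 of the chapter uses a fourth-order Butterworth filter instead), and the test signal, its envelope and the filter half-width are made up for the example.

```python
import numpy as np


def complex_demodulate(x, omega, dt, half_width):
    """Eq. (10): x_d = L(x_t exp(-i omega t)) with a symmetric FIR low-pass L."""
    t = np.arange(len(x)) * dt
    b = np.hanning(2 * half_width + 1)
    b /= b.sum()  # unit gain at zero frequency
    shifted = x * np.exp(-1j * omega * t)  # moves the wave peak to zero frequency
    return np.convolve(shifted, b, mode="same")


# Synthetic wave at omega = 2 sampled every 0.1 wave period (period = pi).
omega, dt = 2.0, 0.1 * np.pi
t = np.arange(4000) * dt
envelope = 1.0 + 0.3 * np.cos(0.01 * t)  # slow amplitude modulation (made up)
phase = 0.5                              # constant phase offset (made up)
x = envelope * np.cos(omega * t + phase)

half_width = 50
x_d = complex_demodulate(x, omega, dt, half_width)

# Away from the edges, x_d approximates (envelope / 2) * exp(i * phase).
interior = slice(half_width, len(x) - half_width)
recovered_envelope = 2.0 * np.abs(x_d[interior])
recovered_phase = np.angle(x_d[interior])
```

The factor 2 arises because a real cosine splits its amplitude equally between the +*ω* and −*ω* components, and only the former survives the low-pass filter after the shift.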

First, we decompose the process **q**(*x*, *y*, *t*) into a sum of **q**<sup>*j*</sup>(*x*, *y*, *t*) and **q**<sub>*ω*</sub>(*x*, *y*, *t*), representing the jet and wave contributions, respectively. The wave field can be expressed as:

$$\mathbf{q}\_{\omega}(x, y, t) = \mathbf{q}\_{d}(x, y, t)e^{i\omega t},\tag{11}$$

where *ω* is the tidal frequency and **q**<sub>*d*</sub> is a complex field slowly varying in time, accounting for the incoherence. The latter is then extracted by the complex demodulation of **q** (10), with a filter designed such that it isolates the slow variations of the background flow. As a consequence, through the *e*<sup>−*iωt*</sup> shift, the spectral band of the extracted wave contains all effects of triadic interactions between the slow motions and the coherent wave. The jet contribution **q**<sup>*j*</sup> is obtained by directly filtering **q** (zero frequency), including here the time-average. In this study, we assume that **q** is dominated by the superposition of **q**<sup>*j*</sup> and **q**<sub>*ω*</sub>, which interact nonlinearly. Note that for simplicity, a uniform time sampling of the data was chosen, but this can be adapted to a non-uniform one, e.g. by interpolation on a regular grid or by using filtering algorithms suited to irregular sampling.

#### **3.2.2 Link with SPOD**

To make the link with SPOD for estimating the CSD of a signal *x<sub>t</sub>*, we write the temporal filter in a discrete form, with a time spacing Δ*t* and *t<sub>k</sub>* = *k*Δ*t*. A wide class of linear filters can be expressed as the convolution:

$$\mathcal{L}(x\_{t}) = \sum\_{i=-m}^{m} b\_{i} x\_{t-i},\tag{12}$$

where (*b<sub>i</sub>*)<sub>−*m*≤*i*≤*m*</sub> are the discrete coefficients of the filter. Then,

$$(\mathcal{L}(x\_t e^{-i\omega t}))\_j = \sum\_{k=-m}^{m} b\_k x\_{j-k} e^{-i\omega t\_{j-k}} \tag{13}$$

$$=\sum\_{k=j-m}^{j+m} b\_{j-k} x\_k e^{-i\omega t\_k}.\tag{14}$$

In the Welch method, *x<sub>t</sub>* is subdivided into possibly overlapping blocks of size *N* with an overlap *N<sub>o</sub>*. A Fast Fourier Transform is performed on each windowed block to extract the Fourier component at the tidal frequency, denoted *X*<sup>*l*</sup><sub>*ω*</sub>, where *l* is the block index. We define

$$X\_{\omega}^{l} = \sum\_{k=0}^{N} x\_{k + l(N - N\_o)} W\_k e^{-i\omega t\_k},\tag{15}$$

where *W<sub>k</sub>* is a window function. By changing variable *k*′ = *k* + *l*(*N* − *N<sub>o</sub>*), we obtain

$$X\_{\omega}^{l} = e^{i\omega t\_{l(N-N\_o)}} \sum\_{k'=l(N-N\_o)}^{N+l(N-N\_o)} x\_{k'} W\_{k'-l(N-N\_o)} e^{-i\omega t\_{k'}}.\tag{16}$$

Assuming that the window function is symmetric about the middle of each block (which is verified for most windows used in the literature), i.e. *W<sub>k</sub>* = *W<sub>N−k</sub>*, Eq. (16) gives


$$X\_{\omega}^{l} = e^{i\omega t\_{l(N-N\_o)}} \sum\_{k'=l(N-N\_o)}^{N+l(N-N\_o)} W\_{N+l(N-N\_o)-k'} x\_{k'} e^{-i\omega t\_{k'}}.\tag{17}$$

Finally, by choosing *W<sub>k</sub>* = *b<sub>k</sub>*, which sets *m* = *N*/2, relation (14) yields

$$X\_{\omega}^{l} = e^{i\omega t\_{l(N-N\_o)}} (\mathcal{L}(x\_t e^{-i\omega t}))\_{\frac{N}{2} + l(N-N\_o)}.\tag{18}$$

Consequently, up to a phase, the Welch method corresponds to the computation of the complex demodulated signal sampled every *N* − *N<sub>o</sub>*. The phase shift cancels when computing the correlation over *N<sub>b</sub>* blocks:

$$\sum\_{l=0}^{N\_b} X\_{\omega}^{l} {X\_{\omega}^{l}}^{\*} = \sum\_{l=0}^{N\_b} (\mathcal{L}(x\_t e^{-i\omega t}))\_{\frac{N}{2} + l(N - N\_o)} (\mathcal{L}^\*(x\_t e^{-i\omega t}))\_{\frac{N}{2} + l(N - N\_o)}.\tag{19}$$

The CSD can thus be obtained by computing the correlation over *N<sub>b</sub>* snapshots of the complex demodulated signal sampled every *N* − *N<sub>o</sub>*, which proves the equivalence between the two methods for appropriate numerical parameters.
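The identity (18) can be checked numerically on a scalar time series. The sketch below assumes a Hann window of *N* + 1 points (which satisfies *W<sub>k</sub>* = *W<sub>N−k</sub>*) reused as the filter coefficients, and a random test signal; the block size, overlap and function names are arbitrary choices for the example.

```python
import numpy as np


def welch_coefficient(x, window, start, omega, dt):
    """Eq. (15): windowed Fourier sum over the block beginning at index `start`."""
    n = len(window) - 1
    k = np.arange(n + 1)
    block = x[start:start + n + 1]
    return np.sum(block * window * np.exp(-1j * omega * k * dt))


def demodulate(x, b, omega, dt):
    """Eq. (14): centred convolution of x_t exp(-i omega t) with the filter b."""
    t = np.arange(len(x)) * dt
    return np.convolve(x * np.exp(-1j * omega * t), b, mode="same")


rng = np.random.default_rng(1)
omega, dt = 2.0, 0.1 * np.pi
n_block, n_overlap = 64, 32            # N and N_o (arbitrary, N even)
x = rng.standard_normal(1024)

window = np.hanning(n_block + 1)       # symmetric: W_k = W_{N-k}
x_d = demodulate(x, window, omega, dt)  # filter b_k = W_k, so m = N/2

step = n_block - n_overlap
n_blocks = (len(x) - n_block - 1) // step + 1
starts = step * np.arange(n_blocks)

welch = np.array([welch_coefficient(x, window, s, omega, dt) for s in starts])
phase = np.exp(1j * omega * starts * dt)   # prefactor in Eq. (18)
demod = x_d[starts + n_block // 2]         # samples at N/2 + l(N - N_o)
```

The arrays `welch` and `phase * demod` agree to rounding error, and the products `welch * welch.conj()` and `demod * demod.conj()` are identical, which is the phase cancellation expressed by Eq. (19).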

As noted above, the window function acts as a filter in the Welch procedure, but without giving an explicit expression of the frequency band. Moreover, if we aim at studying the whole broadband field, the classical SPOD (with its single-frequency interpretation, when the window function is designed to estimate only a pure harmonic) requires the computation of a set of spatial modes at each discrete frequency in the peak. In comparison, BBPOD computes the dominant modes of the wave field at only one frequency. The drawback of the SPOD algorithm is that, at each frequency, it includes a few modes of nearby frequencies due to the convolution with a finite-length window, which can lead to misinterpretation or to counting the same mode several times in the reconstruction of the whole wave field. More precisely, BBPOD gathers the first SPOD mode of each frequency, providing a simpler representation of the whole wave field and avoiding spurious modes in the reconstruction. Another important remark is that the connection of our method with SPOD relies on the spectral peak being sufficiently narrow, which corresponds to a sufficiently large window in the Welch method and guarantees a good estimation of the CSD.

The final step is to compute the eigenvalues and eigenvectors of the CSD, estimated by complex demodulation. The complex demodulated signal at frequency *ω* is then decomposed on this BBPOD basis:

$$\mathbf{q}\_d(x, y, t) = \sum\_j a\_j(\omega, t) \boldsymbol{\psi}\_j(x, y, \omega), \tag{20}$$

with *a<sub>j</sub>*(*ω*, *t*) = ∫<sub>Ω</sub> **q**<sub>*d*</sub><sup>∗</sup>(*x*, *y*, *t*)**W**(*x*, *y*)*ψ<sub>j</sub>*(*x*, *y*, *ω*) d*x*d*y* slowly varying coefficients.

#### **3.2.3 Extended Broadband Proper Orthogonal Decomposition**

Non-linear interactions between slow variations of the background flow and the wave induce incoherent wave contributions through triadic interactions. A major interest of BBPOD is that it allows us to study the correlations between the slow perturbations of the jet and the incoherence of the fast wave field, by extracting the slowly varying complex demodulated amplitudes. To that end, we propose to apply the concept of EPOD in the framework of BBPOD. EPOD, originally presented in Boree (2003), makes it possible to identify the part of a target field correlated with a given POD mode. The target field we first consider is the balanced motion obtained by low-pass filtering. The *p*-th extended POD mode of the slow motion **q**<sup>*j*</sup> correlated with the *p*-th BBPOD coefficient of the wave *a*<sup>*ω*</sup><sub>*p*</sub> is defined by:

$$\chi\_p^j = \frac{\mathbb{E}(\mathbf{q}^j a\_p(\omega))}{\lambda\_p(\omega)},\tag{21}$$

where the expectation operator is a temporal average over snapshots. The mode *χ*<sup>*j*</sup><sub>*p*</sub> will be referred to as *direct EPOD*, and *a<sub>p</sub>*(*ω*)*χ*<sup>*j*</sup><sub>*p*</sub> represents the part of the jet correlated with the *p*-th broadband POD coefficient of the wave.

Complementary to direct EPOD, we also define an *inverse EPOD*, applying BBPOD for *ω* = 0, thus obtaining an orthonormal basis representative of the jet variability, and considering the complex-demodulated wave contribution **q**<sub>*d*</sub> as the target field. In this situation, we identify the contribution of the incoherent wave field correlated with the BBPOD coefficients of the jet, denoted *a<sub>p</sub>*(*ω* = 0). Therefore, in the following, direct EPOD refers to jet modes while inverse EPOD refers to wave modes.
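The definition (21) translates into a few lines once the BBPOD coefficients are available. The following is a minimal sketch on synthetic data, assuming a diagonal weight matrix, snapshots stored as columns, and the coefficient convention written after Eq. (20); the function names and the random fields are illustrative only.

```python
import numpy as np


def pod_with_coefficients(q, weights, n_modes):
    """Weighted POD of snapshots q (n_dof, n_t) by the method of snapshots.

    Returns eigenvalues, W-orthonormal modes, and coefficients
    coeffs[t, p] = q(t)^* W psi_p, the convention written after Eq. (20).
    """
    n_t = q.shape[1]
    gram = q.conj().T @ (weights[:, None] * q) / n_t
    lam, theta = np.linalg.eigh(gram)
    order = np.argsort(lam)[::-1][:n_modes]
    lam, theta = lam[order], theta[:, order]
    modes = q @ theta / np.sqrt(n_t * lam)
    coeffs = q.conj().T @ (weights[:, None] * modes)
    return lam, modes, coeffs


def extended_modes(target, coeffs, lam):
    """Eq. (21): chi_p = E(target * a_p) / lambda_p, with E a mean over snapshots."""
    n_t = target.shape[1]
    return target @ coeffs / (n_t * lam)


rng = np.random.default_rng(2)
n_dof, n_t = 40, 20
weights = rng.uniform(0.5, 1.5, n_dof)
# Stand-ins for the demodulated wave q_d and the low-pass filtered jet q^j.
q_wave = rng.standard_normal((n_dof, n_t)) + 1j * rng.standard_normal((n_dof, n_t))
q_jet = rng.standard_normal((n_dof, n_t))

lam, modes, coeffs = pod_with_coefficients(q_wave, weights, n_modes=3)
chi_direct = extended_modes(q_jet, coeffs, lam)  # jet structures tied to wave modes
chi_self = extended_modes(q_wave, coeffs, lam)   # sanity check: returns the modes
```

Taking the wave field itself as the target returns the POD modes, which is a convenient consistency check. The inverse EPOD is obtained by swapping the roles: decompose the jet snapshots and pass the demodulated wave as `target`.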

## **4 Results**

This section presents the numerical results obtained in this work. The goal is to understand the non-linear interactions between the balanced motion and the wave field by means of the methods presented in Sect. 3, which extract the correlations between the two dynamics.

Figure 2 shows the real part of the first three energy-scaled BBPOD modes at the tidal frequency, sorted by decreasing energy (4). Specifically here, we plot the scaled quantities *λ<sub>j</sub>ψ<sub>j</sub>* to highlight their respective energy contribution, but the analysis was performed with the normalized modes. To define the low-pass filter ℒ, a fourth-order Butterworth filter has been taken with a cut-off frequency equal to the typical frequencies of the jet, such that the whole wave field scattered by the jet is captured. The first mode, containing the most energy, corresponds to the coherent wave field, which is defined here as the part of the wave correlated with the tidal forcing and thus phase-locked in time, since it corresponds to the time-average of the complex-demodulated signal. The other modes then account for the incoherent structures, as they are space-time decorrelated from the mean. They show deflections in opposite directions, with zonal mode-number *m<sub>x</sub>* = ±2, and with meridional mode-number *m<sub>y</sub>* expected to be determined by the dispersion relation (3) and the structure of the jet (see Bühler 2014). This is a consequence of the homogeneity in the zonal direction, guaranteed by our idealised set-up, which makes SPOD modes equal to Fourier modes in *x*.

**Fig. 2** Real part of the first three energy-scaled BBPOD modes *λ<sub>j</sub>ψ<sub>j</sub>* associated with *u* at the tidal frequency. Sponge regions are not shown

Table 1 shows the normalised energy contained in the first five BBPOD modes and in the incoherent field. It is computed for the domain without sponge regions, denoted Ω = [0, 20] × [−12, 12], and for the subset Ω<sub>*N*</sub> = [0, 20] × [−2, 12] representing the region of incoherence.

In the total domain, the coherent plane wave accounts for 57.35% of the energy and the first three modes for 81% of the energy. This shows that linear effects dominate the wave propagation, because we are in a configuration for which incoherence is relatively weak. As explained in Ward and Dewar (2010), a more energetic jet interacting with a tide with higher frequency would result in a more energetic incoherent wave field for an RSW simulation. As expected, incoherent modes have nearly all their energy in the upper part while it is rather equally distributed for the coherent mode. In this region, incoherences are dominant due to non-linearities in the center of the domain, which leads to an increase of the amplitude of the modes with *y*, even though the coherent mode still contains the most energy. Figure 3 shows the energy contained in the reconstructed incoherent wave field in terms of cumulative energy. The first two incoherent modes account for 58% of the total incoherent energy. For 6 modes, 80% of the energy is recovered in the reconstruction. The cumulative energy can be understood as a normalized RMSE measuring the accuracy of the basis to reconstruct the true incoherent wave field **q**<sub>*inc*</sub> according to

$$\frac{\mathbb{E}\left(\left\| \mathbf{q}\_{inc} - \sum\_{j=1}^{N} (\mathbf{q}\_{inc}, \boldsymbol{\psi}\_j) \boldsymbol{\psi}\_j \right\|\_{L^2(\Omega)}^2\right)}{\mathbb{E}\left(\left\| \mathbf{q}\_{inc} \right\|\_{L^2(\Omega)}^2\right)} = 1 - \frac{\sum\_{j=1}^{N} \lambda\_j(\omega)}{\sum\_{j=1}^{+\infty} \lambda\_j(\omega)}.$$

It can be remarked that spatial coherence can be computed with the coherence function *γ* through the CSD.

**Fig. 3** Cumulative energy ∑<sub>*j*=1</sub><sup>*N*</sup> *λ<sub>j</sub>*(*ω*) / ∑<sub>*j*=1</sub><sup>+∞</sup> *λ<sub>j</sub>*(*ω*) [%] contained in the reconstruction of the incoherent field by BBPOD modes
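The cumulative energy and the associated normalized truncation error follow directly from the eigenvalues. A minimal sketch, with made-up eigenvalues that are not the chapter's results:

```python
import numpy as np


def cumulative_energy(lam):
    """Percentage of energy captured by the first N modes, for N = 1..len(lam)."""
    lam = np.asarray(lam, dtype=float)
    return 100.0 * np.cumsum(lam) / np.sum(lam)


def truncation_error(lam, n_modes):
    """Normalized reconstruction error: 1 - sum_{j<=N} lam_j / sum_j lam_j."""
    return 1.0 - cumulative_energy(lam)[n_modes - 1] / 100.0


# Hypothetical eigenvalues of the incoherent field, sorted by decreasing energy.
lam_example = [4.0, 3.0, 2.0, 1.0]
energy_percent = cumulative_energy(lam_example)
error_two_modes = truncation_error(lam_example, n_modes=2)
```

In practice the infinite sum in the denominator is replaced by the sum over all computed eigenvalues, which is what `np.sum(lam)` does here.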

An SPOD decomposition was also performed in Egbert and Erofeeva (2021) for a realistic HYCOM simulation. Similar results were found for the energy captured by the SPOD modes at the M2 frequency, which is similar to our forcing frequency. This encourages extending this analysis to a more realistic simulation.

Figure 4 depicts the direct EPOD modes of the balanced motion, correlated with the BBPOD modes of the wave. The modes are weighted as explained before. The first extended mode shows that the part of the jet correlated with the coherent field is the stationary mean flow, which connects the first-order statistics of both dynamics. This result is the expression of the linear propagation of the incident wave through the mean flow (in time and in the *x*-direction). The second and third EPOD modes represent the meanders of the jet, consisting of a vortex train. The modes are defined up to a complex constant. We can see that *χ*<sup>*j*</sup><sub>2</sub> is approximately proportional to *χ*<sup>*j*</sup><sub>3</sub>, suggesting that meandering of the jet generates eastward and westward propagating incoherent perturbations.

**Fig. 4** The left column shows the real part of the first three BBPOD modes of *u* at the tidal frequency. The last column shows the corresponding direct EPOD modes of *u*. Modes are scaled as follows: *λ<sub>j</sub>ψ<sub>j</sub>*

The opposite procedure is done next by inverse EPOD. Figure 5 shows the leading three (energy-scaled) BBPOD modes of the jet associated with *ω* = 0 and the associated EPOD modes of the wave. As in direct EPOD, the first mode shows the correlation between the means of both dynamics. For the second and third modes, BBPOD shows jet meandering structures identical to the direct EPOD modes (up to a phase). However, their associated inverse EPOD modes indicate a standing wave corresponding to the sum of the two direct broadband POD modes that represent left and right deflections, expressing the fact that the association between EPOD and BBPOD is not bijective. A given meander will give rise to a wave deflected eastward nearly as much as a wave deflected westward. Their relative contribution cannot be predicted, but the present result suggests that an approximately equal partition would constitute a fair estimate.

**Fig. 5** The first column shows the first three BBPOD modes of *u* of the jet. The last column shows the associated inverse EPOD modes of *u*. Modes are scaled as follows: *λ<sub>j</sub>ψ<sub>j</sub>*

Overall, we can infer that the most energetic triadic interactions between the jet and the plane wave are caused by meandering structures of the jet, which generate an incoherent field in the form of a wave that is standing in the zonal direction *x* and propagating northward. The emergence of a standing wave suggests, on average, an equal distribution of the eastward and westward propagating modes. This opens promising perspectives for incorporating this methodology in the context of data assimilation, to estimate the incoherent wave field from some knowledge of the slow balanced flow through triadic interactions.

## **5 Summary and Perspectives**

This study investigates the correlated processes at play in the non-linear interactions of a plane wave propagating through a meso-scale oceanic current. For that purpose, the method of broadband POD is introduced and connected to extended POD. A first methodological point shows that BBPOD is equivalent to the common Welch method for estimating the cross-spectral density matrix, whose eigenfunctions are the SPOD modes. More importantly for our study, connecting these two methods allows us to understand that the most energetic features of the wave are obtained by triadic interactions with the most energetic features of the jet. In our idealised configuration, the wave is deflected in both directions by a given meander. The spatial periodicity of the meander sets the zonal deflections, giving us clearer insight into the BBPOD of the wave knowing only the extended modes of the jet. This method can be seen as complementary to methods based on the bispectrum to detect triadic interactions in the data. Here we have a priori knowledge of the nature of the interactions, between a low-frequency jet and a fast wave field, which allows us to study their correlations directly and to propose a methodology well suited for reconstruction. In future work, we intend to examine the sensitivity of the results to other jet or wave configurations, and even to richer models. We also envisage studying the implications of these methods for data assimilation problems, where the internal wave field would be reconstructed by correlation from knowledge of the instantaneous slow motion, either by direct estimation or by learning an observation operator associated with a Galerkin model defined in the BBPOD space.

## **References**

Boree, Jacques (Aug. 2003). "Extended proper orthogonal decomposition: A tool to analyse correlated events in turbulent flows". In: *Experiments in Fluids* 35, pp. 188–192. DOI: https://doi.org/10.1007/s00348-003-0656-3.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Linear Wave Solutions of a Stochastic Shallow Water Model**

**Etienne Mémin, Long Li, Noé Lahaye, Gilles Tissot, and Bertrand Chapron**

**Abstract** In this paper, we investigate the wave solutions of a stochastic rotating shallow water model. This approximate model provides an interesting and simple description of the interplay between waves and a random forcing that ensues either from the wind or from the feedback of the ocean on the atmosphere, and that leads very quickly to the selection of particular wavelengths. This interwoven, yet simple, mechanism explains the emergence of a typical wavelength associated with near-inertial waves. Ensemble-mean waves that are not in phase with the random forcing are damped at an exponential rate, whose magnitude depends on the random forcing variance. Geostrophic adjustment is also interpreted as a statistical homogenization process in which, in order to conserve potential vorticity, the small-scale component tends to align with the velocity field to form a statistically homogeneous random field.

# **1 Introduction**

Oceanic global circulation currents show a predominance of near-inertial waves (NIW) in their spectrum. These waves result from the repeated forcing of atmospheric winds over the globe together with the influence of the Earth's rotation at the Coriolis frequency. Besides, large-scale numerical representations of the oceanic circulation require the introduction of subgrid-scale models that inescapably damp velocity fields and wave solutions in the long run. In such models, high-frequency waves are completely smoothed out, but even large-scale structures might be affected if no particular attention is paid to the design of the subgrid model.

E. Mémin (✉) · L. Li · N. Lahaye · G. Tissot

ODYSSEY Team, Centre Inria de l'Université de Rennes, Rennes, France

IRMAR – UMR CNRS 6625, Rennes, France

e-mail: etienne.memin@inria.fr

B. Chapron

ODYSSEY Team, Centre Inria de l'Université de Rennes, Rennes, France

Laboratoire d'Océanographie Physique et Spatiale, Ifremer, Plouzané, France

In recent years, there has been an increasing effort to devise stochastic parameterizations for large-scale flows [2, 10, 11]. The motivations come from the failure of classical subgrid models to represent accurately the effect of the flow state variables at the unresolved scales, and from the need for reliable and computationally efficient models at the climatic scale. Uncertainty quantification, ensemble methods for forecasting and data assimilation are also prevalent, and the Bayes principle on which they are built leads naturally to considering stochastic dynamics.

Stochastic dynamics can hardly be devised on ad hoc grounds if one wants to provide generic and flexible systems. Controlling the growth of the variance and respecting the physical properties of the underlying turbulent flows are of the utmost importance to define stable and physically relevant systems, in which the unresolved variables or the different physical and numerical approximations performed are faithfully represented. Two main schemes have recently been proposed for that purpose in the literature [12, 20]. The first one is a geometric framework relying on a Hamiltonian formulation, whereas the second one, referred to as modelling under location uncertainty (LU), is based on Newton's principles and a stochastic formulation of the Reynolds transport theorem. Both schemes have been analyzed and tested numerically for several geophysical models and configurations [1, 4, 5, 6, 17, 23, 25]. Surface waves and linear models have been proposed in the LU framework [8, 26, 27].

Nevertheless, the analysis of basic geostrophic mechanisms such as geostrophic adjustment or the form of basic wave solutions in linearized simple systems has not been investigated so far. This is the objective of this work. We will in particular analyze a stochastic rotating shallow water model recently proposed in [4, 16]. The focus will be on wave solutions of this system. We will show in particular that ensemble-mean waves ensuing from a random forcing at given frequencies are preserved whereas the others are very quickly damped. The stochastic model will allow us to propose a very simple mechanism leading to the emergence and conservation of NIW in atmosphere and ocean dynamics as a coupled self entrainment process.

The paper is organized as follows. In a first section we briefly recall the proposed stochastic model—rotating shallow water under location uncertainty (RSW-LU). We then identify a stationary solution of this stochastic system. We next analyze and illustrate the plane wave solutions associated to the linearized system under some specific noises. Finally, we provide a picture of the geostrophic adjustment process associated to the random system.

## **2 Review of RSW-LU**

Let us first recall the stochastic transport operator introduced in the LU framework [1, 18, 17, 24]:

$$\mathbb{D}_t\Theta := \mathrm{d}_t\Theta + \left(\left(\boldsymbol{u} - \frac{1}{2}\nabla \cdot \boldsymbol{a}\right)\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t\right) \cdot \nabla\Theta - \frac{1}{2}\nabla \cdot \left(\boldsymbol{a}\nabla\Theta\right)\mathrm{d}t = 0, \tag{1a}$$

where the tracer, *Θ*, is a stochastic process with an extensive property (e.g. temperature, salinity, buoyancy), d<sub>*t*</sub>*Θ*(***x***) := *Θ*(***x***, *t* + *δt*) − *Θ*(***x***, *t*) stands for the time-increment of *Θ* at a fixed point ***x*** with *δt* an infinitesimal time variation, ***u*** denotes the time-smooth resolved velocity that is both spatially and temporally correlated, ***σ***<sub>*t*</sub>d***B***<sub>*t*</sub> stands for the highly oscillating unresolved noise component, assumed in the present study to be divergence-free, with its quadratic variation [22] denoted by ***a***, and ½**∇ ·** ***a*** is the so-called Itô-Stokes drift [1] ensuing from the inhomogeneity of the noise. The mathematical definitions of the noise and its quadratic variation are given by

$$\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t(\boldsymbol{x}) = \int_{\mathcal{D}} \breve{\boldsymbol{\sigma}}(\boldsymbol{x}, \boldsymbol{y}, t)\, \mathrm{d}\boldsymbol{B}_t(\boldsymbol{y}) \, \mathrm{d}\boldsymbol{y} = \sum_{n \in \mathbb{N}} \lambda_n^{1/2}(t)\, \boldsymbol{\xi}_n(\boldsymbol{x}, t)\, \mathrm{d}\beta_t^n, \tag{1b}$$

$$\boldsymbol{a}(\boldsymbol{x},t) = \int_{\mathcal{D}} \breve{\boldsymbol{\sigma}}(\boldsymbol{x}, \boldsymbol{y}, t)\, \breve{\boldsymbol{\sigma}}^{\dagger}(\boldsymbol{y}, \boldsymbol{x}, t) \, \mathrm{d}\boldsymbol{y} = \sum_{n \in \mathbb{N}} \lambda_{n}(t) \left(\boldsymbol{\xi}_{n} \boldsymbol{\xi}_{n}^{\dagger}\right)(\boldsymbol{x}, t), \tag{1c}$$

where ***σ***<sub>*t*</sub> is an integral operator defined on the Hilbert space (*L*<sup>2</sup>(𝒟))<sup>*d*</sup> with a bounded spatial domain 𝒟 ⊂ ℝ<sup>*d*</sup> (*d* = 2 or 3), ***σ̆*** = (*σ̆<sub>ij</sub>*)<sub>*i*,*j*=1,...,*d*</sub> is a spatially and temporally bounded matrix kernel of ***σ***<sub>*t*</sub>, *λ<sub>n</sub>* and ***ξ***<sub>*n*</sub> are respectively the eigenvalues and eigenfunctions of the composite operator ***σ***<sub>*t*</sub>***σ***<sub>*t*</sub><sup>∗</sup> (***σ***<sub>*t*</sub><sup>∗</sup> denotes the adjoint of ***σ***<sub>*t*</sub>), •<sup>†</sup> stands for the transpose-conjugate operation, ***B***<sub>*t*</sub> is a cylindrical Wiener process [22] and *β*<sub>*t*</sub><sup>*n*</sup> are independent (one-dimensional) standard Brownian motions.

Under the stochastic transport notations (1), the governing equations of the energy-preserved RSW-LU system [4] read

*(Conservation of momentum)*

$$\mathbb{D}_t \boldsymbol{u} + f_0 \boldsymbol{u}^\perp \, \mathrm{d}t = -g \, \nabla \eta \, \mathrm{d}t, \tag{2a}$$

*(Conservation of mass)*

$$\mathbb{D}_t h + h \nabla \cdot \boldsymbol{u} \,\mathrm{d}t = 0, \tag{2b}$$

*(Incompressible constraints)*

$$\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t = \nabla^\perp \varphi\, \mathrm{d}B_t, \qquad \nabla \cdot \nabla \cdot \boldsymbol{a} = 0, \tag{2c}$$

*(Conservation of energy)*

$$\mathrm{d}_t \int_{\mathcal{D}} \frac{1}{2} \rho \left( h|\boldsymbol{u}|^{2} + g h^{2} \right) \mathrm{d}\boldsymbol{x} = 0, \tag{2d}$$

where ***u*** = [*u*, *v*]<sup>*T*</sup> denotes the two-dimensional horizontal velocity with ***u***<sup>⊥</sup> = [−*v*, *u*]<sup>*T*</sup>, **∇** = [*∂<sub>x</sub>*, *∂<sub>y</sub>*]<sup>*T*</sup> stands for the horizontal gradient with **∇**<sup>⊥</sup> = [−*∂<sub>y</sub>*, *∂<sub>x</sub>*]<sup>*T*</sup>, *h*(***x***, *t*) = *H* + *η*(***x***, *t*) is the water thickness with *H* the flat bottom height and *η* the free-surface position, *f*<sub>0</sub> is a constant approximation of the Coriolis parameter, *g* is the gravity constant, *ρ* is the water density and *ϕ*d*B<sub>t</sub>* denotes a random (scalar) stream function defined in a similar way as in (1b). As shown in [4], the incompressibility conditions (2c) for both the noise and the Itô-Stokes drift ensure the path-wise conservation of the total energy (2d). We remark that the analytical properties of the RSW-LU system have been investigated in [16] and some numerical applications of (2) have been performed in [4].

## **3 Stationary Solution**

We focus now on stationary solutions associated to the previous system. To that end, neglecting the time increments d*tu* and d*th* in (2a) and (2b) and recalling that **∇***η* = **∇***h* due to the flat bottom assumption, we obtain

$$\left( \left( \boldsymbol{u} - \frac{1}{2} \nabla \cdot \boldsymbol{a} \right) \cdot \nabla \boldsymbol{u} - \frac{1}{2} \nabla \cdot (\boldsymbol{a} \nabla \boldsymbol{u}) + f_0 \boldsymbol{u}^\perp + g \nabla h \right) \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d} \boldsymbol{B}_t \cdot \nabla \boldsymbol{u} = 0, \tag{3a}$$

$$\left( \left( \boldsymbol{u} - \frac{1}{2} \nabla \cdot \boldsymbol{a} \right) \cdot \nabla h - \frac{1}{2} \nabla \cdot (\boldsymbol{a} \nabla h) + h \nabla \cdot \boldsymbol{u} \right) \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t \cdot \nabla h = 0, \tag{3b}$$

$$\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t = \nabla^{\perp} \varphi\, \mathrm{d}B_t, \quad \boldsymbol{a} = \nabla^{\perp} \varphi\, (\nabla^{\perp} \varphi)^T. \tag{3c}$$

From the Bichteler-Dellacherie decomposition of a semi-martingale, the martingale terms (i.e. the Brownian terms) and the finite variation terms (i.e. the differentiable terms) can be safely separated. Decomposing the mass equation (3b) in this way and considering the corresponding martingale part leads to ***σ***<sub>*t*</sub>d***B***<sub>*t*</sub> **· ∇***h* = 0. The noise is thus orthogonal to the large-scale surface height gradient, i.e. the gradient of the random stream function is aligned with the large-scale surface height gradient:

$$\boldsymbol{\sigma}_t \mathrm{d} \boldsymbol{B}_t = \alpha \nabla^\perp h. \tag{4}$$

The noise being divergence free yields that *α* is constant along the level sets of *h*:

$$
\nabla \boldsymbol{\alpha} \cdot \nabla^{\perp} h = 0. \tag{5}
$$

It is hence necessarily of the form *α* = *φ*(*h*) + *C*, with *C* a constant and *φ* a differentiable function. For such a noise the variance tensor is given by

$$\boldsymbol{a} = \alpha^2\, \nabla^\perp h\, (\nabla^\perp h)^T, \tag{6}$$

while the divergence of ***a***, which defines the ISD, reads

$$\nabla \cdot \boldsymbol{a} = \alpha^2 \nabla \cdot \left(\nabla^\perp h\, (\nabla^\perp h)^T\right) = \alpha^2 (\nabla^\perp h \cdot \nabla) \nabla^\perp h, \tag{7}$$

and we notice that *a***∇***h* = 0. The mass equation then boils down to

$$\left(\boldsymbol{u} - \frac{1}{2}\nabla \cdot \boldsymbol{a}\right) \cdot \nabla h + h\nabla \cdot \boldsymbol{u} = 0. \tag{8a}$$

In the same way, we get from a semi-martingale decomposition of (3a) that

$$\boldsymbol{\sigma}_t \mathrm{d} \boldsymbol{B}_t \cdot \nabla \boldsymbol{u} = 0, \tag{8b}$$

which yields

$$\left(\boldsymbol{u} - \frac{1}{2}\nabla \cdot \boldsymbol{a}\right) \cdot \nabla\boldsymbol{u} + f_0 \boldsymbol{u}^\perp + g\nabla h = 0. \tag{8c}$$

At this point it is interesting to interpret the random stream function in terms of nondimensional units to infer the importance of the Itô-Stokes drift (ISD) in comparison to geostrophic flows. To that end, we assume that *ϕ*d*B<sub>t</sub>* ∼ √(*εT*) (*gH*/*f*<sub>0</sub>) √*T*, where *T* denotes the characteristic time scale and *εT* stands for a small-scale decorrelation time, with *ε* the strength of uncertainty (the greater *ε*, the stronger the noise). From definition (3c), the noise's quadratic variation then scales as ***a*** ∼ *εT* (*gH*)<sup>2</sup>/(*f*<sub>0</sub>*L*)<sup>2</sup>, with *L* the characteristic length scale. As a consequence (taking *T* = *L*/*U* as the advective time scale, with *U* ∼ *gH*/(*f*<sub>0</sub>*L*) the geostrophic velocity scale), the ratio of the ISD advection term to the geostrophic pressure gradient scales as

$$\frac{(\nabla \cdot \boldsymbol{a}) \cdot \nabla \boldsymbol{u}}{g \nabla h} \sim \epsilon \frac{g H}{f_0^2 L^2} = \epsilon \frac{L_d^2}{L^2} = \epsilon\, \mathrm{Bu}, \tag{9}$$

where *L<sub>d</sub>* = √(*gH*)/*f*<sub>0</sub> is the Rossby deformation radius, and Bu denotes the Burger number, which stands for the ratio between vertical density stratification and the Earth's rotation in the horizontal (Bu = (*NH*/(*f*<sub>0</sub>*L*))<sup>2</sup> = (*L<sub>d</sub>*/*L*)<sup>2</sup>, with *N* the buoyancy frequency). If the ISD advection has the same importance as the pressure gradient term, then we must have √*ε* = *L*/*L<sub>d</sub>*. This means that when the scale of motion is greater than the deformation radius, rotation dominates and the noise must be strong for the ISD to play a role. Conversely, when the scale of motion is smaller than the Rossby radius, the small-scale flow component does not need to be strong to be as significant as the pressure gradient term. From this point of view, the deformation radius can be interpreted as the limiting scale below which the statistically modified advection due to the inhomogeneity of the small-scale component plays a role.
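To give a feel for the orders of magnitude behind the scaling (9), the following minimal sketch evaluates the deformation radius, the Burger number and the noise strength *ε* for which the ISD advection balances the pressure gradient (*ε* Bu = 1). The function names and the numerical values (a reduced gravity, a layer depth and a mid-latitude Coriolis parameter) are illustrative assumptions and are not taken from the chapter.

```python
import math


def deformation_radius(g, H, f0):
    """Rossby deformation radius L_d = sqrt(g H) / f0 (metres)."""
    return math.sqrt(g * H) / f0


def burger_number(g, H, f0, L):
    """Burger number Bu = (L_d / L)^2 for motions of horizontal scale L."""
    return (deformation_radius(g, H, f0) / L) ** 2


def isd_to_pressure_ratio(eps, g, H, f0, L):
    """Scaling (9): ratio of ISD advection to pressure gradient, eps * Bu."""
    return eps * burger_number(g, H, f0, L)


def balancing_noise_strength(g, H, f0, L):
    """Noise strength eps such that eps * Bu = 1, i.e. sqrt(eps) = L / L_d."""
    return 1.0 / burger_number(g, H, f0, L)


# Hypothetical values (assumptions): reduced gravity, layer depth, Coriolis parameter.
g, H, f0 = 0.02, 500.0, 1.0e-4
for L in (10.0e3, 300.0e3):  # one scale below and one above L_d
    print(f"L = {L / 1e3:6.0f} km  Bu = {burger_number(g, H, f0, L):9.4f}  "
          f"eps needed = {balancing_noise_strength(g, H, f0, L):9.4f}")
```

With these assumed values the deformation radius is a few tens of kilometres, so the required *ε* is small for the shorter scale and large for the longer one, in line with the discussion above.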

In the following, we assume that the scale of motion is much larger than the deformation radius, so that the action of the ISD becomes negligible. In that case, the stationary system finally simplifies as

$$(\boldsymbol{u}\cdot\nabla)\boldsymbol{u} + f_0\boldsymbol{u}^\perp + g\nabla h = 0, \qquad \nabla \cdot (\boldsymbol{u}h) = 0. \tag{10}$$

This system is structurally the same as the nonlinear stationary system, with the exception of a scaling constraint on the ISD, which makes the nonlinear advection term negligible. As a matter of fact, noticing that for ***u*** ∝ **∇**<sup>⊥</sup>*h* the advection term corresponds to the ISD (7), it follows that the geostrophic balance, ***u*** = (*g*/*f*<sub>0</sub>)**∇**<sup>⊥</sup>*h* (with the conventions **∇**<sup>⊥</sup> = [−*∂<sub>y</sub>*, *∂<sub>x</sub>*]<sup>*T*</sup> and ***u***<sup>⊥</sup> = [−*v*, *u*]<sup>*T*</sup> of Sect. 2), is a stationary solution of such a system for a null ISD. We show below that it is the only non-trivial stationary solution of such a system.

Let us write the velocity as a superposition of the geostrophic component and an ageostrophic component:

$$\boldsymbol{u} = \frac{g}{f_0} \nabla^\perp h + \boldsymbol{v}, \tag{11}$$

where *v* is defined through the Helmholtz decomposition from a potential function, *Φ*, and a stream function *Ψ* that both depend on the surface elevation:

$$\boldsymbol{v} = \nabla^{\perp} \Psi(h) + \nabla \Phi(h) \tag{12}$$

$$\phantom{\boldsymbol{v}} = \nabla^{\perp}h\,\, \Psi'(h) + \nabla h\,\, \Phi'(h). \tag{13}$$

As **∇**<sup>⊥</sup>*h* belongs to the null space of the velocity gradient (8b), **∇***h* either also belongs to the null space or is an eigenvector of the velocity gradient tensor (**∇*****u***). From the momentum equation (10) we have

$$\left(\nabla h\, \Phi'(h)\right) \cdot \nabla \boldsymbol{u} = -f_0 \,\boldsymbol{v}^\perp.$$

Then if **∇***h* belongs to the null space of the velocity gradient, ***v*** directly cancels out. If **∇***h* is an eigenvector of the velocity gradient with eigenvalue *λ*, the above equation reads, using (**∇***h*)<sup>⊥</sup> = **∇**<sup>⊥</sup>*h* and (**∇**<sup>⊥</sup>*h*)<sup>⊥</sup> = −**∇***h*,

$$\lambda \nabla h\, \Phi'(h) = -f_0 \left( -\nabla h\, \Psi'(h) + \nabla^{\perp} h\, \Phi'(h) \right),$$

which, **∇***h* and **∇**<sup>⊥</sup>*h* being orthogonal, implies *Φ′*(*h*) = *Ψ′*(*h*) = 0 and hence ***v*** = 0.

Physically, we hence see that the considered nonlinear system admits a stationary solution for a negligible ISD. Let us note that no such constraint is available in the deterministic setting. In the following we will interpret this stationary solution in the linearized stochastic shallow water system in terms of wave solutions and geostrophic adjustment.

## **4 Stochastic Rotating Shallow Water Waves**

In order to look at the different wave solutions associated to the stochastic shallow water system (2), we proceed in the same way as in the deterministic case [19, 28]. In particular, we assume that the noise's structure in (1b) is independent of the resolved prognostic variable *u*. In that case, the associated linearized system of (2) reads

$$\mathrm{d}_t\boldsymbol{u} + \left(f_0\boldsymbol{u}^{\perp} + g\nabla\eta - \frac{1}{2}\nabla \cdot (\boldsymbol{a}\nabla\boldsymbol{u})\right)\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t \cdot \nabla\boldsymbol{u} = 0, \tag{14a}$$

$$\mathrm{d}_t\eta + \left(H\nabla \cdot \boldsymbol{u} - \frac{1}{2}\nabla \cdot (\boldsymbol{a}\nabla \eta)\right)\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t \cdot \nabla \eta = 0, \tag{14b}$$

$$\nabla \cdot \boldsymbol{\sigma}_t \mathrm{d} \boldsymbol{B}_t = \nabla \cdot \boldsymbol{a} = 0, \tag{14c}$$

$$\mathrm{d}_t \int_{\mathcal{D}} \frac{1}{2} \rho \left( H|\boldsymbol{u}|^{2} + g (H + \eta)^{2} \right) \mathrm{d}\boldsymbol{x} = 0. \tag{14d}$$

It can be checked that this system conserves the total energy (14d) in the same way as the initial nonlinear stochastic system (2) does. For a noise defined through a Hilbert-Schmidt correlation tensor, this system admits a mild solution [22]. Existence of a strong solution could also be shown from the nonlinear system [16]. In order to build some simple analytical solutions of this linearized system and to better understand the physical behaviour of the resulting waves, only very specific noise models will be considered in the following. We first build the ensemble-mean wave solutions under homogeneous noise, then investigate the path-wise solutions under constant noise and under homogeneous noise with very smooth structures.

## *4.1 Ensemble-Mean Waves Under Homogeneous Noise*

Let us first recall the definition of the homogeneous and incompressible noise:

$$\begin{aligned} \boldsymbol{\sigma}_t(\boldsymbol{x})\mathrm{d}\boldsymbol{B}_t &= \int_{\mathcal{D}} \nabla^{\perp} \varphi(\boldsymbol{x} - \boldsymbol{y})\, \mathrm{d}B_t(\boldsymbol{y}) \, \mathrm{d}\boldsymbol{y} \\ &= \sum_{m} i \boldsymbol{k}_{m}^{\perp}\, \widehat{\varphi}(\boldsymbol{k}_{m})\, \mathrm{d}\beta_t(\boldsymbol{k}_{m}) \exp(i \boldsymbol{k}_{m} \cdot \boldsymbol{x}), \end{aligned} \tag{15a}$$

$$\boldsymbol{a} = \sum_{m} |\widehat{\varphi}(\boldsymbol{k}_m)|^2\, \boldsymbol{k}_m^\perp (\boldsymbol{k}_m^\perp)^T, \tag{15b}$$

where *i* denotes the imaginary unit, ***k***<sub>*m*</sub> = [*k<sub>m</sub>*, *ℓ<sub>m</sub>*]<sup>*T*</sup> ∈ ℝ<sup>2</sup> is the *m*-th wavenumber vector, the hat stands for the Fourier transform (in space) coefficient and *β<sub>t</sub>* ∈ ℂ are independent Brownian motions such that *β<sub>t</sub>*(−***k***<sub>*m*</sub>) is the complex conjugate of *β<sub>t</sub>*(***k***<sub>*m*</sub>), with Re{*β<sub>t</sub>*} and Im{*β<sub>t</sub>*} being independent. This noise is homogeneous, and thus associated with a constant matrix ***a***. Its ISD is null and naturally fits the condition on the stationary solution found in the previous section.

For the sake of simplicity, we assume hereafter that the noise is defined by only one Fourier mode (associated with a wavenumber ***k***<sub>*σ*</sub>) combined with its complex conjugate, namely

$$\boldsymbol{\sigma}_t(\boldsymbol{x}) \mathrm{d}\boldsymbol{B}_t = \mathrm{Re}\{i\boldsymbol{k}_\sigma^\perp \alpha \exp(i\boldsymbol{k}_\sigma \cdot \boldsymbol{x}) \, \mathrm{d}\beta_t\}, \quad \boldsymbol{a} = |\alpha|^2 \boldsymbol{k}_\sigma^\perp (\boldsymbol{k}_\sigma^\perp)^T, \tag{16}$$

where *α* = *ϕ̂*(***k***<sub>*σ*</sub>) is assumed to be deterministic and real. This monochromatic noise can be directly extended to a multi-scale noise model (15a). Results with a multi-scale version of the noise will be shown in the numerical section.

Taking now the expectation (E) of the linearized random system (14), we have

$$\partial_t \mathbb{E}[\boldsymbol{u}] + f_0 \mathbb{E}[\boldsymbol{u}]^\perp + g \nabla \mathbb{E}[\eta] - \frac{1}{2} \nabla \cdot (\boldsymbol{a} \nabla \mathbb{E}[\boldsymbol{u}]) = 0, \tag{17a}$$

$$\partial_t \mathbb{E}[\eta] + H\nabla \cdot \mathbb{E}[\boldsymbol{u}] - \frac{1}{2} \nabla \cdot (\boldsymbol{a} \nabla \mathbb{E}[\eta]) = 0, \tag{17b}$$

where *∂t* denotes the partial time derivative. In order to infer the mean wave solutions, we look for a deterministic ansatz of the form

$$\mathbb{E}[\widetilde{\boldsymbol{q}}]\,(\boldsymbol{x},t) = \widehat{\boldsymbol{q}}_0 \, \exp\left(i\,(\boldsymbol{k}\cdot\boldsymbol{x}-\omega t)\right), \tag{18}$$

where ***q*** = [***u***, *η*]<sup>*T*</sup> = Re{***q̃***} is a compact notation for the prognostic variables of the RSW-LU (14), ***q̂***<sub>0</sub> is the initial constant vector (also assumed to be deterministic) and *ω* is the time-frequency. We remark that Re{𝔼[***q̃***]} = 𝔼[***q***] due to the commutativity of expectation with linear operators.

Injecting next the previous ansatz together with the constant matrix ***a*** (16) into the system (17), we get **L*****q̂***<sub>0</sub> = 0 with

$$\mathbf{L} = \left(-i\omega + \frac{1}{2}|\alpha|^2 (\boldsymbol{k}_{\sigma} \times \boldsymbol{k})^2\right) \mathbf{I}_3 + \begin{bmatrix} 0 & -f_0 & igk \\ f_0 & 0 & ig\ell \\ iHk & iH\ell & 0 \end{bmatrix}, \tag{19}$$

where **I**<sub>3</sub> stands for the 3 × 3 identity matrix, ***k***<sub>*σ*</sub> × ***k*** = *k<sub>σ</sub>ℓ* − *ℓ<sub>σ</sub>k* denotes the cross product between the wavenumber vectors ***k***<sub>*σ*</sub> = [*k<sub>σ</sub>*, *ℓ<sub>σ</sub>*]<sup>*T*</sup> and ***k*** = [*k*, *ℓ*]<sup>*T*</sup>, with |***k***<sub>*σ*</sub> × ***k***| = |***k***<sub>*σ*</sub>||***k***| sin(*θ*) and *θ* the angle between them.

As usual, the dispersion relations are then given by the solution of det*(*L*)* = 0, namely

$$
\omega = -\frac{i}{2}|\alpha|^2 (\mathbf{k}\_{\sigma} \times \mathbf{k})^2, \quad \omega = -\frac{i}{2}|\alpha|^2 (\mathbf{k}\_{\sigma} \times \mathbf{k})^2 \pm \sqrt{gH|\mathbf{k}|^2 + f\_0^2}. \tag{20}
$$

We realize immediately that when the noise has the same direction as the initial wave (i.e. ***k***<sub>*σ*</sub> × ***k*** = 0), these frequencies correspond to the steady and Poincaré (inertia-gravity) waves of the deterministic system [19, 28]. Conversely, when the noise is not aligned with the initial wave, the term −(*i*/2)|*α*|<sup>2</sup>(***k***<sub>*σ*</sub> × ***k***)<sup>2</sup> leads to a damping of the ensemble-mean wave. We next construct the mean plane wave solutions associated with the frequencies (20).
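The dispersion relations (20) can be checked numerically. The following sketch, written under the assumption of arbitrary nondimensional parameter values (they are not taken from the chapter), builds the operator **L** of (19), evaluates the three complex frequencies of (20) and verifies that det(**L**) vanishes for each of them. All function names are hypothetical.

```python
import math


def cross2d(a, b):
    """Scalar 2D cross product a x b = a_1 b_2 - a_2 b_1."""
    return a[0] * b[1] - a[1] * b[0]


def damping_rate(alpha, k_sigma, k):
    """Damping rate 0.5 |alpha|^2 (k_sigma x k)^2 of the ensemble-mean wave."""
    return 0.5 * abs(alpha) ** 2 * cross2d(k_sigma, k) ** 2


def mean_wave_frequencies(k, k_sigma, alpha, g, H, f0):
    """The three complex frequencies of (20): geostrophic mode and two Poincare waves."""
    d = damping_rate(alpha, k_sigma, k)
    omega_p = math.sqrt(g * H * (k[0] ** 2 + k[1] ** 2) + f0 ** 2)
    return [complex(0.0, -d), complex(omega_p, -d), complex(-omega_p, -d)]


def operator_L(omega, k, k_sigma, alpha, g, H, f0):
    """Matrix L of (19) acting on the amplitude vector [u, v, eta]."""
    diag = -1j * omega + damping_rate(alpha, k_sigma, k)
    return [[diag, -f0, 1j * g * k[0]],
            [f0, diag, 1j * g * k[1]],
            [1j * H * k[0], 1j * H * k[1], diag]]


def det3(m):
    """Determinant of a 3 x 3 (complex) matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical nondimensional parameters (assumptions).
g, H, f0 = 1.0, 1.0, 1.0
k, k_sigma, alpha = (1.0, 0.5), (0.3, 1.0), 0.8
for omega in mean_wave_frequencies(k, k_sigma, alpha, g, H, f0):
    residual = abs(det3(operator_L(omega, k, k_sigma, alpha, g, H, f0)))
    print(f"omega = {omega:.4f}   |det L| = {residual:.2e}")
```

All three frequencies share the same negative imaginary part, which is the damping rate of the ensemble mean; it vanishes when `k_sigma` is parallel to `k`.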

#### **4.1.1 Mean Poincaré Waves**

With the value of the last two (opposite) frequencies in (20) and the associated eigenvector of L (19), one obtains the following polarization relations:

$$\mathbb{E}[\widetilde{\boldsymbol{q}}](\boldsymbol{x},t) = \begin{bmatrix} \dfrac{\omega k + if_0 \ell}{H|\boldsymbol{k}|^2} \\[2mm] \dfrac{\omega \ell - if_0 k}{H|\boldsymbol{k}|^2} \\[2mm] 1 \end{bmatrix} \widehat{\eta}_0 \exp\left(i(\boldsymbol{k} \cdot \boldsymbol{x} - \omega t)\right) \exp\left(-\frac{1}{2}|\alpha|^2 (\boldsymbol{k}_\sigma \times \boldsymbol{k})^2 t\right). \tag{21}$$

Taking the real part, with *ω* = ±√(*gH*|***k***|<sup>2</sup> + *f*<sub>0</sub><sup>2</sup>) denoting from now on the real part of the frequency, we finally deduce the ensemble-mean wave solution:

$$\mathbb{E}[\eta](\boldsymbol{x},t) = \widehat{\eta}_0 \cos \left( \boldsymbol{k} \cdot \boldsymbol{x} - \omega t \right) \exp \left( -\frac{1}{2} |\alpha|^2 (\boldsymbol{k}_\sigma \times \boldsymbol{k})^2 t \right), \tag{22a}$$

$$\mathbb{E}\left[\boldsymbol{u}\right] = \mathbb{E}\left[u_{\parallel}\right]\frac{\boldsymbol{k}}{|\boldsymbol{k}|} + \mathbb{E}\left[u_{\perp}\right]\frac{\boldsymbol{k}^{\perp}}{|\boldsymbol{k}|}, \tag{22b}$$

$$\mathbb{E}[u_{\parallel}](\boldsymbol{x},t) = \frac{\widehat{\eta}_{0}\,\omega}{H|\boldsymbol{k}|}\cos\left(\boldsymbol{k}\cdot\boldsymbol{x}-\omega t\right)\exp\left(-\frac{1}{2}|\alpha|^{2}(\boldsymbol{k}_{\sigma}\times\boldsymbol{k})^{2}t\right), \tag{22c}$$

$$\mathbb{E}[u_{\perp}](\boldsymbol{x},t) = \frac{\widehat{\eta}_{0}f_{0}}{H|\boldsymbol{k}|}\sin\left(\boldsymbol{k}\cdot\boldsymbol{x}-\omega t\right)\exp\left(-\frac{1}{2}|\alpha|^{2}(\boldsymbol{k}_{\sigma}\times\boldsymbol{k})^{2}t\right), \tag{22d}$$

where the component 𝔼[*u*<sub>∥</sub>] is associated with the mean pressure waves, which depend on the mean surface elevation, whereas the component 𝔼[*u*<sub>⊥</sub>] is associated with mean vorticity waves that are initiated by rotation. Note that in the short-wave limit, |***k***|<sup>2</sup> ≫ 1/*L<sub>d</sub>*<sup>2</sup>, the mean Poincaré wave corresponds to the inertia-gravity wave of a shallow water system without rotation. The damping term exp(−½|*α*|<sup>2</sup>(***k***<sub>*σ*</sub> × ***k***)<sup>2</sup>*t*) depends on the noise's wavelength and variance. The damping rate is zero (the exponential factor equals one) when the noise and the wave are collinear (i.e. the angle *θ* between ***k***<sub>*σ*</sub> and ***k*** is zero). For high noise magnitude (and *θ* ≠ 0), the damping occurs very quickly. In the long-wave limit, |***k***|<sup>2</sup> ≪ 1/*L<sub>d</sub>*<sup>2</sup>, the frequency approaches the Coriolis frequency and the damping term is much less important unless the noise amplitude is very high. Nevertheless, the exponential damping in time remains when the noise and the waves are not aligned.
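The dependence of the damping on the angle between the noise and the wave, and on the wavelength, can be made concrete with a small sketch. It computes the e-folding time of the mean wave amplitude, i.e. the inverse of the rate ½|*α*|<sup>2</sup>(|***k***<sub>*σ*</sub>||***k***| sin *θ*)<sup>2</sup> appearing in (22). The function name and the nondimensional values are assumptions chosen for illustration only.

```python
import math


def mean_wave_efolding_time(alpha, k_sigma_norm, k_norm, theta):
    """Time after which the ensemble-mean amplitude in (22) is divided by e.

    The damping rate is 0.5 |alpha|^2 (|k_sigma| |k| sin(theta))^2; the time is
    infinite when the rate is zero (noise and wave collinear).
    """
    rate = 0.5 * abs(alpha) ** 2 * (k_sigma_norm * k_norm * math.sin(theta)) ** 2
    return math.inf if rate == 0.0 else 1.0 / rate


# Hypothetical nondimensional values: noise amplitude and wavenumber norms.
alpha, k_sigma_norm = 0.8, 1.0
for k_norm in (0.5, 1.0, 2.0):           # from longer to shorter waves
    for theta_deg in (0, 30, 90):        # angle between k_sigma and k
        tau = mean_wave_efolding_time(alpha, k_sigma_norm, k_norm,
                                      math.radians(theta_deg))
        print(f"|k| = {k_norm:3.1f}  theta = {theta_deg:2d} deg  e-folding time = {tau:.3f}")
```

In this sketch the e-folding time scales as 1/|***k***|<sup>2</sup>, so that longer waves are damped more slowly, and as 1/|*α*|<sup>2</sup>, so that a stronger noise damps the mean wave faster.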

#### **4.1.2 Mean Geostrophic Mode**

The polarization for the eigenvalue *ω* = −(*i*/2)|*α*|<sup>2</sup>(***k***<sub>*σ*</sub> × ***k***)<sup>2</sup> reads

$$\mathbb{E}[\widetilde{\boldsymbol{q}}](\boldsymbol{x},t) = \begin{bmatrix} -i\dfrac{g}{f_0}\ell \\[2mm] i\dfrac{g}{f_0}k \\[2mm] 1 \end{bmatrix} \widehat{\eta}_0 \exp\left(i\boldsymbol{k}\cdot\boldsymbol{x}\right) \exp\left(-\frac{1}{2}|\alpha|^2(\boldsymbol{k}_\sigma \times \boldsymbol{k})^2t\right). \tag{23}$$

Taking the real part and rescaling the amplitude by *f*<sub>0</sub>, the ensemble means of the wave solutions are given by

$$\mathbb{E}[\eta] = f_0 \widehat{\eta}_0 \cos \left( \boldsymbol{k} \cdot \boldsymbol{x} \right) \exp \left( -\frac{1}{2} |\alpha|^2 (\boldsymbol{k}_\sigma \times \boldsymbol{k})^2 t \right), \tag{24a}$$

$$\mathbb{E}[\boldsymbol{u}] = -g\widehat{\eta}_0\, \boldsymbol{k}^\perp \sin \left( \boldsymbol{k} \cdot \boldsymbol{x} \right) \exp \left( -\frac{1}{2} |\alpha|^2 (\boldsymbol{k}_\sigma \times \boldsymbol{k})^2 t \right). \tag{24b}$$

This is a pure vorticity wave. When the noise and the wave are aligned, it corresponds to a steady wave in geostrophic balance, which is also, as we saw, a stationary solution of the nonlinear stochastic system associated with a divergence-free quadratic variation (10). We next look at the path-wise wave solutions under specific noise.
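The polarization relations (21) and (23) can be verified in a few lines. Because the damping term multiplies all components of the mean wave equally, it is enough to check that each polarization vector lies in the null space of the deterministic operator (the matrix of (19) with the real frequency on the diagonal). The sketch below does this for arbitrary nondimensional parameters; the function names and values are assumptions made for illustration.

```python
import math


def deterministic_operator(omega, k, g, H, f0):
    """Matrix of (19) with -i*omega on the diagonal (omega is the real frequency)."""
    kx, ly = k
    return [[-1j * omega, -f0, 1j * g * kx],
            [f0, -1j * omega, 1j * g * ly],
            [1j * H * kx, 1j * H * ly, -1j * omega]]


def matvec(m, v):
    """Product of a 3 x 3 matrix with a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def poincare_polarization(omega, k, H, f0):
    """Amplitude vector [u, v, eta] of (21) for unit surface elevation."""
    kx, ly = k
    k2 = kx ** 2 + ly ** 2
    return [(omega * kx + 1j * f0 * ly) / (H * k2),
            (omega * ly - 1j * f0 * kx) / (H * k2),
            1.0]


def geostrophic_polarization(k, g, f0):
    """Amplitude vector [u, v, eta] of (23) for unit surface elevation."""
    kx, ly = k
    return [-1j * g * ly / f0, 1j * g * kx / f0, 1.0]


# Hypothetical nondimensional parameters (assumptions).
g, H, f0, k = 1.0, 1.0, 1.0, (1.0, 0.5)
omega_p = math.sqrt(g * H * (k[0] ** 2 + k[1] ** 2) + f0 ** 2)

for omega in (omega_p, -omega_p):
    res = matvec(deterministic_operator(omega, k, g, H, f0),
                 poincare_polarization(omega, k, H, f0))
    print("Poincare residual:", max(abs(r) for r in res))

q_geo = geostrophic_polarization(k, g, f0)
res = matvec(deterministic_operator(0.0, k, g, H, f0), q_geo)
print("geostrophic residual:", max(abs(r) for r in res))
print("divergence k.u of geostrophic mode:", abs(k[0] * q_geo[0] + k[1] * q_geo[1]))
```

The last line illustrates that the geostrophic mode is non-divergent, consistent with its interpretation as a pure vorticity wave.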

## *4.2 Path-Wise Waves Under Constant Noise*

As an initial informative example, we first assume that the noise is constant in space using a zeroth order approximation of the Fourier mode exp*(ik<sup>σ</sup>* **·** *x)* in (16). It can be expressed as

$$\boldsymbol{\sigma}_t \mathrm{d} \boldsymbol{B}_t = \mathrm{Re}\{i\alpha \boldsymbol{k}_\sigma^\perp \mathrm{d} \beta_t\}, \quad \boldsymbol{a} = \alpha^2 \boldsymbol{k}_\sigma^\perp (\boldsymbol{k}_\sigma^\perp)^T. \tag{25}$$

In order to infer wave solutions, we look for a stochastic ansatz of the form

$$\widetilde{\boldsymbol{q}}\,(\boldsymbol{x},t) = \widehat{\boldsymbol{q}}_0 \exp\left(i\left(\boldsymbol{k}\cdot\boldsymbol{x} - \omega t - \mathrm{Re}\{i\gamma\beta_t\}\right)\right), \tag{26a}$$

where *γ* has units of s<sup>−1/2</sup>. This ansatz has been shown in [8] to be a solution of a linear stochastic water wave model for constant noise. Applying now the Itô formula [22] to the deterministic function (*t*, *z*) ↦ ***q̂***<sub>0</sub> exp(*i*(*c* − *ωt* − *z*)), with *c* = ***k*** **·** ***x*** fixed, composed with the random process Re{*iγβ<sub>t</sub>*}, we deduce

$$\mathrm{d}_t\widetilde{\boldsymbol{q}} = -\left(\left(i\omega + \frac{1}{2}|\gamma|^{2}\right)\mathrm{d}t + i\,\mathrm{Re}\{i\gamma\,\mathrm{d}\beta_t\}\right)\widetilde{\boldsymbol{q}}, \tag{26b}$$

where the term ½|*γ*|<sup>2</sup> on the right-hand side (RHS) comes from the quadratic variation of the random phase.
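The role of the Itô correction in (26b) can be illustrated by a Monte Carlo experiment. Taking the expectation of (26b) removes the martingale term, so that the mean of the random phase factor decays as exp(−½*γ*<sup>2</sup>*t*). The sketch below samples the random phase for a real *γ*, assuming that Im{*β<sub>t</sub>*} is a standard real Brownian motion (variance *t*), and compares the empirical mean with this decay. Function names, parameter values and the sample size are illustrative assumptions.

```python
import cmath
import math
import random


def mean_random_phase(gamma, t, n_samples, seed=0):
    """Monte Carlo estimate of E[exp(-i Re{i gamma beta_t})] for real gamma.

    Assumption: Im{beta_t} ~ N(0, t), so that Re{i gamma beta_t} = -gamma Im{beta_t}.
    """
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_samples):
        im_beta = rng.gauss(0.0, math.sqrt(t))
        z = -gamma * im_beta
        total += cmath.exp(-1j * z)
    return total / n_samples


def ito_decay_factor(gamma, t):
    """Decay factor exp(-0.5 gamma^2 t) produced by the drift term of (26b)."""
    return math.exp(-0.5 * gamma ** 2 * t)


# Hypothetical values of gamma (in s^{-1/2}) and of the time t (in s).
gamma, t = 1.5, 1.0
estimate = mean_random_phase(gamma, t, 100000, seed=2)
print("Monte Carlo mean of the phase factor:", estimate)
print("exp(-gamma^2 t / 2)                 :", ito_decay_factor(gamma, t))
```

The imaginary part of the estimate is close to zero, since the random phase is symmetrically distributed around zero.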

Injecting next the stochastic ansatz (26) as well as the noise definition (25) into the linearized system (14), we obtain a system composed of differentiable terms and Brownian (martingale) terms that can be compactly written as

$$\mathbf{L}\widehat{\boldsymbol{q}}_{0}\,\mathrm{d}t = 0, \quad \mathbf{L} = \left(-i\omega - \frac{1}{2}|\gamma|^{2} + \frac{1}{2}\alpha^{2}(\boldsymbol{k}_{\sigma}\times\boldsymbol{k})^{2}\right)\mathbf{I}_{3} + \begin{bmatrix} 0 & -f_{0} & igk \\ f_{0} & 0 & ig\ell \\ iHk & iH\ell & 0 \end{bmatrix}, \tag{27a}$$

$$-i\,\mathrm{Re}\{i\gamma\,\mathrm{d}\beta_t\} + i\boldsymbol{k}\cdot\mathrm{Re}\{i\alpha\boldsymbol{k}_{\sigma}^{\perp}\mathrm{d}\beta_t\} = 0. \tag{27b}$$

The last equation leads to

$$
\gamma = \alpha \mathbf{k}\_{\sigma} \times \mathbf{k} \in \mathbb{R}.\tag{28}
$$

Substituting it into (27a), we deduce

$$\mathbf{L} = \begin{bmatrix} -i\boldsymbol{\omega} - f\_0 \ \dot{\boldsymbol{g}}\boldsymbol{k} \\ f\_0 \ -i\boldsymbol{\omega} \ \dot{\boldsymbol{g}}\boldsymbol{\ell} \\ \dot{\boldsymbol{i}}\boldsymbol{H}\boldsymbol{k} \ \dot{\boldsymbol{i}}\boldsymbol{H}\boldsymbol{\ell} \ -i\boldsymbol{\omega} \end{bmatrix}. \tag{29a}$$

We remark that in this case the linear operator **L** for the resolved variables reduces to that of the deterministic system [19, 28]. Solving det(**L**) = 0 then gives

$$
\omega = 0, \quad \omega = \pm \sqrt{gH|\mathbf{k}|^2 + f\_0^2}. \tag{29b}
$$
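The dispersion relation (29b) can be checked numerically. The sketch below is only an illustration: it writes the operator in (29a) as $\mathbf{L} = -i\omega\mathbf{I}_3 + \mathbf{M}$, so that $\mathbf{L}\boldsymbol{q} = 0$ becomes the eigenvalue problem $\mathbf{M}\boldsymbol{q} = i\omega\boldsymbol{q}$. The function names are hypothetical, and the parameter values are borrowed from the configuration quoted in Sect. 4.4.

```python
import numpy as np

def rsw_operator(k, l, g=9.81, H=100.0, f0=1e-4):
    """Off-diagonal part M of L = -i*omega*I + M in (29a), with q = (u, v, eta)."""
    return np.array([[0.0, -f0, 1j * g * k],
                     [f0, 0.0, 1j * g * l],
                     [1j * H * k, 1j * H * l, 0.0]], dtype=complex)

def frequencies(k, l, **params):
    """Frequencies omega solving det(L) = 0, i.e. omega = eig(M) / i, sorted."""
    eigvals = np.linalg.eigvals(rsw_operator(k, l, **params))
    return np.sort((eigvals / 1j).real)

# Wavenumber of the large-scale wave used later in the numerical section
k, l = 3 * 2 * np.pi / 5.12e6, 0.0
omega_num = frequencies(k, l)
omega_ref = np.sqrt(9.81 * 100.0 * (k**2 + l**2) + 1e-4**2)  # Poincare branch of (29b)
print(omega_num, omega_ref)
```

The three computed values correspond to the geostrophic mode and to the two Poincaré branches.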

We next construct the stochastic plane wave solutions associated to these frequencies.

#### **4.2.1 Stochastic Poincaré Waves**

With the value of the last two frequencies in (29b) and the associated eigenvector of L, one can find the following polarization relations:

$$
\widetilde{\boldsymbol{q}}(\mathbf{x},t) = \begin{bmatrix}
\frac{\omega k + if_0 \ell}{H|\mathbf{k}|^2} \\
\frac{\omega \ell - if_0 k}{H|\mathbf{k}|^2} \\
1
\end{bmatrix} \widehat{\eta}_0 \exp\left(i\left(\mathbf{k}\cdot\mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\}\right)\right).
\tag{30}
$$

Taking the real part of this ansatz, we deduce the path-wise wave solution:

$$\eta(\mathbf{x},t) = \widehat{\eta}_0 \cos \left(\mathbf{k} \cdot \mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\} \right), \tag{31a}$$

$$\begin{aligned}
\mathbf{u}(\mathbf{x},t) &= \frac{\widehat{\eta}_0 \omega}{H|\mathbf{k}|} \cos \left(\mathbf{k} \cdot \mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\} \right) \frac{\mathbf{k}}{|\mathbf{k}|} \\
&\quad + \frac{\widehat{\eta}_0 f_0}{H|\mathbf{k}|} \sin \left(\mathbf{k} \cdot \mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\} \right) \frac{\mathbf{k}^\perp}{|\mathbf{k}|}.
\end{aligned} \tag{31b}$$

In this simple case, we can analytically compute the ensemble mean of these path-wise wave solutions. Note that $X_t := \mathbf{k}\cdot\mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\} \sim \mathcal{N}(\mathbf{k}\cdot\mathbf{x} - \omega t,\ \gamma^2 t)$, and the characteristic function of the Gaussian process $X_t$ is given by $\mathbb{E}[\exp(iX_t)] = \exp\big(i(\mathbf{k}\cdot\mathbf{x} - \omega t) - \tfrac{1}{2}\gamma^2 t\big)$. One can then deduce the mean of the random ansatz (26); taking its real part finally leads to

$$\mathbb{E}[\eta] = \widehat{\eta}_0 \cos \left( \mathbf{k} \cdot \mathbf{x} - \omega t \right) \exp \left( -\frac{1}{2} \gamma^2 t \right), \tag{32a}$$

$$\mathbb{E}[\mathbf{u}] = \frac{\widehat{\eta}_0}{H|\mathbf{k}|^2} \Big( \omega \mathbf{k} \cos \left( \mathbf{k} \cdot \mathbf{x} - \omega t \right) + f_0 \mathbf{k}^\perp \sin \left( \mathbf{k} \cdot \mathbf{x} - \omega t \right) \Big) \exp \left( -\frac{1}{2} \gamma^2 t \right). \tag{32b}$$

It can be readily observed from the random dispersion relation (28), that we recover the general mean solution (22) presented in the previous section.
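The damping of the ensemble mean can be reproduced with a small Monte-Carlo experiment. The following is a minimal sketch, assuming unit wave amplitude and arbitrary (hypothetical) values for $\gamma$, $t$ and the deterministic phase $\mathbf{k}\cdot\mathbf{x} - \omega t$; it samples $\mathrm{Im}\{\beta_t\}$ with variance $t$, as implied by the variance $\gamma^2 t$ quoted above, and compares the sample mean of the path-wise elevation (31a) with the closed form (32a).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values chosen only for illustration
gamma = 0.5        # random dispersion coefficient, s^{-1/2}
t = 3.0            # time, s
phase = 0.7        # deterministic phase k.x - omega*t, rad
n_samples = 200_000

# Im{beta_t} ~ N(0, t), so that gamma * Im{beta_t} has variance gamma^2 * t
im_beta = rng.normal(0.0, np.sqrt(t), size=n_samples)

# Path-wise elevation (31a) with unit amplitude: every sample keeps amplitude <= 1
eta_paths = np.cos(phase + gamma * im_beta)

mc_mean = eta_paths.mean()
analytic = np.cos(phase) * np.exp(-0.5 * gamma**2 * t)   # closed form (32a)
print(mc_mean, analytic)
```

Each realization keeps the full wave amplitude; only the average over realizations is damped, because the random phases decorrelate in time.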

#### **4.2.2 Stochastic Geostrophic Mode**

The polarization for the eigenvalue *ω* = 0 reads

$$
\widetilde{\boldsymbol{q}}(\mathbf{x},t) = \begin{bmatrix} -i\frac{g}{f_0}\ell \\ i\frac{g}{f_0}k \\ 1 \end{bmatrix} \widehat{\eta}_0 \exp\left(i\left(\mathbf{k}\cdot\mathbf{x} + \gamma\,\mathrm{Im}\{\beta_t\}\right)\right). \tag{33}
$$

The path-wise wave solutions are then given by

$$\eta = f_0 \widehat{\eta}_0 \cos \left( \mathbf{k} \cdot \mathbf{x} + \gamma\,\mathrm{Im}\{\beta_t\} \right), \quad \mathbf{u} = -g \widehat{\eta}_0 \mathbf{k}^\perp \sin \left( \mathbf{k} \cdot \mathbf{x} + \gamma\,\mathrm{Im}\{\beta_t\} \right). \tag{34}$$

Similarly, one can recover the general mean solution (24) by taking the expectation of these path-wise solutions. As in the previous case, only the waves that are excited by the stochastic forcing remain active over long time horizons.

# *4.3 Approximation of Path-Wise Waves Under Homogeneous Noise*

We now extend the previous solution to statistically homogeneous noise. In the same way as previously we will assume a monochromatic noise as defined in (16), but now slowly varying in space:

$$\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t = \mathrm{Re}\{i\alpha \mathbf{k}_\sigma^\perp \exp(i\epsilon \mathbf{k}_\sigma \cdot \mathbf{x})\, \mathrm{d}\beta_t\}, \quad \boldsymbol{a} = |\alpha|^2 \mathbf{k}_\sigma^\perp (\mathbf{k}_\sigma^\perp)^T,\tag{35}$$

where *ε* ≪ 1 is a small parameter that ensures a smooth spatial structure of the noise. To infer wave solutions for such homogeneous noise, we now consider the following ansatz:

$$\widetilde{\boldsymbol{q}}\left(\mathbf{x},t\right) = \widehat{\boldsymbol{q}}_0 \exp\left(i\left(\mathbf{k}\cdot\mathbf{x} - \omega t - \mathrm{Re}\left\{i\gamma\exp(i\epsilon\mathbf{k}_\sigma\cdot\mathbf{x})\beta_t\right\}\right)\right),\tag{36a}$$

which generalizes our previous ansatz to homogeneous noise. Applying the Itô formula for this ansatz, we have

$$\mathrm{d}_t\widetilde{\boldsymbol{q}} = -\left( \left(i\omega + \frac{1}{2}|\gamma|^2\right)\mathrm{d}t + i\,\mathrm{Re}\{i\gamma \exp(i\epsilon\,\mathbf{k}_{\sigma}\cdot\mathbf{x})\,\mathrm{d}\beta_t \} \right) \widetilde{\boldsymbol{q}}.\tag{36b}$$

Injecting this ansatz into system (14), we can again separate the Brownian parts from the time-differentiable components. The former reads

$$-i\,\mathrm{Re}\left\{i\gamma\exp(i\epsilon\mathbf{k}_{\sigma}\cdot\mathbf{x})\,\mathrm{d}\beta_t\right\}+i\mathbf{k}\cdot\mathrm{Re}\left\{i\alpha\mathbf{k}_{\sigma}^{\perp}\exp(i\epsilon\mathbf{k}_{\sigma}\cdot\mathbf{x})\,\mathrm{d}\beta_t\right\}=0,\tag{37a}$$

which leads to

$$
\gamma = \alpha \mathbf{k}\_{\sigma} \times \mathbf{k} \in \mathbb{R}.\tag{37b}
$$

In a similar way to the previous case, substituting this random dispersion relation into the diagonal component of the resolved linear operator satisfying $\mathbf{L}\widehat{\boldsymbol{q}}_0\,\mathrm{d}t = 0$, we get $\operatorname{diag}(\mathbf{L}) = -i\omega\mathbf{I}_3$. However, in order to compute the gradient terms of the antisymmetric part of **L**, the random phase is linearized as

$$\begin{aligned}
\widetilde{\boldsymbol{q}}(\mathbf{x},t) &\approx \widehat{\boldsymbol{q}}_0 \exp\Big(i\big(\mathbf{k}\cdot\mathbf{x} - \omega t - \gamma\,\mathrm{Re}\big\{i\left(1 + i\epsilon\,\mathbf{k}_\sigma \cdot \mathbf{x}\right)\beta_t\big\}\big)\Big) \\
&= \widehat{\boldsymbol{q}}_0 \exp\Big(i\big(\underbrace{\left(\mathbf{k} + \epsilon\,\mathbf{k}_\sigma\,\gamma\,\mathrm{Re}\{\beta_t\}\right)}_{:=\widetilde{\mathbf{k}}_t}\cdot \mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\}\big)\Big).
\end{aligned} \tag{38}$$

Hereafter, $\widetilde{\mathbf{k}}_t = [\widetilde{k}_t, \widetilde{\ell}_t]^T$ is referred to as the effective wavenumber vector ensuing from the space-varying random phase. It can be noticed that the real component of the complex Brownian path influences the wave's spatial phase, whereas the wave's temporal phase is randomized by the imaginary component. The latter has already been considered in the constant noise case. With the previous approximation, the linear operator **L** can finally be written as

$$\mathbf{L} = \begin{bmatrix} -i\omega & -f_0 & ig\widetilde{k}_t \\ f_0 & -i\omega & ig\widetilde{\ell}_t \\ iH\widetilde{k}_t & iH\widetilde{\ell}_t & -i\omega \end{bmatrix}. \tag{39a}$$

The two resulting dispersion relations are now given by

$$
\omega = 0, \quad \omega = \pm \sqrt{gH \left| \widetilde{\mathbf{k}}_t \right|^{2} + f_0^{2}}. \tag{39b}
$$

The latter random dispersion relation reduces to the previous relation (29b) (associated with a spatially constant noise) when *ε* = 0. We note that the homogeneous random noise leads to wave scattering. This phenomenon is consistent with similar results found in the setting of the Wentzel-Kramers-Brillouin approximation [15, 21] or, more recently, through the Wigner transform [7, 13, 14]. The stochastic framework explored here nevertheless leads to simpler formal developments.
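To make the role of the effective wavenumber concrete, the sketch below evaluates $\widetilde{\mathbf{k}}_t$ from (38) and the Poincaré branch of (39b) for given values of $\mathrm{Re}\{\beta_t\}$. It is only an illustration: the function names, the noise amplitude `alpha`, the value of `eps` and the sampled Brownian values are assumptions, while $g$, $H$, $f_0$ and the wavenumbers follow the configuration quoted in Sect. 4.4.

```python
import numpy as np

def effective_wavenumber(k_vec, k_sigma, alpha, eps, re_beta):
    """Effective wavenumber of (38) and the coefficient gamma of (37b)."""
    # 2D cross product k_sigma x k
    gamma = alpha * (k_sigma[0] * k_vec[1] - k_sigma[1] * k_vec[0])
    k_eff = k_vec + eps * gamma * re_beta * k_sigma
    return k_eff, gamma

def poincare_frequency(k_eff, g=9.81, H=100.0, f0=1e-4):
    """Positive branch of the random dispersion relation (39b)."""
    return np.sqrt(g * H * np.dot(k_eff, k_eff) + f0**2)

dk = 2 * np.pi / 5.12e6
k_vec = np.array([3 * dk, 0.0])        # wave
k_sigma = np.array([4 * dk, 6 * dk])   # noise
alpha, eps = 1e7, 0.1                  # hypothetical amplitude (m^2 s^{-1/2}) and small parameter

rng = np.random.default_rng(0)
re_beta_samples = rng.normal(0.0, np.sqrt(1e4), size=5)   # Re{beta_t} at t = 1e4 s (assumed)
for re_beta in re_beta_samples:
    k_eff, gamma = effective_wavenumber(k_vec, k_sigma, alpha, eps, re_beta)
    print(k_eff / dk, poincare_frequency(k_eff))
```

Setting `eps = 0`, or choosing a noise wavenumber parallel to the wave, recovers the deterministic wavenumber and hence relation (29b).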

In the same way as previously, we exhibit in the following the two types of stochastic waves associated to this spatially slowly varying homogeneous noise.

#### **4.3.1 Stochastic Poincaré Waves**

The path-wise wave solution in this case can be written as

$$\eta(\mathbf{x},t) = \widehat{\eta}_0 \cos \left(\widetilde{\mathbf{k}}_t \cdot \mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\}\right), \tag{40a}$$

$$\begin{aligned}
\mathbf{u}(\mathbf{x},t) &= \frac{\widehat{\eta}_0\,\omega}{H|\widetilde{\mathbf{k}}_t|} \cos\left(\widetilde{\mathbf{k}}_t \cdot \mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\}\right) \frac{\widetilde{\mathbf{k}}_t}{|\widetilde{\mathbf{k}}_t|} \\
&\quad + \frac{\widehat{\eta}_0\,f_0}{H|\widetilde{\mathbf{k}}_t|} \sin\left(\widetilde{\mathbf{k}}_t \cdot \mathbf{x} - \omega t + \gamma\,\mathrm{Im}\{\beta_t\}\right) \frac{\widetilde{\mathbf{k}}_t^{\perp}}{|\widetilde{\mathbf{k}}_t|}.
\end{aligned} \tag{40b}$$

It can be remarked that the surface elevation phase has two sources of randomness: a temporal one, in the modified frequency, and a spatial one, coming from the space-varying noise. For *k* = *k<sub>σ</sub>* the solutions correspond again to the classical deterministic waves, while, as shown in Sect. 4.1, when *k* ≠ *k<sub>σ</sub>*, the mean of the stochastic wave solutions (22) is damped compared to the deterministic surface elevation ($h_d(\mathbf{x},t)$).

#### **4.3.2 Stochastic Geostrophic Mode**

The path-wise wave solutions are given by

$$\eta = f_0 \widehat{\eta}_0 \cos \left( \widetilde{\mathbf{k}}_t \cdot \mathbf{x} + \gamma\,\mathrm{Im}\{\beta_t\} \right), \quad \mathbf{u} = -g \widehat{\eta}_0 \widetilde{\mathbf{k}}_t^\perp \sin \left( \widetilde{\mathbf{k}}_t \cdot \mathbf{x} + \gamma\,\mathrm{Im}\{\beta_t\} \right). \tag{41}$$

These solutions correspond to a dispersive wave packet of geostrophic modes. The velocity wave is a pure vorticity wave packet. Its ensemble mean is also damped for *k* ≠ *k<sub>σ</sub>* and corresponds to the geostrophic stationary wave (24) for *k* = *k<sub>σ</sub>*.

As a final word on the general linear stochastic shallow water system, it can be noticed that in the long-wave limit $|\mathbf{k}|^2 \ll 1/L_d^2$ (when the frequency approaches the Coriolis frequency), the pressure gradient force becomes negligible compared to the other terms in (14a), and for a noise amplitude of order unity the linear shallow water system boils down to a linear stochastic transport equation in a rotating frame:

$$\mathrm{d}_t\mathbf{u} + \left(f_0\mathbf{u}^{\perp} - \frac{1}{2}\nabla \cdot (\boldsymbol{a}\nabla\mathbf{u})\right)\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t \cdot \nabla\mathbf{u} = 0. \tag{42}$$

Up to the diffusion and noise terms (whose energies balance exactly), this closely corresponds to the so-called near-inertial regime in which the fluid is purely transported. In the LU setting, the noise always acts as a random deviation whose energy is exactly compensated by the diffusion term. In the particular case of the shallow water model (linear or nonlinear), the total energy is conserved. For the linear system, since the total energy of the mean is damped up to a constant, the total energy variance increases up to a constant as a consequence of total energy conservation.

## *4.4 Numerical Illustrations*

We next give simple illustrations of the stochastic wave solutions. Here, rather than evaluating the analytical solutions presented in the previous sections, we propose to discretize numerically the linearized RSW-LU system (14) and perform Monte-Carlo simulations in order to estimate both path-wise and ensemble-mean wave solutions. To that end, the spectral (Fourier) method is adopted for the spatial discretization within a periodic domain, and an exponential integrator [8] combined with the Milstein scheme [9] is used to approximate the mild solution. This semidiscrete problem can be written as

$$
\widehat{\boldsymbol{q}}^{(1)} = -i\mathbf{k} \cdot \widehat{\big(\boldsymbol{q}_t\, \boldsymbol{\sigma}_t \Delta \boldsymbol{B}_t\big)},
\tag{43a}
$$

$$
\widehat{\boldsymbol{q}}^{(2)} = -i\mathbf{k} \cdot \widehat{\big(\boldsymbol{q}^{(1)}\, \boldsymbol{\sigma}_t \Delta \boldsymbol{B}_t\big)},
\tag{43b}
$$

$$\widehat{\boldsymbol{q}}_{t+\Delta t} = \exp(\boldsymbol{A}\Delta t)\left(\widehat{\boldsymbol{q}}_t + \widehat{\boldsymbol{q}}^{(1)} + \frac{1}{2}\widehat{\boldsymbol{q}}^{(2)}\right),\tag{43c}$$

$$\boldsymbol{A} = \begin{bmatrix} 0 & f_0 & -igk \\ -f_0 & 0 & -ig\ell \\ -iHk & -iH\ell & 0 \end{bmatrix}, \quad \boldsymbol{q} = \begin{bmatrix} u \\ v \\ \eta \end{bmatrix}, \tag{43d}$$

where $\widehat{\bullet}$ denotes the projection coefficient on the discrete Fourier modes, $\boldsymbol{q} = \mathcal{F}^{-1}(\widehat{\boldsymbol{q}})$ is the inverse discrete Fourier transform of $\widehat{\boldsymbol{q}}$, and $\Delta t$ and $\Delta\boldsymbol{B}_t$ stand for the time step and the Brownian motion increment, respectively. We remark that the classical 2/3 dealiasing rule can be adopted for the practical computation of (43a) and (43b).
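The structure of the scheme (43) can be illustrated on a single Fourier mode under constant noise with real $\alpha$, where $\boldsymbol{\sigma}_t\Delta\boldsymbol{B}_t$ is uniform in space and $-i\mathbf{k}\cdot\boldsymbol{\sigma}_t\Delta\boldsymbol{B}_t$ reduces to $i\gamma\,\Delta\mathrm{Im}\{\beta_t\}$. The following is a minimal sketch under these assumptions, not the authors' implementation: the value of `gamma` is hypothetical, the matrix exponential is computed by an eigendecomposition of $\boldsymbol{A}$ rather than by a dedicated exponential-integrator library, and no dealiasing is needed for a single mode.

```python
import numpy as np

g, H, f0 = 9.81, 100.0, 1e-4                 # values quoted in Sect. 4.4
k, l = 3 * 2 * np.pi / 5.12e6, 0.0           # wave (k, l)
dt = 5 * 40e3 / np.sqrt(g * H)               # time step, s
gamma = 1e-4                                 # hypothetical alpha * (k_sigma x k), s^{-1/2}

# Linear operator (43d)
A = np.array([[0.0, f0, -1j * g * k],
              [-f0, 0.0, -1j * g * l],
              [-1j * H * k, -1j * H * l, 0.0]], dtype=complex)

# exp(A*dt) via eigendecomposition (the eigenvalues 0 and +-i*omega are distinct)
w, V = np.linalg.eig(A)
expA = (V * np.exp(w * dt)) @ np.linalg.inv(V)

# Poincare polarization used as initial condition
k2 = k**2 + l**2
omega = np.sqrt(g * H * k2 + f0**2)
q0 = np.array([(omega * k + 1j * f0 * l) / (H * k2),
               (omega * l - 1j * f0 * k) / (H * k2),
               1.0], dtype=complex)

def energy(q):
    """Energy of one mode: 0.5 * (H*|u|^2 + g*|eta|^2)."""
    return 0.5 * (H * (abs(q[0])**2 + abs(q[1])**2) + g * abs(q[2])**2)

def step(q, d_im_beta):
    """One step of (43a)-(43c) for a spatially constant noise."""
    theta = gamma * d_im_beta
    q1 = 1j * theta * q                      # (43a)
    q2 = 1j * theta * q1                     # (43b)
    return expA @ (q + q1 + 0.5 * q2)        # (43c)

rng = np.random.default_rng(1)
q = q0.copy()
for _ in range(1000):
    q = step(q, rng.normal(0.0, np.sqrt(dt)))
print(energy(q) / energy(q0))
```

In this reduced setting the exponential factor advances the deterministic wave exactly, and the Milstein correction keeps the path-wise energy nearly constant, in line with the energy conservation discussed above.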

A deterministic monochromatic wave corresponding to a single frequency of the Poincaré waves (propagating to the left) is fixed as the initial condition:

$$
\widehat{\boldsymbol{q}}_{0} = \begin{bmatrix}
\frac{\omega k + if_{0}\ell}{H|\mathbf{k}|^{2}} \\
\frac{\omega \ell - if_{0}k}{H|\mathbf{k}|^{2}} \\
1
\end{bmatrix} \delta(\mathbf{k} - \mathbf{k}_{0}), \qquad \omega = \sqrt{gH|\mathbf{k}|^{2} + f_{0}^{2}},
\tag{44}
$$

where *δ* denotes the Dirac function and *k*<sup>0</sup> is the initial wavenumber vector.

For the simulation configuration, we consider a square shallow basin of length *L* = 5120 km and depth *H* = 100 m at mid-latitude (*f*<sub>0</sub> = 10<sup>−4</sup> s<sup>−1</sup>) and a large-scale wave with *k* = [3*Δk*, 0*Δℓ*]<sup>*T*</sup>, where *Δk* = *Δℓ* = 2*π*/*L*. Each random system, under either constant (25) or homogeneous (16) noise, has been simulated over 5 years with 100 realizations. We remark that here we do not use the smoothness approximation (35) in the homogeneous case. The noise amplitude is fixed as *α* = √*τ* *g*/*f*<sub>0</sub>, where *τ* = *Δx*/√(*gH*) with *g* = 9.81 m s<sup>−2</sup>, *Δx* = 40 km and *Δt* = 5*τ*.
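As a quick sanity check of the orders of magnitude involved, the configuration above can be turned into numbers. The sketch below only uses the values quoted in the text; the variable names are arbitrary, and the number of time steps assumes 365-day years, which the text does not specify.

```python
import numpy as np

L_dom, H, f0, g = 5.12e6, 100.0, 1e-4, 9.81   # m, m, s^-1, m s^-2
dx = 40e3                                      # grid spacing, m

n_grid = round(L_dom / dx)                     # grid points per direction
dk = 2 * np.pi / L_dom                         # wavenumber increment, rad/m
tau = dx / np.sqrt(g * H)                      # gravity-wave crossing time of one cell, s
dt = 5 * tau                                   # time step, s

k0 = np.array([3 * dk, 0.0])                   # initial wavenumber vector
omega = np.sqrt(g * H * (k0 @ k0) + f0**2)     # Poincare frequency from (44), rad/s
period_hours = 2 * np.pi / omega / 3600.0

n_steps = int(5 * 365 * 86400 / dt)            # assumption: 365-day years

print(n_grid, tau, dt, omega, period_hours, n_steps)
```

With these values the wave frequency is of the same order as the Coriolis frequency, and a 5-year run requires a few tens of thousands of time steps per realization.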

Figure 1 illustrates the evolution of the path-wise surface elevation *η* and of the ensemble-mean solution $\mathbb{E}[\eta]$ under a constant noise whose direction (with *k<sub>σ</sub>* = [4*Δk*, 6*Δℓ*]<sup>*T*</sup>) differs from that of the wave (*k*<sub>0</sub>). In this case, the path-wise solution preserves the magnitude of the initial monochromatic wave while the ensemble-mean wave is damped over time.

Figure 2 demonstrates the results obtained with the homogeneous noise (with the same *k<sup>σ</sup>* as in the previous case). In that case, the path-wise wave is dispersive (scattering effect) whereas the mean wave is dissipative.

The solutions obtained with the homogeneous noise are rougher than those obtained with a constant noise. The damping of the mean solution is visually similar in both cases.

**Fig. 1** Illustration of the path-wise surface elevation (top) and its ensemble mean (bottom) with constant noise (with *k<sub>σ</sub>* × *k* ≠ 0) at different times (by columns)

**Fig. 2** Illustration of the path-wise surface elevation (top) and its ensemble mean (bottom) with a homogeneous noise (with *k<sub>σ</sub>* × *k* ≠ 0) at different times (by columns)

We then diagnosed an energy decomposition with respect to the ensemble of runs in Fig. 3. This decomposition consists of the (ensemble) mean of energy $\mathbb{E}\big[\int_{D} \tfrac{1}{2}(H|\mathbf{u}|^2 + g\eta^2)\,\mathrm{d}\mathbf{x}\big]$, the energy of the (ensemble) mean $\int_{D} \tfrac{1}{2}\big(H|\mathbb{E}[\mathbf{u}]|^2 + g\,\mathbb{E}[\eta]^2\big)\,\mathrm{d}\mathbf{x}$ and the energy of the "eddy" $\mathbb{E}\big[\int_{D} \tfrac{1}{2}\big(H|\mathbf{u} - \mathbb{E}[\mathbf{u}]|^2 + g(\eta - \mathbb{E}[\eta])^2\big)\,\mathrm{d}\mathbf{x}\big]$. Figure 3 shows that, for both noise models, the energy of the mean waves quickly dissipates in time while the energy of the eddy waves increases at the same rate and continuously backscatters the variance to the ensemble. This mechanism ensures that the mean of the random energy is preserved in time. This corresponds well to the characteristic (2d), (14d) of the proposed stochastic transport model.
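The energy decomposition described above can be computed directly from an ensemble of fields. The sketch below is a minimal illustration, assuming the ensemble is stored as arrays of shape `(n_members, ny, nx)` on a uniform grid; the function name and the `cell_area` argument are hypothetical. With sample means, the mean of the energy equals the energy of the mean plus the eddy energy, which is the identity underlying the exchange between the two components.

```python
import numpy as np

def energy_decomposition(u, v, eta, H=100.0, g=9.81, cell_area=1.0):
    """Return (mean of energy, energy of mean, eddy energy) for an ensemble.

    u, v, eta: arrays of shape (n_members, ny, nx).
    """
    def integral(uu, vv, ee):
        # Discrete version of the domain integral of 0.5 * (H|u|^2 + g*eta^2)
        return 0.5 * np.sum(H * (uu**2 + vv**2) + g * ee**2, axis=(-2, -1)) * cell_area

    u_mean, v_mean, eta_mean = u.mean(axis=0), v.mean(axis=0), eta.mean(axis=0)
    mean_of_energy = integral(u, v, eta).mean()
    energy_of_mean = integral(u_mean, v_mean, eta_mean)
    eddy_energy = integral(u - u_mean, v - v_mean, eta - eta_mean).mean()
    return mean_of_energy, energy_of_mean, eddy_energy

# Synthetic ensemble, used only to exercise the function
rng = np.random.default_rng(0)
u, v, eta = rng.normal(size=(3, 20, 8, 8))
total, mean_part, eddy_part = energy_decomposition(u, v, eta)
print(total, mean_part + eddy_part)
```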

**Fig. 3** (Top) Time evolution of the energy decomposition (w.r.t. the ensemble) with constant (left) and homogeneous (right) noises; (Bottom) Comparison of the dissipation of the ensemble mean for different noise scales

Figure 3 also illustrates the dissipation rate of the energy of the mean waves for different scales of the noise (equivalently, different angles between the directions of the noise and of the wave). The numerical results confirm our analyses in Sect. 4: the larger the angle *θ*, or the smaller the noise's scale *k<sub>σ</sub>*, the faster the mean waves are damped in both cases.

**Fig. 4** Illustrations under multi-scale noise. From left to right: path-wise wave solution after 5 years of simulation, the corresponding ensemble mean, and evolution of the energy decomposition

Furthermore, the propagation of the monochromatic wave under a multi-scale noise model has been numerically tested. In particular, we consider a band of wavenumbers for the noise, $\mathbf{k}_\sigma = \{\mathbf{k}_j\}_{j=m,\ldots,M}$ with $|\mathbf{k}_m| < |\mathbf{k}_j| < |\mathbf{k}_M|,\ \forall\, m<j<M$, satisfying the −3 spectrum power law: $\alpha_j^2 |\mathbf{k}_j^\perp|^2 = \alpha_m^2 |\mathbf{k}_m^\perp|^2 (|\mathbf{k}_j|/|\mathbf{k}_m|)^{-3}$, i.e. $\alpha_j = \alpha_m (|\mathbf{k}_j|/|\mathbf{k}_m|)^{-5/2}$. Figure 4 shows the results with $\mathbf{k}_m = [5\Delta k, 5\Delta\ell]^T$, $\alpha_m = \alpha$ (as mentioned above) and 10 wavenumbers in total with uniform step *Δk*. After 5 years of simulation, the path-wise wave is dispersive while the mean wave is dissipative, and both of them are more irregular than the monochromatic noise solutions (see Fig. 2). We also reach a consistent conclusion for the ensemble decomposition of the total energy. Moreover, the conversion from energy of the mean to energy of the eddy is in this case faster and more efficient than with the monochromatic noise model (see Fig. 3).
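The amplitude rule of the multi-scale noise can be written in a few lines. The sketch below is illustrative only: the text specifies the first wavenumber, the number of modes and a uniform step *Δk*, but not the direction along which the band is spanned, so the diagonal layout used here is an assumption, and `alpha_m` is a placeholder value.

```python
import numpy as np

dk = 2 * np.pi / 5.12e6
alpha_m = 1.0          # placeholder for the amplitude of the first mode
n_modes = 10

# Assumption: the band is spanned along the diagonal, k_j = (5 + j) * dk * (1, 1)
k_band = np.array([[5 + j, 5 + j] for j in range(n_modes)], dtype=float) * dk
k_norm = np.linalg.norm(k_band, axis=1)

# alpha_j = alpha_m * (|k_j| / |k_m|)^(-5/2)
alpha = alpha_m * (k_norm / k_norm[0]) ** (-2.5)

# Resulting spectrum alpha_j^2 |k_j^perp|^2, using |k^perp| = |k|
spectrum = alpha**2 * k_norm**2
slope = np.polyfit(np.log(k_norm), np.log(spectrum), 1)[0]
print(alpha, slope)
```

The fitted log-log slope of the spectrum recovers the −3 power law that the amplitudes are designed to satisfy.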

## **5 Shallow Water PV Dynamics and Geostrophic Adjustment**

Geostrophic adjustment is the process by which the flow and the pressure field tend to mutually adjust at large scale under the influence of the Earth's rotation. An obvious manifestation of this adjustment is the isobaric wind field and the geostrophic current in the ocean. In the classical case of the deterministic linear shallow water model, geostrophic adjustment can be explained in terms of a variational formulation as the state of minimum energy corresponding to a given constant value of the potential vorticity (PV). Within the LU modelling, geostrophic adjustment can be explained from the nonlinear system. To that end, let us derive the PV equation of the stochastic system (2). Taking first the curl of the momentum equation (2a) under the space-time invariant Coriolis parameter *f*<sub>0</sub>, we have

$$\begin{aligned}
\mathbb{D}_t(\xi+f_0) &= -(\xi+f_0)\nabla \cdot \mathbf{u} \,\mathrm{d}t + \left[\nabla^{\perp}\mathbf{u}, \nabla \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t\right]_F \\
&\quad + \frac{1}{2}\sum_{i,j=1,2} \partial_{x_i}\partial_{x_j}\Big(\nabla a_{ij} \times \mathbf{u}\Big)\mathrm{d}t,
\end{aligned}\tag{45a}$$

where $\xi = \nabla\times\mathbf{u} = \partial_x v - \partial_y u$ denotes the relative vorticity and $[A, B]_F = \operatorname{tr}(A^T B)$ stands for the Frobenius inner product of two matrices *A* and *B*. Applying next the chain rule [24] to the stochastic transport of height (2b), one obtains

$$\mathbb{D}_t h^{-1} = h^{-1} \nabla \cdot \mathbf{u} \,\mathrm{d}t.\tag{45b}$$

Let us recall the product rules of two stochastic transport equations in the following. In particular, if two arbitrary tracers *θ* and *ζ* satisfy

$$
\mathbb{D}_t \theta = \Theta \,\mathrm{d}t, \quad \mathbb{D}_t \zeta = Z \,\mathrm{d}t + \Sigma \,\mathrm{d}\boldsymbol{B}_t, \tag{46a}
$$

where *Θ*, *Z* are time-differentiable forcing terms and *Σ* is a martingale forcing component, then according to the Itô integration-by-parts formula [22], we have

$$\mathbb{D}_t(\theta\zeta) = \theta\,\mathbb{D}_t\zeta + \zeta\,\mathbb{D}_t\theta - \mathrm{d} \Big\langle \int_{0}^{\cdot} \boldsymbol{\sigma}_{s}\mathrm{d}\boldsymbol{B}_{s} \cdot \nabla\theta, \int_{0}^{\cdot} \Sigma\,\mathrm{d}\boldsymbol{B}_{s} \Big\rangle_t,\tag{46b}$$

where the last bracket term denotes the quadratic covariation of two martingales [22]. Applying this rule to $\theta = h^{-1}$ and $\zeta = \xi + f_0$ associated with (45), one deduces the stochastic evolution of the PV, $q = (\xi + f_0)/h$, namely

$$\begin{aligned}
\mathbb{D}_t q &= h^{-1} \Big[\nabla^{\perp} \mathbf{u}, \nabla \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t \Big]_F + \frac{1}{2} h^{-1} \sum_{i,j=1,2} \partial_{x_i} \partial_{x_j} \Big(\nabla a_{ij} \times \mathbf{u} \Big) \mathrm{d}t \\
&\quad - \mathrm{d} \Big\langle \int_{0}^{\cdot} \boldsymbol{\sigma}_{s} \mathrm{d}\boldsymbol{B}_{s} \cdot \nabla h^{-1}, \int_{0}^{\cdot} \left[\nabla^{\perp} \mathbf{u}, \nabla \boldsymbol{\sigma}_{s} \mathrm{d}\boldsymbol{B}_{s} \right]_F \Big\rangle_t.
\end{aligned} \tag{47a}$$
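The intermediate step behind this PV equation can be spelled out as follows. This is a sketch of the bookkeeping only, obtained by inserting (45a) and (45b) into the product rule (46b) with $\theta = h^{-1}$ (no martingale forcing) and $\zeta = \xi + f_0$ (martingale forcing $[\nabla^{\perp}\mathbf{u}, \nabla\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t]_F$); the summation indices $i, j$ are read as in (45a).

$$\begin{aligned}
h^{-1}\,\mathbb{D}_t(\xi + f_0) &= -q\,\nabla\cdot\mathbf{u}\,\mathrm{d}t + h^{-1}\big[\nabla^{\perp}\mathbf{u}, \nabla\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t\big]_F + \frac{1}{2}h^{-1}\sum_{i,j=1,2}\partial_{x_i}\partial_{x_j}\big(\nabla a_{ij}\times\mathbf{u}\big)\,\mathrm{d}t, \\
(\xi + f_0)\,\mathbb{D}_t h^{-1} &= q\,\nabla\cdot\mathbf{u}\,\mathrm{d}t.
\end{aligned}$$

Adding the two lines, the divergence terms $\mp q\,\nabla\cdot\mathbf{u}\,\mathrm{d}t$ cancel exactly, as in the deterministic derivation of PV conservation, and the remaining terms together with the covariation bracket of (46b) give the right-hand side of the PV equation.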

In contrast to the deterministic shallow water case, the PV is not transported by the stochastic flow in general, and source/sink terms appear on the right-hand side of this PV equation. These terms reflect the action of the small scales on the non-conservation of PV. In the deterministic context, PV is very sensitive to turbulence and subgrid modelling [3]. The same mechanism is at play here. We can nevertheless explore the conditions under which PV remains conserved in the setting of a stochastic modelling of the small-scale effects. The first and last terms cancel if the large-scale velocity field and the small-scale random component align with each other up to a uniform vector field. The second term trivially cancels if the random field is homogeneous in space (as in that case *a* becomes a constant matrix). Under these two conditions (alignment and homogeneity), PV is path-wise conserved. For homogeneous (incompressible) noise, the expectation can be written in flux form and the mean PV is globally conserved. For a homogeneous field, the PV equation then reduces to

$$\mathbb{D}_t q = h^{-1}\Big[\nabla^{\perp}\mathbf{u}, \nabla\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t\Big]_F - \mathrm{d}\Big\langle\int_{0}^{\cdot}\boldsymbol{\sigma}_s\mathrm{d}\boldsymbol{B}_{s} \cdot \nabla h^{-1}, \int_{0}^{\cdot}\left[\nabla^{\perp}\mathbf{u}, \nabla\boldsymbol{\sigma}_s\mathrm{d}\boldsymbol{B}_{s}\right]_F\Big\rangle_t. \tag{47b}$$

The above equation, combined with the previous results on the stochastic linear dynamics, enables us to revisit the mechanism of geostrophic adjustment in the presence of a forcing. The corresponding global picture is as follows. Let us first consider random fluctuations in the ocean generated by wind forcing at large scales. Then, due to the wavelength mechanism described in the previous section, all the waves that are not aligned with this forcing are smoothed out and eventually annihilated following an exponential decay. The only waves that eventually remain are aligned with the wind forcing, which at large scale corresponds essentially to near-inertial waves, and the PV dynamics then corresponds to a pure transport. Thus, the system tends to relax to a balanced state as in the deterministic case. Conversely, we can devise a similar picture in the atmospheric context, in a configuration where the ocean plays the role of a noise for the atmosphere. The atmospheric waves will evolve toward a near-inertial wave field aligned with the noise, following the exact same process. As a result, merging these two perspectives provides an interesting ocean/atmosphere coupling mechanism of auto-adjustment. At small scales, things are more complicated, as the ISD has to be taken into account with an isotropization process that is likely less obvious. As a result, the forcing terms in the PV equation remain. Furthermore, in that case the smooth spatial structure assumption of the noise imposes a very low noise amplitude. For the study of strong small-scale forcing, this assumption as well as the wave ansatz associated with it have to be revisited. This will be the objective of future work.

## **6 Conclusions**

This stochastic extension of the shallow water equations highlights the behavior of the LU setting for the large-scale representation of flow dynamics. In contrast to classical eddy viscosity models, which introduce a similar damping term on the wave forms, here the waves that are sustained by the noise term keep their full classical expressions. This provides a simple mechanism for wave selection and an interesting simplified model explaining the emergence of near-inertial waves in ocean-atmosphere systems. In the LU representation of the linearized shallow water system, the ensemble-mean waves that are not excited by a noise term with the same wavelength vanish exponentially fast, whereas the others correspond to the classical deterministic wave solutions. The decay rate depends on the noise variance and on the wavenumber (in a quadratic way). The decay is therefore all the faster for waves with small wavelength. The noise hence acts as a Dirac comb on the mean wave field.

**Acknowledgments** The authors acknowledge the support of the ERC EU project 856408-STUOD. The authors would like to thank Louis Thiry for his helpful comments, remarks and stimulating discussions. We also warmly thank Baylor Fox-Kemper and Oana Lang for fruitful discussions and advice on this work. The code to reproduce the numerical results is available at https://github.com/matlong/sw-wave-lu.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Analysis of Sea Surface Temperature Variability Using Machine Learning**

**Said Ouala, Bertrand Chapron, Fabrice Collard, Lucile Gaultier, and Ronan Fablet**

**Abstract** Sea surface temperature (SST) is a critical factor in the global climate system and plays a key role in many marine processes. Understanding the variability of SST is therefore important for a range of applications, including weather and climate prediction, ocean circulation modeling, and marine resource management. In this study, we use machine learning techniques to analyze SST anomaly (SSTA) data from the Mediterranean Sea over a period of 33 years. The objective is to best explain the temporal variability of the SSTA extremes. These extremes are revealed to be well explained through a non-linear interaction between multi-scale processes. The results help to unveil the factors influencing SSTA extremes and support the development of more accurate prediction models.

**Keywords** Sea surface temperature · Machine learning · Stochastic models · Extremes

# **1 Introduction**

Sea surface temperature (SST) is a critical parameter in the global climate system [1, 2] and plays a vital role in many marine processes, including ocean circulation, evaporation, and the exchange of heat and moisture between the ocean and atmosphere [3, 4, 5].

In recent years, particular attention has been paid to marine heat waves, in which SST largely exceeds the local expected average values [6, 7]. Extreme SST can cause coral bleaching [8, 9], with cascading effects on the entire ecosystem. Additionally, localized events affect the amount of atmospheric moisture available, which impacts precipitation patterns and the likelihood of drought or flooding in certain regions [10]. Better uncovering the factors contributing to these extreme events is therefore of great importance to help predict and mitigate their impacts.

S. Ouala (✉) · R. Fablet
IMT Atlantique, Lab-STICC, Brest, France
e-mail: said.ouala@imt-atlantique.fr

B. Chapron
Ifremer, LOPS, Plouzané, France

F. Collard · L. Gaultier
ODL, Locmaria-Plouzané, France

The SST dynamics compound many processes that interact across a continuum of spatio-temporal scales. A first-order approximation of such a system was initially introduced by [11, 12]. Hasselmann pioneered a two-scale stochastic decomposition to represent the interactions between slow and fast variables. In this study, we focus on SSTA data collected in the Mediterranean Sea, and examine the potential of machine learning techniques to derive relevant dynamical models. We focus on the seasonal modulation of the SSTA and aim to unveil the factors influencing the temporal variability of SSTA extremes. The proposed analysis builds on Hasselmann's assumption that the variability of the SSTA can be decomposed into slow and fast components. The slow variables mostly follow the seasonal cycle, while the fast variables are linked to rapid processes, e.g. the wind variability. We thus approximate the probability density function of the SSTA data using a stochastic differential equation in which the drift function represents the seasonal cycle and the diffusion function represents the envelope of the fast SSTA response.

The paper is organized as follows. We start by introducing the general underlying state space model of the SST anomaly. Rather than directly presenting the stochastic model, we first assume that an underlying deterministic ordinary differential equation (ODE) can represent the non-periodic variability of the SSTA. Considering a phase space reconstruction setting, we use the neural embedding of dynamical systems (NbedDyn) framework [13, 14] for this task. We then discuss the limitations of such a representation, and present the stochastic model. We conclude by summarizing our findings and potential future directions.

# **2 Method**

Let us assume the following state-space model

$$
\dot{\mathbf{z}}_t = f(\mathbf{z}_t) \tag{1}
$$

$$\mathbf{x}_t = \mathcal{H}(\mathbf{z}_t) \tag{2}$$

where $t \in [0, +\infty)$ is time. The variables $\mathbf{z}_t \in \mathbb{R}^s$ and $\mathbf{x}_t \in \mathbb{R}^n$ represent the state variables and the SST anomaly observations, respectively. $f$ and $\mathcal{H}$ are the dynamical and observation operators. The impact of noise on the dynamics and observation models is omitted for simplicity of presentation.

## *2.1 Deterministic Model Hypothesis*

**The NbedDyn Framework** If we assume that $\mathbf{z}_t$ is asymptotic to a limit set $L \subset \mathbb{R}^s$ and that the observation model is not an embedding [15], the NbedDyn model allows one to jointly derive a geometric reconstruction of the unseen phase space from partial observations and a corresponding dynamical model. For any given operator $\mathcal{H}$ of a deterministic dynamical system, Takens' theorem [16] guarantees that such an augmented space exists. However, instead of using a delay embedding, NbedDyn defines a $d_E$-dimensional augmented space with states $\mathbf{u}_t \in \mathbb{R}^{d_E}$ as follows:

$$\mathbf{u}\_t^T = [\mathbf{x}\_t^T, \mathbf{y}\_t^T] \tag{3}$$

where $\mathbf{y}_t \in \mathbb{R}^{d_E - n}$ are referred to as the latent states and $T$ denotes the matrix transpose. The latent states account for the unobserved components of the true state $\mathbf{z}_t$.

The augmented state **u***<sup>t</sup>* is assumed to satisfy the following state space model:

$$
\dot{\mathbf{u}}_t = f_{\theta_1}(\mathbf{u}_t) \tag{4}
$$

$$\mathbf{x}_t = \mathbf{G}\mathbf{u}_t \tag{5}$$

where $\mathbf{G}$ is a projection matrix that satisfies $\mathbf{x}_t = \mathbf{G}\mathbf{u}_t$. The dynamical operator $f_{\theta_1}$ belongs to a given family of neural network operators parameterized by a parameter vector $\theta_1$. In this work, we follow [14] and use a linear quadratic parameterization of $f_{\theta_1}$. This particular parameterization allows us to guarantee boundedness of the ODE (4) using the Schlegel boundedness theorem [17]. A linear quadratic ODE model can be written as follows:

$$\dot{\mathbf{u}}_t = f_{\theta_1}(\mathbf{u}_t) = \mathbf{c} + \mathbf{L}\mathbf{u}_t + [\mathbf{u}_t^T\mathbf{Q}^{(1)}\mathbf{u}_t, \dots, \mathbf{u}_t^T\mathbf{Q}^{(d_E)}\mathbf{u}_t]^T \tag{6}$$

where $\mathbf{c} \in \mathbb{R}^{d_E}$, $\mathbf{L} \in \mathbb{R}^{d_E \times d_E}$ and $\mathbf{Q}^{(i)} = [q_{i,j,k}]_{j,k=1}^{d_E}$, $i = 1, \dots, d_E$. The above approximate model is shifted according to $\bar{\mathbf{u}}_t = \mathbf{u}_t - \mathbf{m}$ with $\mathbf{m} \in \mathbb{R}^{d_E}$. The approximate dynamical equation of the shifted state can be written as:

$$\dot{\bar{\mathbf{u}}}_t = \mathbf{d} + \mathbf{A}\bar{\mathbf{u}}_t + [\bar{\mathbf{u}}_t^T\mathbf{Q}^{(1)}\bar{\mathbf{u}}_t, \dots, \bar{\mathbf{u}}_t^T\mathbf{Q}^{(d_E)}\bar{\mathbf{u}}_t]^T \tag{7}$$

with

$$\mathbf{d} = \mathbf{c} + \mathbf{L}\mathbf{m} + \left[\mathbf{m}^T \mathbf{Q}^{(1)} \mathbf{m}, \dots, \mathbf{m}^T \mathbf{Q}^{(d_E)} \mathbf{m}\right]^T \tag{8}$$

and

$$\mathbf{A} = \left( a_{ij} \right) = \left( l_{ij} + \sum_{k=1}^{d_E} (q_{i,j,k} + q_{i,k,j}) m_k \right) \tag{9}$$
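The shift from (6) to (7) to (9) can be checked numerically. The following is a minimal sketch, assuming the quadratic coefficients are stored as an array `Q` of shape `(d_E, d_E, d_E)` with `Q[i]` playing the role of $\mathbf{Q}^{(i)}$; the function names and random test values are illustrative and not taken from the NbedDyn code.

```python
import numpy as np

def lq_rhs(u, c, L, Q):
    """Right-hand side c + L u + [u^T Q^(i) u]_i of the linear-quadratic ODE (6)."""
    quad = np.einsum("j,ijk,k->i", u, Q, u)
    return c + L @ u + quad

def shifted_coefficients(c, L, Q, m):
    """Constant term d (8) and linear term A (9) of the model shifted by m."""
    d = lq_rhs(m, c, L, Q)
    # a_ij = l_ij + sum_k (q_ijk + q_ikj) m_k
    A = L + np.einsum("ijk,k->ij", Q, m) + np.einsum("ikj,k->ij", Q, m)
    return d, A

# Hypothetical random coefficients, used only to test the algebra.
rng = np.random.default_rng(0)
dE = 3
c, m, u = rng.normal(size=(3, dE))
L = rng.normal(size=(dE, dE))
Q = rng.normal(size=(dE, dE, dE))

d, A = shifted_coefficients(c, L, Q, m)
u_bar = u - m
# Since m is constant, the shifted state has the same time derivative as u.
lhs = lq_rhs(u, c, L, Q)
rhs = d + A @ u_bar + np.einsum("j,ijk,k->i", u_bar, Q, u_bar)
```

Both sides agree to rounding error, which confirms that $\mathbf{d}$ is simply the original right-hand side evaluated at $\mathbf{m}$ and that $\mathbf{A}$ collects the two cross terms of the quadratic form.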

Given an observation time series of size $N + 1$, $\{\mathbf{x}_{t_0}, \dots, \mathbf{x}_{t_N}\}$, training amounts to jointly learning the model parameters $\theta_1 = \{\mathbf{c}, \mathbf{L}, \mathbf{Q}^{(1)}, \mathbf{Q}^{(2)}, \dots, \mathbf{Q}^{(d_E)}, \mathbf{m}\}$ and the latent states $\mathbf{y}_t$ according to the following constrained optimization problem

$$\begin{aligned} \hat{\theta}_1, \{\hat{\mathbf{y}}_{t_i}\}_{i=0}^{N-1} = \arg\min_{\theta_1, \{\mathbf{y}_{t_i}\}} &\sum_{i=1}^{N} \left\| \mathbf{x}_{t_i} - \mathbf{G}\Phi_{\theta_1,t_i}\left(\mathbf{u}_{t_{i-1}}\right) \right\|^2 \\ &+ \lambda_1 \left\| \mathbf{u}_{t_i} - \Phi_{\theta_1,t_i}(\mathbf{u}_{t_{i-1}}) \right\|^2 \\ &+ \lambda_2 \mathcal{C}_1 \\ &+ \lambda_3 \mathcal{C}_2 \end{aligned} \tag{10}$$

where $\Phi_{\theta_1,t}(\mathbf{u}_{t-1}) = \mathbf{u}_{t-1} + \int_{t-1}^{t} f_{\theta_1}(\mathbf{u}_w)\,dw$ is the flow of the ODE (6) (in our work, this flow is approximated using a fourth-order Runge–Kutta scheme), $\mathcal{C}_1 = \sum_{i,j,k=1}^{d_E} \left\| q_{i,j,k} + q_{i,k,j} + q_{j,i,k} + q_{j,k,i} + q_{k,i,j} + q_{k,j,i} \right\|^2$ and $\mathcal{C}_2 = \sum_{i=1}^{d_E} \mathrm{Max}(\alpha_i, 0)/\mathrm{Max}(\alpha_i + 1, 0)$, where $\alpha_i$, $i = 1, \dots, d_E$, are the eigenvalues of the matrix $\mathbf{A}_s = \frac{1}{2}(\mathbf{A} + \mathbf{A}^T)$. The variables $\lambda_{1,2,3}$ are constant weighting parameters. The first constraint, $\mathcal{C}_1$, stems from the energy-preserving condition of the quadratic non-linearity. It forces the contribution of the quadratic terms of $f_{\theta_1}$ to the fluctuation energy to sum up to zero. The second constraint, $\mathcal{C}_2$, ensures that the eigenvalues of $\mathbf{A}_s$ are negative. Satisfying these constraints guarantees that the model $f_{\theta_1}$ is bounded through the existence of a monotonically attracting trapping region that includes the limit-set revealed by the minimization of the forecasting loss. Similarly to the Takens delay embedding technique, the sequence:

$$\boldsymbol{R}_{t_0, t_N} = \{ \hat{\mathbf{u}}_{t_i}^T = [\mathbf{x}_{t_i}^T, \hat{\mathbf{y}}_{t_i}^T] \text{ with } t_i = t_0, \dots, t_N \} \tag{11}$$

represents a geometric reconstruction of the phase space. In addition to this reconstruction, the NbedDyn model can be used to forecast new observations by determining an initial condition of the unobserved component $\mathbf{y}_t$ and performing a numerical integration of the ODE model (6). We infer the initial condition by minimizing an objective function similar to (10), but only with respect to the latent states $\mathbf{y}_t$. This minimization can be seen as a variational data assimilation problem, with partial observations of the state-space variables and known dynamical and observation models [18].
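The two numerical ingredients used above, a Runge–Kutta approximation of the flow and the eigenvalues of the symmetric part $\mathbf{A}_s$, can be sketched as follows. This is an illustration under stated assumptions: the damped rotation is a toy stand-in for the learnt model, and `positive_eigenvalue_penalty` is a simplified surrogate that only measures the positive eigenvalues of $\mathbf{A}_s$, not the exact expression of $\mathcal{C}_2$.

```python
import numpy as np

def rk4_step(f, u, dt):
    """One classical fourth-order Runge-Kutta step for du/dt = f(u)."""
    k1 = f(u)
    k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2)
    k4 = f(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def symmetric_part_eigenvalues(A):
    """Eigenvalues alpha_i of A_s = (A + A^T) / 2, in ascending order."""
    return np.linalg.eigvalsh(0.5 * (A + A.T))

def positive_eigenvalue_penalty(A):
    """Simplified stand-in for C_2: total mass of the positive eigenvalues of A_s."""
    return float(np.sum(np.maximum(symmetric_part_eigenvalues(A), 0.0)))

# Toy check on a damped rotation, whose symmetric part is -0.1 * I.
A = np.array([[-0.1, 1.0], [-1.0, -0.1]])
f = lambda u: A @ u
u = np.array([1.0, 0.0])
dt, n_steps = 0.01, 1000
for _ in range(n_steps):
    u = rk4_step(f, u, dt)

# Closed-form solution of the toy problem at t = n_steps * dt = 10.
u_exact = np.exp(-1.0) * np.array([np.cos(10.0), -np.sin(10.0)])
```

For this toy matrix all eigenvalues of $\mathbf{A}_s$ are negative, so the surrogate penalty vanishes and trajectories decay towards the origin.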

**Related Works** Related state-of-the-art techniques mainly rely on the reconstruction of a phase space using delay embedding [16]. This includes both traditional parametric and non-parametric modeling techniques [19, 20] as well as recurrent neural networks (RNNs). The latter family of methods includes both simple RNN parameterizations of dynamical systems and latent space inference techniques that are built on an approximation of a posterior distribution that requires the parameterization of a delay embedding [21, 22, 23].

The interest of the NbedDyn framework in contrast to delay embedding based approaches resides in the fact that we do not exploit either a delay embedding or an explicit modeling of the inference model (i.e., the reconstruction of the latent states given the observed time series). As such, our scheme only involves the selection of the class of ODEs of interest. This model reduces the complexity of the overall scheme to the complexity of the ODE representation and guarantees the consistency of the reconstructed latent states w.r.t. the learnt ODE.

## *2.2 Stochastic Model Hypothesis: The Stochastic NbedDyn*

When using phase space reconstruction techniques, one should not forget the assumptions on which this theory is built. For any embedding to work, we assume that the dynamical model in (1) exists and can be represented by an ordinary differential equation [15]. For several realistic applications, this ODE may not exist or can have an extremely large dimension. In geoscience, for instance, the dimension of a state space variable can reach $s \approx O(10^9)$. In these situations, reconstructing such a high-dimensional phase space becomes significantly more challenging. In practice, the model returned by any embedding technique can be complemented by an appropriate closure. The form of this closure term can be deterministic, using for example the framework of [24], or stochastic, through an appropriate calibration of a noise forcing.

When considering SST anomaly data, after calibration of the neural embedding model, an unpredictable, high frequency residual remains. Based on Hasselmann's idea, we assume this residual component represents the effect of fast-scale processes, e.g. passages of atmospheric and oceanic eddies. To first order, it can be represented as a modulated white noise. Indeed, this residual, shown in Fig. 3, exhibits correlations with the slow-scale SST anomaly data.

To model stochastic SST anomalies, the deterministic NbedDyn model (6) described above is first optimized and then complemented with a stochastic forcing as follows:

$$\begin{cases} \dot{\mathbf{u}}_t = f_{\theta_1}(\mathbf{u}_t) + g_{\theta_2}(\mathbf{u}_t)\boldsymbol{\xi}_t \\ \mathbf{x}_t = \mathbf{G}\mathbf{u}_t \end{cases} \tag{12}$$

where $\boldsymbol{\xi}_t$ is a white noise. We derive the parameters of the model (12) as follows. Given an observation time series of size $N + 1$, $\{\mathbf{x}_{t_0}, \dots, \mathbf{x}_{t_N}\}$, similarly to the deterministic case, we optimize the diffusion parameters $\theta_2$ to minimize the forecast error of the observations. In addition to the diffusion parameters, we also reconstruct a noise realization $\boldsymbol{\xi}^{rec}$ that generates the observation process under (12). Overall, the optimization problem can be written as follows:

$$\begin{aligned} \hat{\theta}_2, \{\hat{\boldsymbol{\xi}}_{t_i}^{rec}\}_{i=0}^{N-1} &= \arg\min_{\theta_2, \boldsymbol{\xi}^{rec}} \sum_{t=1}^{T} \left\| \mathbf{x}_t - \mathbf{G}\Phi_{\theta,t}\left(\mathbf{u}_{t-1}, \boldsymbol{\xi}^{rec}\right) \right\|^2 \\ \text{subject to } & \begin{cases} \mathbf{u}_t = \Phi_{\theta,t}\left(\mathbf{u}_{t-1}, \boldsymbol{\xi}^{rec}\right) \\ \mathbf{G}\mathbf{u}_t = \mathbf{x}_t \\ \mathbf{R}_{\boldsymbol{\xi}^{rec}\boldsymbol{\xi}^{rec}}(\tau) = 0 \text{ for all } \tau \neq 0 \end{cases} \end{aligned} \tag{13}$$

where $\{\hat{\boldsymbol{\xi}}_{t_i}^{rec}\}_{i=0}^{N-1}$ is the noise realization that minimizes the objective function in (13) and

$$\Phi_{\theta,t}(\mathbf{u}_{t-1}, \boldsymbol{\xi}^{rec}) = \mathbf{u}_{t-1} + \int_{t-1}^{t} f_{\theta}(\mathbf{u}_w)\,dw + \int_{t-1}^{t} g_{\theta}(\mathbf{u}_w)\boldsymbol{\xi}^{rec}_w\,dw$$

is the solution of the stochastic model. This solution is approximated in this work using an Euler–Maruyama scheme, which makes the model converge to an Itô SDE.
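An Euler–Maruyama integration driven by a prescribed noise path can be sketched as follows. The drift `f`, the diffusion `g` and the scaling of the Brownian increments by `sqrt(dt)` are assumptions made for illustration; they stand in for $f_{\theta_1}$, $g_{\theta_2}$ and the reconstructed or sampled noise, and do not reproduce the trained model.

```python
import numpy as np

def euler_maruyama_path(f, g, u0, xi, dt):
    """Integrate du = f(u) dt + g(u) dW with increments dW_k = sqrt(dt) * xi[k].

    xi has shape (n_steps, d_noise); g(u) returns a (d_state, d_noise) matrix.
    """
    u = np.asarray(u0, dtype=float)
    path = [u]
    for xi_k in xi:
        u = u + f(u) * dt + g(u) @ (np.sqrt(dt) * xi_k)
        path.append(u)
    return np.stack(path)

# Hypothetical toy drift and diffusion, standing in for f_theta1 and g_theta2.
f = lambda u: -u
g = lambda u: 0.3 * np.eye(2)

rng = np.random.default_rng(1)
n_steps, dt = 200, 0.05
xi_samp = rng.standard_normal((n_steps, 2))   # a sampled white-noise path
u0 = np.array([1.0, -1.0])

stochastic = euler_maruyama_path(f, g, u0, xi_samp, dt)
deterministic = euler_maruyama_path(f, g, u0, np.zeros((n_steps, 2)), dt)
```

Passing the noise path explicitly mirrors the role of $\boldsymbol{\xi}^{rec}$ in (13): the same integrator can be run with an optimized path or with freshly sampled noise, and with a zero path it reduces to the explicit Euler scheme for the drift.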

In practice, we use the following regularized optimization problem:

$$\begin{split} \hat{\theta}\_{2}, \hat{\xi}^{rec} = \arg\min\_{\theta\_{2}, \xi^{rec}} \sum\_{t=1}^{T} \left\| \mathbf{x}\_{t} - \mathbf{G} \Phi\_{\theta, t} \left( \mathbf{u}\_{t-1}, \xi^{rec} \right) \right\|^{2} \\ &+ \lambda\_{4} \mathcal{C}\_{3} \\ &+ \lambda\_{5} \mathcal{C}\_{4} \\ &+ \lambda\_{6} \mathcal{C}\_{5} \end{split} \tag{14}$$

with $\mathcal{C}_3 = \|\mathbf{R}_{\boldsymbol{\xi}^{rec}\boldsymbol{\xi}^{rec}}(\tau)\|^2$, $\mathcal{C}_4 = \mathbf{Var}(\Phi_{\theta,t}(\mathbf{u}_{t-1}, \boldsymbol{\xi}^{samp}))$ and $\mathcal{C}_5 = \|\Phi_{\theta,t}(\mathbf{u}_{t-1}, \boldsymbol{\xi}^{rec}) - \mathbf{E}[\Phi_{\theta,t}(\mathbf{u}_{t-1}, \boldsymbol{\xi}^{samp})]\|^2$, where $\boldsymbol{\xi}^{samp}$ is a sampled Gaussian white noise. The variables $\lambda_{4,5,6}$ are constant weighting parameters. The first constraint, $\mathcal{C}_3$, makes the reconstructed noise path white. The second and third constraints, $\mathcal{C}_{4,5}$, ensure that the SDE generalizes to sampled white noises. Specifically, $\mathcal{C}_5$ makes the mean of an ensemble of trajectories generated from sampled white noise close to the trajectory generated from the reconstructed noise, and $\mathcal{C}_4$ reduces the spread of the ensemble around the trajectory simulated from the reconstructed noise.
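The three regularization terms can be illustrated with sample estimates. The sketch below assumes a scalar noise sequence, a finite number of lags for the autocorrelation and a single forecast step for the ensemble terms; these choices, and the function names, are illustrative and are not specified in the text.

```python
import numpy as np

def autocorrelation(xi, max_lag):
    """Normalized sample autocorrelation of a scalar sequence for lags 1..max_lag."""
    x = np.asarray(xi, dtype=float) - np.mean(xi)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-lag], x[lag:]) / denom
                     for lag in range(1, max_lag + 1)])

def whiteness_penalty(xi, max_lag=20):
    """C_3-like term: squared norm of the autocorrelation at non-zero lags."""
    return float(np.sum(autocorrelation(xi, max_lag) ** 2))

def ensemble_penalties(state_rec, states_samp):
    """C_4- and C_5-like terms for one forecast step.

    state_rec:   (d,) state obtained with the reconstructed noise.
    states_samp: (n_members, d) states obtained with sampled white noise.
    """
    spread = float(np.sum(np.var(states_samp, axis=0)))                      # C_4
    bias = float(np.sum((state_rec - np.mean(states_samp, axis=0)) ** 2))    # C_5
    return spread, bias

rng = np.random.default_rng(2)
white = rng.standard_normal(5000)
# A moving average of white noise is strongly correlated at short lags.
smooth = np.convolve(white, np.ones(10) / 10.0, mode="valid")

spread, bias = ensemble_penalties(np.array([1.0, 2.0]),
                                  np.array([[0.0, 2.0], [2.0, 2.0]]))
```

The whiteness term is close to zero for the white sequence and much larger for the smoothed one, which is the behaviour the constraint is meant to penalize.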

After optimization, we can couple the optimization problems (10) and (14) and jointly calibrate all the model parameters $\theta_1, \theta_2, \mathbf{y}_t, \boldsymbol{\xi}^{rec}$. This fine-tuning step is not essential but allows the drift and diffusion parameters of the model to adapt to each other.

# **3 Numerical Experiments**

## *3.1 Data*

The Sea Surface Temperature Anomaly (SSTA) data considered here are taken in the Ligurian Sea (Mediterranean Sea) at 8.6°E, 43.8°N. The anomalies are computed based on a yearly average of the annual 99th percentile of the SST reanalysis [25, 26]. The time series is made up of daily SST anomaly measurements from 1987 to 2019. We use the daily data from 1987 to 2014 as training data. Figure 3e illustrates the time series, which includes a seasonal cycle and non-periodic high temperature extremes in the summer.

## *3.2 Analysis of the Deterministic Model*

In this first experiment, we investigate if the deterministic neural embedding model is able to model the non-periodic variability of the SSTA extremes. For this purpose, we test 3 models with dimensions of the embedding ranging from 1 to 10.

**Analysis of the Embeddings** The choice of the dimension $d_E$ is linked to the number of independent variables that can be used to model the dynamics using, in our context, a bounded autonomous linear quadratic ODE. We start by studying the direct impact of $d_E$ on the performance of the NbedDyn model. Figure 1 shows the impact of $d_E$ on the training error between the observations and the model simulation. Other criteria could be used (please refer to [13, 14] for a more in-depth analysis of this parameter on other case studies), but overall, the training error provides a direct measure of the effectiveness of the embedding dimension in the training phase. The first evaluation of the training error reported in Fig. 1 corresponds to $d_E$ equal to the dimension of the measurements, i.e. $d_E = 1$. In this experiment, no latent states $\mathbf{y}_t$ are used and the embedding is $\mathbf{u}_t = \mathbf{x}_t$. In such situations, the ODE model cannot perfectly fit the data. Furthermore, at this particular value of $d_E$, the models are more likely to display a bad asymptotic behavior. As the dimension increases, the training error decreases, which confirms the better modeling abilities of the NbedDyn model. In the following, we study the models with $d_E = 3, 6$ and $10$.

**Fig. 1** Mean training error at convergence. We report the mean training error at convergence of the deterministic NbedDyn model for different dimensions $d_E$ of the embedding. This error is averaged over the training time series, and we highlight here both the mean and standard deviations

**Asymptotic Properties of the Models** We evaluate the asymptotic behavior of the deterministic models for $d_E = 3, 6$ and $10$. For this purpose, we run the NbedDyn models for a period of 27 years. The resulting simulation is visualized with respect to the reconstructed phase space (11) of the training data in Fig. 2. Overall, the models are only able to reproduce the seasonal cycle of the SST anomaly data. Other experiments (not shown here) suggest that even a further increase of the dimension of the embedding does not allow the model to capture the non-periodic behavior of the SST anomaly extremes.

**Fig. 2** Asymptotic solution of the deterministic models. We visualize the simulation of the deterministic models with respect to the reconstructed phase space. The models with $d_E = 3, 6$ and $10$ are given in panels (**a**), (**b**) and (**c**) respectively. For $d_E > 3$, we project the simulation and the reconstructed phase space into $\mathbb{R}^3$

**Analysis of the Training Residuals** To further investigate the asymptotic behavior of the deterministic models, we visualize in Fig. 3 the training residual $\{\mathbf{x}_{t_i} - \mathbf{G}\Phi_{\theta,t_i}(\mathbf{u}_{t_{i-1}})$ with $t_i = t_0, \dots, t_N\}$. When the dimension of the embedding increases above $d_E = 1$, a qualitative and quantitative change in the residual error is present. This is due to the fact that a two-dimensional ODE (in $\mathbb{R}^2$) is needed to capture the oscillations of the seasonal cycle of the SSTA. However, when the dimension of the embedding increases above 2, no clear qualitative or quantitative change is present. Furthermore, the residual has a much higher frequency content than the training SSTA data, which suggests that the errors are due to missing high-frequency scales that cannot be modeled using the standard deterministic model.

Based on these considerations, and motivated by Hasselmann's work on stochastic climate models with applications to SST anomaly data, we propose the stochastic NbedDyn model. In this framework, the SST residual of Fig. 3 is modeled as a stochastic forcing.

## *3.3 Analysis of the Stochastic Model*

We focus our analysis on the model with *dE* = 6. We add a stochastic forcing to the neural embedding model (the parameters of the diffusion function *gθ*<sup>2</sup> are optimized according to Appendix 1). Figure 4 shows the reconstructed phase space under this new model, as well as a model simulation of 27 years. When compared to the simulations of the deterministic model in Fig. 2, the stochastic model is able to cover the whole reconstructed phase space, including the regions with high temperature extremes. This shows that including the high frequency forcing is crucial for the model to capture the non-periodic behavior of the extremes.

These observations are further illustrated in the simulation example given in Fig. 5. The stochastic model is able to produce an ensemble of SST anomaly trajectories that reproduce the non-periodic variability of the extremes. Furthermore, the trajectories generated from a sampled white noise match the one of the reconstructed noise, which validates the proposed training procedure.

We can also discuss the marginal PDF of the stochastic model and compare it to the one computed from the data in Fig. 6. The PDF of the model is computed over a simulation of 109 years. Overall, the model is able to correctly model the high SST anomalies (in the summer), including the non-periodic extremes that form the tail of the distribution. The negative SST anomalies (in the winter) are not approximated as well as in the summer case. This is due to the fact that the model flattens the PDF in the winter by generating trajectories that have more spread (as highlighted in the ensemble prediction experiment in Fig. 5). We did not investigate this problem within the present study. However, we can make the PDF sharper in the winter by forcing the diffusion of the model to be closer to zero during this season.

**Fig. 3** Training residual and corresponding training data. We visualize the training residual for $d_E = 1$ in (**a**), $d_E = 3$ in (**b**), $d_E = 6$ in (**c**) and $d_E = 10$ in (**d**). Corresponding training data are given in (**e**)

**Fig. 5** Ensemble simulation of the stochastic model. We visualize an ensemble simulation of the stochastic model, both in the training (**a**) and test (**b**) sets. The simulation in the training set is carried out to compare trajectories computed from a sampled noise to the one issued from the reconstructed noise $\boldsymbol{\xi}^{rec}$

## **4 Conclusion**

In this work, we examined the potential of machine learning techniques to derive relevant dynamical models of sea surface temperature anomaly data in the Mediterranean Sea. We focused on the seasonal modulation of SSTA extremes and used a neural embedding model to reconstruct the phase space of SSTA data. We then added a stochastic forcing term to account for the missing high frequency variability. Our results contribute to the understanding of the factors influencing SSTA extremes and the development of more accurate prediction models. In particular, the analysis highlights the importance of including these fast high-frequency scales in the modeling of SSTA data.

One potential avenue for future work is to investigate the white noise hypothesis in comparison to other types of stochastic models, such as those based on colored noise or fractional Brownian motion. Furthermore, it would be interesting to apply the methodology to other regions and compare the results to evaluate the local impacts of the fast scales on the slower ones. Using this model as an emulator and studying its predictive skill with respect to standard ocean data assimilation based systems is also a promising perspective.

Finally, from a methodological point of view, this work highlights the importance of complementing models that are returned by an embedding methodology. Specifically, and as discussed in Sect. 2.2, in complex applications such as the ones in geosciences, the dimension of the underlying state variables is likely to be huge, and defining ways of complementing reduced order models through appropriate closure terms is mandatory in order to capture the variability of the data. Analysing the residual of the model fitting procedure is a natural way to define and optimize these closure terms.

## **Appendix 1: Training**

The trainable parameters of the deterministic NbedDyn models, i.e. the linear quadratic ODE and the initial conditions of the latent states, are initially sampled from a uniform distribution. The training of all models is carried out using the Adam optimizer. We use a varying learning rate (from 0.1 to 0.001) in all the experiments to speed up the training. Regarding the weighting parameters $\{\lambda_i\}_{i=1}^{3}$, we set $\lambda_1 = 1$ during all the training. The weights responsible for the boundedness constraints were set at higher values in the beginning of the training, i.e. $\lambda_2 = 100$ and $\lambda_3 = 1000$, and then reduced to $\lambda_2 = 1$ when $\lambda_3\mathcal{C}_2 = 0$. The training is stopped using cross-validation. Regarding the stochastic forcing, the parameters of the diffusion are initially sampled from a uniform distribution and the noise path $\boldsymbol{\xi}^{rec}$ is initialized from a standard normal distribution. We use a learning rate of 0.001 and set the weighting parameters $\{\lambda_i\}_{i=4}^{6}$ to one during all the training. We finish the training with a fine-tuning step, in which all the parameters of the model are optimized jointly with a learning rate of 0.0001.

## **Appendix 2: Parameterization of the Diffusion Function**

The diffusion function is a map $g_{\theta_2} : \mathbb{R}^{d_E} \to \mathbb{R}^{d_E \times d_N}$, where $d_N$ is the dimension of the noise. In our experiment, we parameterize this function using a fully connected neural network with 2 hidden layers, a sigmoid activation and 400 neurons per hidden layer. The dimension of the noise $d_N$ is set to the dimension of the state $\mathbf{u}_t$.
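The architecture described above can be sketched in plain NumPy. The initialization range, the zero biases and the linear output layer are assumptions made for this illustration; only the two sigmoid hidden layers of 400 units and the output shape $d_E \times d_N$ come from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_diffusion_net(d_state, d_noise, width=400, seed=0):
    """Uniformly initialized weights for a 2-hidden-layer network (range is an assumption)."""
    rng = np.random.default_rng(seed)
    sizes = [d_state, width, width, d_state * d_noise]
    return [(rng.uniform(-0.1, 0.1, size=(n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def diffusion(params, u, d_noise):
    """Evaluate g(u): R^{d_state} -> R^{d_state x d_noise}."""
    h = np.asarray(u, dtype=float)
    for W, b in params[:-1]:
        h = sigmoid(W @ h + b)                      # two sigmoid hidden layers
    W, b = params[-1]
    return (W @ h + b).reshape(len(u), d_noise)     # linear output layer (assumed)

dE = 6                      # embedding dimension used in Sect. 3.3
params = init_diffusion_net(dE, dE)   # d_N is set to the state dimension
G = diffusion(params, np.zeros(dE), dE)
```

The returned matrix multiplies a noise vector of dimension $d_N$, so with $d_N = d_E$ each state component can receive its own state-dependent combination of noise channels.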

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Data Assimilation: A Dynamic Homotopy-Based Coupling Approach**

**Sebastian Reich**

**Abstract** Homotopy approaches to Bayesian inference have found widespread use, especially if the Kullback–Leibler divergence between the prior and the posterior distribution is large. Here we extend one of these homotopy approaches to include an underlying stochastic diffusion process. The underlying mathematical problem is closely related to the Schrödinger bridge problem for given marginal distributions. We demonstrate that the proposed homotopy approach provides a computationally tractable approximation to the underlying bridge problem. In particular, our implementation builds upon the widely used ensemble Kalman filter methodology and extends it to Schrödinger bridge problems within the context of sequential data assimilation.

# **1 Introduction**

Sequential data assimilation interlaces dynamic processes with intermittent partial state observations in order to provide reliable state estimates and their uncertainties. A wide array of numerical methods have been proposed to tackle this problem computationally. Popular methods include sequential Monte Carlo, variational inference, and various ensemble Kalman filter formulations [5, 8]. These methods can encounter difficulties whenever the predictive distribution is incompatible with the incoming data; in other words, whenever the distance between the prior, as provided by the underlying stochastic process, and the data informed posterior distribution is large. It has long been realized that this challenge can be partially circumvented by altering the underlying stochastic process through appropriate control terms or modified proposal densities [5, 23, 16, 9]. Recently, the connection between devising such control terms and Schrödinger bridge problems [4] has been made explicit [16]. However, Schrödinger bridge problems are notoriously difficult to solve numerically. The key contribution of this paper is to provide a computationally tractable (sub-optimal) solution via a novel extension of established homotopy approaches [7, 14]. Similar to related homotopy approaches for purely Bayesian inference, the solution of certain partial differential equations (PDE) is required in order to find the desired control terms [14, 25]. In line with standard ensemble Kalman filter (EnKF) methodologies, we approximate these PDEs via a constant gain approximation [21]. There are also alternative approaches to sequential data assimilation or inference which utilize ideas from optimal transportation; see for example [15, 6, 20, 3].

S. Reich

University of Potsdam, Institute of Mathematics, Potsdam, Germany e-mail: sebastian.reich@uni-postdam.de

© The Author(s) 2024

B. Chapron et al. (eds.), *Stochastic Transport in Upper Ocean Dynamics II*, Mathematics of Planet Earth 11, https://doi.org/10.1007/978-3-031-40094-0\_12

The paper is structured as follows. The mathematical formulation of the data assimilation problem, as considered in this paper, is laid out in Sect. 2. The standard optimal control and Schrödinger bridge approach to data assimilation is briefly summarized in Sect. 3, and the novel control formulation based on a homotopy formulation is introduced in Sect. 4. A practical implementation based on the EnKF methodology is proposed in Sect. 5. A series of increasingly complex data assimilation problems is considered in Sect. 6 in order to demonstrate the feasibility of the proposed methodologies. The paper closes with some conclusions in Sect. 7. Detailed mathematical derivations can be found in Appendices 1 and 2.

## **2 Problem Formulation and Background**

We consider drift diffusion processes given by a stochastic differential equation (SDE)

$$\mathrm{d}X_t = f(X_t)\,\mathrm{d}t + \sqrt{2\sigma}\,\mathrm{d}W_t,\tag{1}$$

where $X_t : \Omega \to \mathbb{R}^{d_x}$, $f : \mathbb{R}^{d_x} \to \mathbb{R}^{d_x}$, $\sigma \in \mathbb{R}_{\ge 0}$, and $W_t : \Omega \to \mathbb{R}^{d_x}$ denotes $d_x$-dimensional standard Brownian motion [12, 13].

Assuming the law of *Xt* is absolutely continuous w.r.t. Lebesgue measure with density *πt* , this leads to the Fokker–Planck equation [13]

$$\partial_t \pi_t = -\nabla \cdot \left(\pi_t \left(f - \sigma \nabla \log \pi_t\right)\right). \tag{2}$$

The SDE (1) can be replaced by the mean field ODE

$$\frac{d}{dt}\tilde{X}_t = f(\tilde{X}_t) - \sigma \nabla \log \tilde{\pi}_t \tag{3}$$

where $\tilde{\pi}_t$ denotes the law of $\tilde{X}_t$. Provided $\tilde{\pi}_0 = \pi_0$, it holds that $\tilde{\pi}_t = \pi_t$ for all $t > 0$. Note that the evolution of the random variable $\tilde{X}_t$ is entirely deterministic subject to random initial conditions $\tilde{X}_0 \sim \pi_0$.
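The equivalence between the SDE (1) and the mean field ODE (3) can be illustrated in a scalar linear case. The sketch below assumes $f(x) = -a x$ and a Gaussian law, so that $\nabla \log \tilde{\pi}_t(x) = -(x - m_t)/P_t$ with ensemble mean $m_t$ and variance $P_t$; the Gaussian closure, the Euler time stepping and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def mean_field_step(x, a, sigma, dt):
    """One Euler step of (3) for f(x) = -a x under a Gaussian closure.

    For a Gaussian law N(m, P), grad log pi(x) = -(x - m) / P, so the
    mean-field drift is -a x + sigma (x - m) / P, with m and P estimated
    from the ensemble.
    """
    m, P = np.mean(x), np.var(x)
    return x + dt * (-a * x + sigma * (x - m) / P)

a, sigma, dt, n_steps = 1.0, 0.5, 1e-3, 2000
rng = np.random.default_rng(3)
x = 2.0 + rng.standard_normal(500)       # samples of the initial law
m0, P0 = np.mean(x), np.var(x)

for _ in range(n_steps):
    x = mean_field_step(x, a, sigma, dt)

t = n_steps * dt
# Mean and variance of the Ornstein-Uhlenbeck SDE dX = -a X dt + sqrt(2 sigma) dW.
m_exact = m0 * np.exp(-a * t)
P_exact = sigma / a + (P0 - sigma / a) * np.exp(-2.0 * a * t)
```

No random numbers are drawn after the initial ensemble: the particles move deterministically, yet their mean and variance follow those of the diffusion process up to the time discretization error.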

At time *t* = *T >* 0, we have observations of the system according to

$$\mathbf{y}\_T = h(\mathbf{x}\_T^\dagger) + \boldsymbol{\nu} \tag{4}$$

from which we wish to infer the unknown state $x_T^\dagger \in \mathbb{R}^{d_x}$. Here $h : \mathbb{R}^{d_x} \to \mathbb{R}^{d_y}$ denotes the forward map and $\nu \sim \mathcal{N}(0, R)$ is $d_y$-dimensional Gaussian noise with covariance matrix $R \in \mathbb{R}^{d_y \times d_y}$.

Let $L : \mathbb{R}^{d_x} \to \mathbb{R}$ denote the corresponding negative log-likelihood function. Since $\nu$ is Gaussian it is given by

$$L(\mathbf{x}) = \frac{1}{2} (h(\mathbf{x}) - \mathbf{y}\_T)^\top R^{-1} (h(\mathbf{x}) - \mathbf{y}\_T) \tag{5}$$

up to an irrelevant constant. The observations are combined with the predictive density *πT* at time *t* = *T* according to Bayes' theorem,

$$
\pi\_T^\mathbf{a} = \frac{e^{-L}\pi\_T}{\int e^{-L(\mathbf{x})}\pi\_T(\mathbf{x})d\mathbf{x}}.\tag{6}
$$

The process of transforming the random variable $X_T \sim \pi_T$ into a random variable $X_T^{\mathrm{a}} \sim \pi_T^{\mathrm{a}}$ is called data assimilation in the context of dynamical systems and stochastic processes [10, 17, 8].
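A simple way to see (5) and (6) at work is to reweight samples from the prior by $e^{-L}$. The following sketch uses self-normalized importance weights on a hypothetical scalar example (standard normal prior, identity forward map, $R = 1$, $y_T = 1$); it illustrates Bayes' formula and is not the homotopy approach developed in this paper.

```python
import numpy as np

def neg_log_likelihood(x, y, h, R):
    """L(x) from (5) for each row of the ensemble x, shape (n_members, d_x)."""
    innov = np.array([h(xi) - y for xi in x])            # (n_members, d_y)
    return 0.5 * np.einsum("ni,ij,nj->n", innov, np.linalg.inv(R), innov)

def posterior_weights(x, y, h, R):
    """Self-normalized weights proportional to exp(-L(x_i)), as in (6)."""
    L = neg_log_likelihood(x, y, h, R)
    w = np.exp(-(L - L.min()))        # shift by the minimum for numerical stability
    return w / w.sum()

# Hypothetical scalar example: prior N(0, 1), identity forward map, R = 1, y = 1.
rng = np.random.default_rng(4)
x_prior = rng.standard_normal((20000, 1))
y, R = np.array([1.0]), np.array([[1.0]])
w = posterior_weights(x_prior, y, lambda x: x, R)
post_mean = float(w @ x_prior[:, 0])
```

For this linear Gaussian setting the exact posterior mean is $0.5$, which the weighted ensemble mean approximates. When prior and posterior are far apart, most weights become negligible, which is the difficulty that motivates the coupling approaches discussed next.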

Since performing data assimilation can be difficult if the relative Kullback–Leibler divergence

$$\mathrm{KL}(\pi_T | \pi_T^{\mathrm{a}}) = \int_{\mathbb{R}^{d_x}} \pi_T(x) (\log \pi_T(x) - \log \pi_T^{\mathrm{a}}(x))\,\mathrm{d}x,\tag{7}$$

also called the relative entropy [13], between the prior $\pi_T$ and posterior $\pi_T^{\mathrm{a}}$ is large and/or if the involved distributions are strongly non-Gaussian [19, 1], we propose to construct a new SDE with state process $X_t^{\mathrm{h}}$ such that $X_0^{\mathrm{h}} \sim \pi_0$ and $X_T^{\mathrm{h}} \sim \pi_T^{\mathrm{a}}$. In other words, we are looking for a stochastic process (bridge) with initial density $\pi_0$ and final density $\pi_T^{\mathrm{a}}$. The problem of finding the optimal process (in the sense of minimal Kullback–Leibler divergence) is known as the Schrödinger bridge problem [4].

## **3 Schrödinger Bridge Approach**

The Bayesian adjustment (6) at final time $t = T$ leads in fact to an adjustment over the whole solution space of the underlying diffusion process described by (1). Let us denote the so-called smoothing distribution by $\pi_t^{\mathrm{a}}$, $t \in [0, T]$ [18, 5]. It is well established that these marginal distributions can be generated from a controlled SDE

$$\mathbf{d}X\_t^\mathbf{a} = f(X\_t^\mathbf{a})\mathbf{d}t + \mathbf{g}\_t^\mathbf{a}(X\_t^\mathbf{a})\mathbf{d}t + \sqrt{2\sigma}\,\mathrm{d}W\_t\tag{8}$$

for appropriate control *g*<sup>a</sup> *<sup>t</sup>* : <sup>R</sup>*dx* <sup>→</sup> <sup>R</sup>*dx* such that *<sup>X</sup>*<sup>a</sup> <sup>0</sup> <sup>∼</sup> *<sup>π</sup>*<sup>a</sup> <sup>0</sup> implies *<sup>X</sup>*<sup>a</sup> *<sup>t</sup>* <sup>∼</sup> *<sup>π</sup>*<sup>a</sup> *t* for all *t >* 0. It is also well known that finding a suitable *g*<sup>a</sup> *<sup>t</sup>* can be formulated as an optimal control problem which in turn is closely related to the backward Kolmogorov equation [13, 16]. Formulations related to (8) have also been used in the context of sequential Monte Carlo methods [5].

As proposed in [16], an alternative perspective on sequential data assimilation is provided by Schrödinger bridges. Given two marginal distributions *q*<sup>0</sup> and *qT* and stochastic process *Xt* (referred to as the *reference process*), a Schrödinger bridge is another stochastic process *X*ˆ*<sup>t</sup>* such that *X*ˆ <sup>0</sup> ∼ *q*0, *X*ˆ *<sup>T</sup>* ∼ *qT* and the Kullback– Leibler divergence between the processes {*X*ˆ*t*}*t*∈[0*,T* ] and {*Xt*}*t*∈[0*,T* ] is minimal. Specialised to our problem and considering a single data assimilation cycle that means the marginals are the initial and posterior densities, i.e. *q*<sup>0</sup> = *π*<sup>0</sup> and *qT* = *π*a *<sup>T</sup>* , and the reference process is the solution to (1). The solution to the associated Schrödinger bridge problem is again of the form (8) with modified control term denoted by *g*SB *<sup>t</sup> (x)*.

A Schrödinger bridge is thus the optimal coupling, as measured by the Kullback–Leibler divergence, to the underlying reference process. Unfortunately, Schrödinger bridges lead to boundary value problems in the space of probability measures, and the required control term $g_t^{\mathrm{SB}}$ seems rather difficult to compute in practice. In addition to the computational complexity of solving nonlinear Schrödinger bridge problems, the target distribution $\pi_T^{\mathrm{a}}$ is only implicitly defined in the setting of data assimilation. The next section offers a solution to both of these issues. We point to [23] for a discussion of alternative approaches which introduce appropriate control terms into data assimilation procedures.

## **4 Homotopy Induced Dynamic Coupling**

Since Schrödinger bridges are computationally challenging, we ask whether a less optimal but cheaper approach might also be feasible. Indeed, in the context of data assimilation a non-optimal coupling can be found via a homotopy between the initial and target distribution as follows. Let

$$
\pi_t^{\mathrm{h}}(x) = Z_t^{-1} e^{-\frac{t}{T} L(x)} \pi_t(x) \tag{9}
$$

denote the homotopy in question, with $Z_t = \int e^{-\frac{t}{T} L(x)} \pi_t(x)\,\mathrm{d}x$ the time-dependent normalization constant. It clearly holds that $\pi_0^{\mathrm{h}} = \pi_0$ and $\pi_T^{\mathrm{h}} = \pi_T^{\mathrm{a}}$. Note that the scaling $t \mapsto e^{-\frac{t}{T} L}$ was chosen for its simplicity and follows previous work on Bayesian inference problems [7, 14]. Finding better homotopies or systematic ways of constructing one could be an interesting direction for future research.
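As a minimal sketch of the endpoint property, consider the scalar Gaussian case with quadratic $L(x) = (x - y_T)^2/(2R)$, where the product in (9) is again Gaussian with closed-form moments. The function name `tempered_gaussian`, the variable names and all numerical values are illustrative assumptions and are not taken from the experiments in this chapter; for simplicity the same forecast moments are used at both endpoints.

```python
def tempered_gaussian(m, s2, y, R, lam):
    """Moments of the density proportional to exp(-lam * L(x)) * N(x; m, s2),
    with L(x) = (x - y)**2 / (2 * R) and lam = t / T in [0, 1]."""
    precision = 1.0 / s2 + lam / R
    mean = (m / s2 + lam * y / R) / precision
    return mean, 1.0 / precision


# Hypothetical forecast moments and a hypothetical scalar observation.
m, s2, y, R = 0.0, 2.0, 1.0, 0.5

prior = tempered_gaussian(m, s2, y, R, 0.0)      # lam = 0: likelihood switched off
posterior = tempered_gaussian(m, s2, y, R, 1.0)  # lam = 1: full Bayesian update

# Kalman form of the same update, for comparison.
K = s2 / (s2 + R)
kalman = (m + K * (y - m), (1.0 - K) * s2)
```

At `lam = 0` the forecast moments are returned unchanged, and at `lam = 1` the result coincides with the Kalman update, which mirrors the two endpoint identities stated above.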

We can then reason backwards from the Fokker–Planck equation of $\pi_t^{\mathrm{h}}$ to conclude that, if it is the density of a random variable $X_t^{\mathrm{h}}$, then that random variable must satisfy the modified SDE:

$$\mathrm{d}X_t^{\mathrm{h}} = f(X_t^{\mathrm{h}})\,\mathrm{d}t - \frac{\sigma t}{T}\nabla L(X_t^{\mathrm{h}})\,\mathrm{d}t + g_t(X_t^{\mathrm{h}})\,\mathrm{d}t + \sqrt{2\sigma}\,\mathrm{d}W_t,\tag{10}$$

where $g_t$ is a solution to the PDE

$$\nabla \cdot (\pi_t^{\mathrm{h}} g_t) = \frac{1}{T} \pi_t^{\mathrm{h}} \left( L + t \nabla L \cdot \left( f - \frac{\sigma t}{T} \nabla L - \sigma \nabla \log \pi_t^{\mathrm{h}} \right) \right) + \pi_t^{\mathrm{h}} \frac{\dot{Z}_t}{Z_t}.\tag{11}$$

The derivation of (11) can be found in Appendix 1.

Note that (10) constitutes a mean field model since $g_t$ depends on the distribution $\pi_t^{\mathrm{h}}$ of $X_t^{\mathrm{h}}$. We also wish to point out that

$$
\hat{g}_t^{\mathrm{SB}}(x) := -\frac{\sigma t}{T} \nabla L(x) + g_t(x) \tag{12}
$$

provides a solution to the associated coupling problem. The control term $\hat{g}_t^{\mathrm{SB}}$ is, however, non-optimal in the sense of the Schrödinger bridge problem since it does not minimise the Kullback–Leibler divergence.

Since (11) is linear in $g_t$, we can decompose (11) into a set of simpler equations $\nabla \cdot (\pi_t^{\mathrm{h}} g_t^i) = \pi_t^{\mathrm{h}} (k^i - \mathbb{E} k^i)$ such that the $k^i$ add up to the right-hand side of (11). In order to maintain $\int \nabla \cdot (\pi_t^{\mathrm{h}} g_t)\,\mathrm{d}x = 0$ for the individual $g_t^i$, we can make use of the fact that the terms in (11) are of the form $\pi_t^{\mathrm{h}} (k - \mathbb{E} k)$. Separating the terms, we obtain the following equations, the sum of whose solutions solves (11):

$$\nabla \cdot \left( \pi_t^{\mathrm{h}} g_t^1 \right) = \pi_t^{\mathrm{h}} \left( \frac{L}{T} - \mathbb{E} \frac{L}{T} \right) \tag{13a}$$

$$\nabla \cdot \left( \pi_t^{\mathrm{h}} g_t^2 \right) = \pi_t^{\mathrm{h}} \left( \frac{t}{T} \nabla L \cdot f - \mathbb{E} \frac{t}{T} \nabla L \cdot f \right) \tag{13b}$$

$$\nabla \cdot \left( \pi_t^{\mathrm{h}} g_t^3 \right) = -\pi_t^{\mathrm{h}} \left( \frac{\sigma t}{T} \nabla L \cdot \nabla \log \pi_t^{\mathrm{h}} - \mathbb{E} \frac{\sigma t}{T} \nabla L \cdot \nabla \log \pi_t^{\mathrm{h}} \right) \tag{13c}$$

$$\nabla \cdot \left( \pi_t^{\mathrm{h}} g_t^4 \right) = -\pi_t^{\mathrm{h}} \left( \frac{\sigma t^2}{T^2} \nabla L \cdot \nabla L - \mathbb{E} \frac{\sigma t^2}{T^2} \nabla L \cdot \nabla L \right). \tag{13d}$$

Note that

$$
\pi_t^{\mathrm{h}} \nabla L \cdot \nabla \log \pi_t^{\mathrm{h}} = \nabla \cdot \left( \pi_t^{\mathrm{h}} \nabla L \right) - \pi_t^{\mathrm{h}} \Delta L,\tag{14}
$$

which can be used to avoid the computation of $\nabla \log \pi_t^{\mathrm{h}}$ in (13c) (with $\Delta = \nabla \cdot \nabla$ the Laplacian operator). Thus the controlled SDE (10) can be replaced by

$$\mathrm{d}X_t^{\mathrm{h}} = f(X_t^{\mathrm{h}})\,\mathrm{d}t - \frac{2\sigma t}{T}\nabla L(X_t^{\mathrm{h}})\,\mathrm{d}t + \hat{g}_t(X_t^{\mathrm{h}})\,\mathrm{d}t + \sqrt{2\sigma}\,\mathrm{d}W_t,\tag{15}$$

where $\hat{g}_t$ is a solution to

$$\nabla \cdot (\pi_t^{\mathrm{h}} \hat{g}_t) = \frac{1}{T} \pi_t^{\mathrm{h}} \left( L + t \nabla L \cdot \left( f - \frac{\sigma t}{T} \nabla L \right) + \sigma t \Delta L \right) + \pi_t^{\mathrm{h}} \frac{\dot{Z}_t}{Z_t}. \tag{16}$$

Furthermore, if $\Delta L$ is constant (as a function of $x$) or small in comparison to the other contributions in (16), then (16) simplifies further. In particular, this is the case if the forward map is linear, that is, $h(x) = Hx$.
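The identity (14), read with the divergence $\nabla \cdot (\pi_t^{\mathrm{h}} \nabla L)$ on its right-hand side, can be checked numerically in one dimension. The sketch below assumes a standard Gaussian density in place of $\pi_t^{\mathrm{h}}$ and a quadratic $L$ with hypothetical values of $R$ and $y_T$; it is a sanity check under these assumptions rather than part of the method.

```python
import math

R, y = 0.5, 1.0  # hypothetical observation variance and observed value


def pi(x):
    """Standard Gaussian density, standing in for pi_t^h."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def grad_log_pi(x):
    """Derivative of log pi for the standard Gaussian."""
    return -x


def grad_L(x):
    """Derivative of L(x) = (x - y)**2 / (2 R)."""
    return (x - y) / R


LAPLACE_L = 1.0 / R  # second derivative of L; constant because L is quadratic


def lhs(x):
    """Left-hand side of (14)."""
    return pi(x) * grad_L(x) * grad_log_pi(x)


def rhs(x, eps=1e-4):
    """Right-hand side of (14), with a central difference for the divergence."""
    flux = lambda z: pi(z) * grad_L(z)
    div = (flux(x + eps) - flux(x - eps)) / (2.0 * eps)
    return div - pi(x) * LAPLACE_L


max_gap = max(abs(lhs(x) - rhs(x)) for x in (-2.0, -0.5, 0.0, 0.7, 1.5))
```

Both sides agree up to the finite-difference error, and for this quadratic $L$ the Laplacian is the constant $1/R$.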

Building upon the mean field ODE (3), one obtains the equivalent controlled mean field ODE system

$$\frac{\mathrm{d}}{\mathrm{d}t}\tilde{X}_t^{\mathrm{h}} = f(\tilde{X}_t^{\mathrm{h}}) - \frac{\sigma t}{T}\nabla L(\tilde{X}_t^{\mathrm{h}}) + g_t(\tilde{X}_t^{\mathrm{h}}) - \sigma \nabla \log \tilde{\pi}_t^{\mathrm{h}}(\tilde{X}_t^{\mathrm{h}}),\tag{17}$$

with $g_t$ defined as before. This mean field formulation again requires knowledge (or approximation) of $\nabla \log \tilde{\pi}_t^{\mathrm{h}}$. A Gaussian approximation might be sufficient in certain circumstances, giving rise to

$$\nabla \log \tilde{\pi}_t^{\mathrm{h}}(x) \approx - (\tilde{\Sigma}_t^{\mathrm{h}})^{-1} (x - \tilde{\mu}_t^{\mathrm{h}}),\tag{18}$$

where $\tilde{\mu}_t^{\mathrm{h}}$ denotes the mean of $\tilde{X}_t^{\mathrm{h}}$ and $\tilde{\Sigma}_t^{\mathrm{h}}$ its covariance matrix.
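For a scalar state, the Gaussian approximation (18) reduces to a one-line estimate from the ensemble. The following sketch assumes that the mean and variance are replaced by their sample estimates; the function name `gaussian_score` and the toy ensemble are hypothetical.

```python
import statistics


def gaussian_score(x, ensemble):
    """Approximate d/dx log pi(x) by -(x - mean) / variance, cf. (18)."""
    mean = statistics.fmean(ensemble)
    var = statistics.variance(ensemble, mean)  # unbiased sample variance
    return -(x - mean) / var


# Toy ensemble with sample mean 1 and sample variance 2.5.
ensemble = [-1.0, 0.0, 1.0, 2.0, 3.0]
score_at_mean = gaussian_score(1.0, ensemble)
score_right = gaussian_score(2.0, ensemble)
```

The estimate vanishes at the ensemble mean and points back towards it elsewhere, which is the restoring behaviour that replaces the noise term in (17).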

## **5 Numerical Implementation**

No analytic solution to (11) is known and we thus have to resort to approximations. We note that a similar PDE arises in the computation of the gain in the feedback particle filter [24], and one could use the diffusion map based approximation [22] for the problem at hand. This method also transforms the PDE into a Poisson equation, which is then translated into an equivalent integral equation, the semi-group form of the Poisson equation. As the name suggests, the integral equation makes use of the generator of a semi-group which can be approximated by diffusion maps. Here we instead propose to follow the constant gain approximation first introduced in the EnKF methodology [21].

## *5.1 Ensemble Kalman Mean Field Approximation*

Let us assume that $\Delta L \approx \mathrm{const}$ in (16). Then we only need to deal with the modified negative log likelihood function

$$\tilde{L}(\mathbf{x}) = \frac{1}{T} \left( L(\mathbf{x}) + t \nabla L(\mathbf{x}) \cdot \left( f(\mathbf{x}) - \frac{\sigma t}{T} \nabla L(\mathbf{x}) \right) \right) \tag{19a}$$

$$\approx \frac{1}{T} \left( L(\mathbf{x}) + \frac{t}{\Delta t} \left\{ L(\mathbf{x}) - L\left(\mathbf{x} - \Delta t f(\mathbf{x}) + \Delta t \frac{t \sigma}{T} \nabla L(\mathbf{x})\right) \right\} \right) \tag{19b}$$

with $\Delta t$ being the time-step also used later for time-stepping the evolution equations (10) or (15), respectively. Since $L$ is given by (5), we define the modified forward map

$$\tilde{h}(\mathbf{x}) = h\left(\mathbf{x} - \Delta t f(\mathbf{x}) + \Delta t \frac{\sigma t}{T} \nabla L(\mathbf{x})\right) \tag{20}$$

and thus

$$\tilde{L}(x) \approx \frac{t + \Delta t}{2\Delta t\, T} (h(x) - y_T)^\top R^{-1} (h(x) - y_T) - \frac{t}{2\Delta t\, T} (\tilde{h}(x) - y_T)^\top R^{-1} (\tilde{h}(x) - y_T). \tag{21}$$

Following the standard EnKF methodology for quadratic loss functions, this suggests approximating the drift function $\hat{g}_t$ in (15) as follows:

$$\hat{g}_t^{\mathrm{KF}}(x) = -\frac{t + \Delta t}{\Delta t\, T} \Sigma_t^{xh} R^{-1} \left( \frac{1}{2} \left( h(x) + \pi_t^{\mathrm{h}}[h] \right) - y_T \right) \tag{22a}$$

$$+\frac{t}{\Delta t\, T} \Sigma_t^{x\tilde{h}} R^{-1} \left(\frac{1}{2} \left(\tilde{h}(x) + \pi_t^{\mathrm{h}}[\tilde{h}]\right) - y_T\right). \tag{22b}$$

Here we have introduced the notation $\pi_t^{\mathrm{h}}[l]$ to denote the expectation value $\mathbb{E}\,l$ of a function $l(x)$ under the PDF $\pi_t^{\mathrm{h}}$. Furthermore, $\Sigma_t^{xh}$ denotes the covariance matrix between $x$ and $h(x)$ under the PDF $\pi_t^{\mathrm{h}}$, and similarly for $\Sigma_t^{x\tilde{h}}$. The derivation of (22) can be found in Appendix 2.
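The control (22) can be evaluated directly from an ensemble. The sketch below assumes a scalar state and a scalar observation, replaces expectations by ensemble averages, and uses a toy ensemble with hypothetical parameter values; the names `cov` and `g_kf` are illustrative. For the linear map $h(x) = x$ with $f = 0$, the result is compared with the closed-form limit $-\Sigma\,\Omega\,(\tfrac{1}{2}(x + \mu) - y_T)$, where $\Omega = 1/(TR) - 2\sigma t^2/(T^2 R^2)$.

```python
def cov(u, v):
    """Empirical covariance of two equally long lists (normalised by M)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def g_kf(x, ensemble, t, T, dt, h, h_tilde, y, R):
    """Evaluate (22) at the scalar state x using ensemble averages."""
    hs = [h(z) for z in ensemble]
    hts = [h_tilde(z) for z in ensemble]
    mean_h, mean_ht = sum(hs) / len(hs), sum(hts) / len(hts)
    term_a = -(t + dt) / (dt * T) * cov(ensemble, hs) / R * (0.5 * (h(x) + mean_h) - y)
    term_b = t / (dt * T) * cov(ensemble, hts) / R * (0.5 * (h_tilde(x) + mean_ht) - y)
    return term_a + term_b


# Hypothetical setting: linear forward map, zero drift.
ensemble = [-1.0, 0.0, 1.0, 2.0, 3.0]
T, R, sigma, y = 1.0, 0.5, 0.5, 1.0
t, dt = 0.5, 1e-3
h = lambda z: z
# Modified forward map (20) with f = 0 and grad L(z) = (z - y) / R.
h_tilde = lambda z: z + dt * (sigma * t / T) * (z - y) / R

value = g_kf(2.0, ensemble, t, T, dt, h, h_tilde, y, R)

# Closed-form limit for dt -> 0 in the linear case.
mean = sum(ensemble) / len(ensemble)
var = cov(ensemble, ensemble)
omega = 1.0 / (T * R) - 2.0 * sigma * t**2 / (T**2 * R**2)
limit = -var * omega * (0.5 * (2.0 + mean) - y)
```

The two large terms of opposite sign nearly cancel, and their difference agrees with the limit up to a remainder of order $\Delta t$.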

## *5.2 Particle Approximation and Time-Stepping*

The controlled mean field equations (15) can be implemented numerically by the standard Monte Carlo *Ansatz*, that is, $M$ particles $X_t^{(i)}$ are propagated according to

$$\mathrm{d}X_t^{(i)} = f(X_t^{(i)})\,\mathrm{d}t - \frac{2\sigma t}{T}\nabla L(X_t^{(i)})\,\mathrm{d}t + \hat{g}_t^{\mathrm{KF}}(X_t^{(i)})\,\mathrm{d}t + \sqrt{2\sigma}\,\mathrm{d}W_t^{(i)}\tag{23}$$

for $i = 1, \ldots, M$. The required expectation values in $\hat{g}_t^{\mathrm{KF}}$ are evaluated with respect to the empirical measure

$$\hat{\pi}_t^{\mathrm{h}}(x) = \frac{1}{M} \sum_{i=1}^{M} \delta(x - X_t^{(i)}). \tag{24}$$

The interacting particle system can be time-stepped using an appropriate adaptation of (61) from Appendix 2. The computation of gradients can be avoided by applying the statistical linearisation (60).
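For orientation, a plain Euler–Maruyama step for an interacting particle system of the form (23) can be sketched as follows. This is a simpler stand-in and not the robust scheme based on (61); the scalar state, the function name `euler_maruyama_step` and the toy drift are assumptions made for illustration.

```python
import math
import random


def euler_maruyama_step(particles, drift, sigma, t, dt, rng):
    """Advance all particles by one step of size dt.

    drift(x, particles, t) may depend on the whole ensemble, which is how
    mean field terms enter; sigma is the scalar diffusion constant.
    """
    noise_scale = math.sqrt(2.0 * sigma * dt)
    return [
        x + dt * drift(x, particles, t) + noise_scale * rng.gauss(0.0, 1.0)
        for x in particles
    ]


# Toy mean field drift pulling every particle towards the ensemble mean.
def toy_drift(x, particles, t):
    return -(x - sum(particles) / len(particles))


rng = random.Random(0)
stepped = euler_maruyama_step([0.0, 1.0, 2.0], toy_drift, 0.0, 0.0, 0.1, rng)
```

With the noise switched off, the toy ensemble contracts towards its mean by a factor $1 - \Delta t$; in an actual implementation the drift argument would collect the three drift contributions in (23).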

## **6 Examples**

We now discuss a sequence of increasingly complex examples. The purpose is both to illuminate certain aspects of the proposed control terms and to indicate the computational advantages of the proposed methodology. All examples will be based on linear forward maps $h(x) = Hx$ and, therefore, $\Delta L$ is constant and can be ignored.

## *6.1 Pure Diffusion Processes*

We set the drift $f$ to zero in (1) and also assume Gaussian initial conditions. Then the control term (22) gives rise to the mean-field SDE

$$\mathrm{d}X_t^{\mathrm{h}} = \sqrt{2\sigma}\,\mathrm{d}W_t - \frac{2\sigma t}{T} H^{\top} R^{-1} (H X_t^{\mathrm{h}} - y_T)\,\mathrm{d}t \tag{25a}$$

$$- \Sigma_t^{\mathrm{h}} H^{\top} \left\{ \frac{1}{T} R^{-1} - \frac{2\sigma t^{2}}{T^{2}} R^{-1} H H^{\top} R^{-1} \right\} \left( \frac{1}{2} \left( H X_t^{\mathrm{h}} + H \mu_t^{\mathrm{h}} \right) - y_T \right) \mathrm{d}t \tag{25b}$$

in the limit $\Delta t \to 0$. We note that $X_t^{\mathrm{h}} \sim \pi_t^{\mathrm{h}}$ will remain Gaussian for all times and we denote the mean by $\mu_t^{\mathrm{h}}$ and the covariance matrix by $\Sigma_t^{\mathrm{h}}$. Hence, it holds that $\Sigma_t^{xx} = \Sigma_t^{\mathrm{h}}$ and $\pi_t^{\mathrm{h}}[x] = \mu_t^{\mathrm{h}}$.

Please note that the additional drift term in (25a) pulls $X_t^{\mathrm{h}}$ towards the observation $y_T$ regardless of the value of $\Sigma_t^{\mathrm{h}}$. It should also be noted that the drift term in (25b) can be either attractive or repulsive with regard to the observation $y_T$, depending on the eigenvalues of

$$
\Omega_t = \frac{1}{T} R^{-1} - \frac{2\sigma t^2}{T^2} R^{-1} H H^{\top} R^{-1}.\tag{26}
$$

The strength of this drift term is moderated by the covariance matrix $\Sigma_t^{\mathrm{h}}$.

We consider a one-dimensional problem with $R = 0.01$, $\sigma = 1$, $H = 1$, $y_T = 1$ and $T = 1$. The initial conditions are Gaussian with mean $\mu_0 = 0$ and variance $\Sigma_0 = 1$. It follows that $\pi_1$ is Gaussian with mean $\mu_1 = 0$ and variance $\Sigma_1 = 2$, and the resulting Gaussian posterior $\pi_1^{\mathrm{a}}$ has mean and variance given by

$$
\mu_1^{\mathrm{a}} = K y_T \approx 0.9524, \qquad \Sigma_1^{\mathrm{a}} = 2 - 2K \approx 0.0952 \tag{27}
$$

with Kalman gain $K = 2/(2 + 0.01) \approx 0.9524$.

**Fig. 1** Time evolution of the mean $\mu_t^{\mathrm{h}}$ and the variance $\Sigma_t^{\mathrm{h}}$ under the mean field equations (25). Their values at final time agree with the posterior values provided by (27)

In Fig. 1 one can find the time evolution of the mean and the variance under the mean field equations (25). The early impact of the data-driven control term on the dynamics is perhaps surprising and quite opposite to the standard sequential approach to data assimilation, where one first propagates to final time and only then adjusts according to the available data. It is also worth noting that the corresponding $\Omega_t$ changes sign at $t_c = \sqrt{2}/20$, implying that the drift term in (25) has a destabilizing effect on the dynamics for $t > t_c$.
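Taking first and second moments in (25) for $H = 1$ yields closed ODEs for the mean and the variance; this short calculation is not spelled out in the text and is an assumption of the sketch below. The integration uses explicit Euler steps and hypothetical parameter values (not those of the experiment above), and the result is compared with the Kalman posterior built from the forecast variance $\Sigma_0 + 2\sigma T$. Only the last two lines use the values $R = 0.01$, $\sigma = 1$, $T = 1$ quoted above, to recover the sign change of $\Omega_t$.

```python
import math


def omega(t, T, R, sigma):
    """Scalar version of (26) with H = 1."""
    return 1.0 / (T * R) - 2.0 * sigma * t**2 / (T**2 * R**2)


def moment_rhs(t, mu, var, T, R, sigma, y):
    """Time derivatives of mean and variance implied by (25) for H = 1."""
    pull = 2.0 * sigma * t / (T * R)
    om = omega(t, T, R, sigma)
    dmu = -(pull + var * om) * (mu - y)
    dvar = 2.0 * sigma - 2.0 * pull * var - om * var**2
    return dmu, dvar


def integrate(mu, var, T, R, sigma, y, steps=20000):
    """Explicit Euler integration of the moment equations from 0 to T."""
    dt = T / steps
    for n in range(steps):
        dmu, dvar = moment_rhs(n * dt, mu, var, T, R, sigma, y)
        mu, var = mu + dt * dmu, var + dt * dvar
    return mu, var


# Hypothetical values chosen for illustration only.
mu0, var0, T, R, sigma, y = 0.0, 1.0, 1.0, 0.5, 0.5, 1.0
mu_T, var_T = integrate(mu0, var0, T, R, sigma, y)

# Kalman posterior from the uncontrolled forecast N(mu0, var0 + 2 sigma T).
forecast_var = var0 + 2.0 * sigma * T
K = forecast_var / (forecast_var + R)
mu_a, var_a = mu0 + K * (y - mu0), (1.0 - K) * forecast_var

# Sign change of omega for the values quoted in the text.
t_c = math.sqrt(1.0 * 0.01 / (2.0 * 1.0))
omega_at_tc = omega(t_c, 1.0, 0.01, 1.0)
```

Under these assumptions the integrated moments reach the Kalman posterior at $t = T$, and `t_c` evaluates to $\sqrt{2}/20$.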

## *6.2 Purely Deterministic Processes*

We now set *σ* = 0 in (1). We obtain from (22) the mean field ODE system

$$\frac{\mathrm{d}}{\mathrm{d}t}X_t^{\mathrm{h}} = f(X_t^{\mathrm{h}}) - \frac{t + \Delta t}{\Delta t\, T} \Sigma_t^{\mathrm{h}} H^\top R^{-1} \left(\frac{1}{2} H \left(X_t^{\mathrm{h}} + \mu_t^{\mathrm{h}}\right) - y_T\right) \tag{28a}$$

$$+\frac{t}{\Delta t\, T} \Sigma_t^{x\tilde{h}} R^{-1} \left( \frac{1}{2} \left( \tilde{h}(X_t^{\mathrm{h}}) + \pi_t^{\mathrm{h}}[\tilde{h}] \right) - y_T \right) \tag{28b}$$

with

$$\tilde{h}(x) = Hx - \Delta t\, H f(x).\tag{29}$$

These equations can be expanded giving rise to

$$\frac{\mathrm{d}}{\mathrm{d}t}X_t^{\mathrm{h}} = f(X_t^{\mathrm{h}}) - \frac{1}{T} \left\{ \Sigma_t^{\mathrm{h}} + t\Sigma_t^{xf} \right\} H^{\top}R^{-1} \left( \frac{1}{2} H \left( X_t^{\mathrm{h}} + \mu_t^{\mathrm{h}} \right) - y_T \right) \tag{30a}$$

$$-\frac{t}{2T}\Sigma_t^{\mathrm{h}} H^{\top} R^{-1} H \left( f(X_t^{\mathrm{h}}) + \pi_t^{\mathrm{h}}[f] \right)\tag{30b}$$

upon ignoring terms of order $\mathcal{O}(\Delta t)$. Unless the drift function $f$ is linear, these mean field equations provide only an approximation to the controlled mean field equations (15).

## *6.3 Linear Gaussian Case*

It is instructive to investigate the linear case

$$f(\mathbf{x}) = F\mathbf{x} + b \tag{31}$$

in more detail, where again everything remains Gaussian provided $X_0^{\mathrm{h}}$ is Gaussian distributed, that is, $\pi_0(x) = \mathcal{N}(x; \mu_0, \Sigma_0)$. Under these conditions the densities $\pi_t$ and $\pi_t^{\mathrm{h}}$ will also be Gaussian, and we write $\pi_t^{\mathrm{h}}(x) = \mathcal{N}(x; \mu_t^{\mathrm{h}}, \Sigma_t^{\mathrm{h}})$. The associated mean field equations follow from Appendix 2 and are given by

$$\frac{\mathrm{d}}{\mathrm{d}t}X_t^{\mathrm{h}} = FX_t^{\mathrm{h}} + b + \sigma(\Sigma_t^{\mathrm{h}})^{-1}(X_t^{\mathrm{h}} - \mu_t^{\mathrm{h}}) - \frac{2\sigma t}{T}H^\top R^{-1}(HX_t^{\mathrm{h}} - y_T) \tag{32a}$$

$$- C_t H^{\top}R^{-1}\left(\frac{1}{2}H\left(X_t^{\mathrm{h}}+\mu_t^{\mathrm{h}}\right)-y_T\right)\tag{32b}$$

$$-\frac{t}{T}\Sigma_t^{\mathrm{h}}H^{\top}R^{-1}H\left(\frac{1}{2}F\left(X_t^{\mathrm{h}}+\mu_t^{\mathrm{h}}\right)+b\right)\tag{32c}$$

with

$$C_t = \frac{1}{T} \Sigma_t^{\mathrm{h}} + \frac{t}{T} \Sigma_t^{\mathrm{h}} F^{\top} - \frac{2\sigma t^2}{T^2} \Sigma_t^{\mathrm{h}} H^{\top} R^{-1} H. \tag{33}$$

A qualitative discussion can be performed in the scalar case, that is, $d_x = 1$, $H = 1$, $\sigma = 1$, $b = 0$, $T = 1$ and $F = \lambda$. One finds that the control terms involving $F$ stabilize the dynamics whenever $\lambda > 0$. This observation is in line with the fact that the data is crucial only if the dynamics in $X_t$ is unstable, that is, $\lambda > 0$.

We consider a two-dimensional diffusion process with state variable $x = (x_1, x_2)$ and linear drift term (31) given by

$$F = \begin{pmatrix} -2 & 1\\ 1 & -2 \end{pmatrix},\tag{34}$$

$b = 0$, and diffusion constant $\sigma = 0.1I$. The forward operator is $H = (1 \;\; 0)$ and the variance of the noise is $R = 0.01$. The initial distribution is $\pi_0 = \mathcal{N}((1, 3), 0.02I)$. The observed value at time $T = 1$ is set to $y_T = 2.5$.

**Fig. 2** Left panel: Time evolution of the mean in $x_1$ and $x_2$, the two associated variances and the covariance between $x_1$ and $x_2$ under the linear diffusion process. Right panel: Time evolution of the same quantities under the controlled diffusion process

The posterior mean takes values $\mu_1^{\mathrm{a}} \approx 2.25$ and $\mu_2^{\mathrm{a}} \approx 1.50$, while the posterior covariance matrix becomes

$$
\Sigma^{\mathrm{a}} \approx \begin{pmatrix} 0.0086 & 0.0039 \\ 0.0039 & 0.0503 \end{pmatrix} . \tag{35}
$$

Numerical results can be found in Fig. 2. The impact of the control term on the linear diffusion process can clearly be seen and is most prominent on the observed $x_1$ component of the process. The final values of the controlled process agree well with their posterior counterparts.
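The posterior values quoted above can be reproduced by propagating the Gaussian moments of the uncontrolled process to $T = 1$ and applying a Kalman update. The sketch below assumes that the diffusion enters the covariance equation as $2\sigma I$ with $\sigma = 0.1$, uses explicit Euler steps with a hypothetical step count, and writes the $2 \times 2$ algebra out by hand; it is a consistency check, not the controlled scheme itself.

```python
F = [[-2.0, 1.0], [1.0, -2.0]]
sigma, R, y_T, T = 0.1, 0.01, 2.5, 1.0
mu = [1.0, 3.0]
S = [[0.02, 0.0], [0.0, 0.02]]

steps = 20000
dt = T / steps
for _ in range(steps):
    # Mean equation: d mu / dt = F mu (b = 0).
    dmu = [F[i][0] * mu[0] + F[i][1] * mu[1] for i in range(2)]
    # Covariance equation: d S / dt = F S + S F^T + 2 sigma I.
    FS = [[F[i][0] * S[0][j] + F[i][1] * S[1][j] for j in range(2)] for i in range(2)]
    dS = [[FS[i][j] + FS[j][i] + (2.0 * sigma if i == j else 0.0) for j in range(2)]
          for i in range(2)]
    mu = [mu[i] + dt * dmu[i] for i in range(2)]
    S = [[S[i][j] + dt * dS[i][j] for j in range(2)] for i in range(2)]

# Kalman update for H = (1 0): only the first component is observed.
innovation_var = S[0][0] + R
K = [S[0][0] / innovation_var, S[1][0] / innovation_var]
mu_a = [mu[i] + K[i] * (y_T - mu[0]) for i in range(2)]
S_a = [[S[i][j] - K[i] * S[0][j] for j in range(2)] for i in range(2)]
```

Under these assumptions `mu_a` and `S_a` agree with the quoted posterior mean and with (35) to the number of digits shown there.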

## *6.4 Nonlinear Diffusion Example*

We consider a two-dimensional problem and denote the state variable by $x = (x_1, x_2)$. The drift term is given by

$$f(\mathbf{x}) = -\nabla V(\mathbf{x}), \qquad V(\mathbf{x}) = \frac{\lambda\_1}{2} \left(\mathbf{x}\_2 - 2 + \beta \mathbf{x}\_1^2\right)^2 + \frac{\lambda\_2}{2} \left(\frac{\mathbf{x}\_1^4}{2} - \mathbf{x}\_1^2\right) \tag{36}$$

with parameters $\lambda_1 = 2000$, $\lambda_2 = 5$, and $\beta = 1/5$. The diffusion constant is set to $\sigma = 1$. The choice of the potential $V(x)$ has two effects: (1) there is a relatively high barrier for particles to pass from positive to negative $x_1$-values and vice versa; (2) the dynamics stay close to the parabola $x_2 = 2 - \beta x_1^2$.
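The drift $f = -\nabla V$ for the potential (36) can be written out explicitly. The following sketch differentiates $V$ by hand and checks the result against a central finite difference; the function names and the test point are illustrative choices.

```python
LAM1, LAM2, BETA = 2000.0, 5.0, 0.2


def potential(x1, x2):
    """Potential V from (36)."""
    c = x2 - 2.0 + BETA * x1**2
    return 0.5 * LAM1 * c**2 + 0.5 * LAM2 * (0.5 * x1**4 - x1**2)


def drift(x1, x2):
    """f(x) = -grad V(x), differentiated by hand from (36)."""
    c = x2 - 2.0 + BETA * x1**2
    dV1 = 2.0 * LAM1 * BETA * x1 * c + LAM2 * (x1**3 - x1)
    dV2 = LAM1 * c
    return -dV1, -dV2


# The point (1, 2 - beta) lies on the parabola and in one of the two wells.
f_well = drift(1.0, 2.0 - BETA)

# Finite-difference check of the gradient at an arbitrary point.
x1, x2, eps = 0.5, 1.0, 1e-6
fd1 = (potential(x1 + eps, x2) - potential(x1 - eps, x2)) / (2.0 * eps)
fd2 = (potential(x1, x2 + eps) - potential(x1, x2 - eps)) / (2.0 * eps)
f_test = drift(x1, x2)
```

The drift vanishes at $x_1 = \pm 1$ on the parabola, which are the minima of the double-well term, while the large value of $\lambda_1$ produces a strong restoring force towards the parabola.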

The initial distribution is obtained by sampling $x_1$ from a Gaussian with mean $1.5$ and variance $0.0625$. The $x_2$ component is obtained from the relation

$$x_2 = 2 - \beta x_1^2. \tag{37}$$

We observe the first component $x_1$ of the state vector at time $T = 1$ with measurement error variance $R = 0.01$. The observed value is set to $y_T = -1.5$. Due to the tiny observation error, the posterior is centred sharply about the observed value. Furthermore, recall that the dynamics is essentially slaved to the parabola $x_2 = 2 - \beta x_1^2$, which makes the inference problem strongly nonlinear.

All particle simulations are run with an ensemble size of *M* = 1000. Essentially identical results are obtained for *M* = 100. Smaller ensemble sizes lead to numerical instabilities.

In Fig. 3, one can find the particle distribution at time $t = 1$, which constitutes the prior distribution for the associated Bayesian inference problem. It is obvious that a particle filter would fail to recover the posterior distribution, which is sharply centred about the observed value. We found that increasing the ensemble size to $M = 10{,}000$ allows a particle filter to recover the posterior distribution, but the effective ensemble size still drops dramatically. The approximation provided by the EnKF is also displayed. The EnKF fails to recover the posterior due to its inherent linear regression ansatz, which is inappropriate for this strongly nonlinear inference problem even in the limit of ensemble size $M \to \infty$.

In Fig. 4, the results from the controlled mean field formulation are displayed. It can be concluded that the posterior distribution is well approximated despite the constant gain approximation made in order to formulate the control term $\hat{g}_t^{\mathrm{KF}}$ in (22).

**Fig. 4** Left panel: Initial and final particle positions under the controlled evolution process. Right panel: Particle positions at intermediate times $t_k \in [0, 1]$

## *6.5 Lorenz-63 Example*

All examples so far have considered a single data assimilation cycle only. We now perform a proper sequential data assimilation experiment for the standard Lorenz-63 model [11]

$$\frac{\mathrm{d}}{\mathrm{d}t}X_t = f(X_t),\tag{38}$$

where $X_t : \Omega \to \mathbb{R}^3$ and

$$f(\mathbf{x}, \mathbf{y}, z) = \begin{pmatrix} a(\mathbf{y} - \mathbf{x}) \\ x(b - z) - \mathbf{y} \\ \mathbf{x}\mathbf{y} - cz \end{pmatrix} \tag{39}$$

with parameters *a* = 10, *b* = 28 and *c* = 8*/*3.

In order to obtain a reference solution $X_t^{\dagger}$ for $t \ge 0$, the ODE (38) is solved numerically with step-size $\Delta t = 0.005$ and initial condition

$$X\_0^\dagger = \begin{pmatrix} -0.587276\\ -0.563678\\ 16.8708 \end{pmatrix}.\tag{40}$$

Scalar-valued observations are generated every $\Delta t_{\mathrm{obs}} > 0$ units of time using the forward model

$$\mathbf{y}\_{n\Delta t\_{\rm obs}} = H\mathbf{X}\_{n\Delta t\_{\rm obs}} + \boldsymbol{\nu}\_{n}, \qquad n = 1, \ldots, N,\tag{41}$$

with measurement errors $\nu_n \sim \mathcal{N}(0, 1)$ and forward map $H = (1 \;\; 0 \;\; 0) \in \mathbb{R}^{1 \times 3}$. We use $\Delta t_{\mathrm{obs}} \in \{0.05, 0.1, 0.12\}$ in our experiments and perform $N = 20{,}000$ assimilation cycles.

The initial ensemble $\{X_0^{(i)}\}_{i=1}^{M}$ is drawn from the Gaussian distribution with mean $X_0^{\dagger}$ and covariance matrix $0.01I$. We employ multiplicative ensemble inflation, which amounts to replacing the Lorenz-63 dynamics by

$$\frac{\mathrm{d}}{\mathrm{d}t}X_t^{(i)} = f(X_t^{(i)}) + \sigma_k(X_t^{(i)} - \hat{\mu}_t), \qquad i = 1, \ldots, M,\tag{42}$$

with inflation factors

$$
\sigma\_k = 0.025k, \qquad k = 0, \ldots, 9. \tag{43}
$$

Here $\hat{\mu}_t$ denotes the empirical mean of the ensemble $\{X_t^{(i)}\}_{i=1}^{M}$. These equations are combined with the augmented evolution equations (30) and solved numerically with step-size $\Delta t = 0.005$ and ensemble sizes $M \in \{5, 10, 15\}$.
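The inflated dynamics (42) with the drift (39) can be sketched as follows. The code only evaluates the right-hand side for every ensemble member; the control terms from (30) and the time integrator are omitted, and the function names and the toy ensemble are illustrative assumptions.

```python
A, B, C = 10.0, 28.0, 8.0 / 3.0


def lorenz63(state):
    """Lorenz-63 drift (39) for a state (x, y, z)."""
    x, y, z = state
    return (A * (y - x), x * (B - z) - y, x * y - C * z)


def inflated_rhs(ensemble, sigma_k):
    """Right-hand side of (42) for every ensemble member."""
    M = len(ensemble)
    mean = [sum(member[d] for member in ensemble) / M for d in range(3)]
    rhs = []
    for member in ensemble:
        f = lorenz63(member)
        rhs.append(tuple(f[d] + sigma_k * (member[d] - mean[d]) for d in range(3)))
    return rhs


inflation_factors = [0.025 * k for k in range(10)]  # the values in (43)

# Toy ensemble of two members, evaluated without and with inflation.
toy_ensemble = [(1.0, 2.0, 3.0), (3.0, 2.0, 1.0)]
plain = inflated_rhs(toy_ensemble, inflation_factors[0])
inflated = inflated_rhs(toy_ensemble, inflation_factors[-1])
```

Since the inflation term is proportional to the deviation from the ensemble mean, it leaves the mean tendency unchanged and only spreads the members apart.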

We report the resulting root mean square errors

$$\text{RMSE} = \sqrt{\frac{1}{3N} \sum_{n=1}^{N} \|\hat{\mu}_{n\Delta t_{\mathrm{obs}}} - X_{n\Delta t_{\mathrm{obs}}}^{\dagger}\|^2},\tag{44}$$

which are computed for each ensemble size $M$, observation interval $\Delta t_{\mathrm{obs}}$ and inflation factor $\sigma_k$. The results are displayed in Table 1, where the smallest RMSE over the range of inflation factors $\{\sigma_k\}_{k=0}^{9}$ is stated for each $M$ and $\Delta t_{\mathrm{obs}}$. We also state the corresponding RMSEs from a standard ensemble square root filter implementation [2, 8]. We find that the proposed homotopy approach outperforms the ensemble square root filter in terms of RMSE in all settings considered. The improvements increase for increasing observation intervals $\Delta t_{\mathrm{obs}}$. The homotopy approach also appears less sensitive to the ensemble size $M$.
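The error measure (44) is straightforward to compute once the ensemble means at observation times and the reference states have been stored. A minimal sketch, with hypothetical argument names and toy data:

```python
import math


def rmse(analysis_means, reference_states):
    """Time-averaged RMSE (44) over N assimilation cycles of a 3D state."""
    N = len(analysis_means)
    total = 0.0
    for mu, ref in zip(analysis_means, reference_states):
        total += sum((a - b) ** 2 for a, b in zip(mu, ref))
    return math.sqrt(total / (3 * N))


# Toy data: a constant offset of one in every component over four cycles.
means = [(1.0, 1.0, 1.0)] * 4
truth = [(0.0, 0.0, 0.0)] * 4
error = rmse(means, truth)
```

The factor $3N$ normalises by both the number of cycles and the state dimension, so a uniform unit offset yields an RMSE of one.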

We close this example by pointing out that less of an improvement could be expected for a fully observed Lorenz-63 system. The proposed homotopy approach seems particularly effective in guiding the unobserved solution components to regions of high posterior probability. See also the example from Sect. 6.4.

**Table 1** RMSE for both a standard ensemble square root filter (ESRF) implementation and our homotopy approach in terms of ensemble sizes $M \in \{5, 10, 15\}$ and observation intervals $\Delta t_{\mathrm{obs}} \in \{0.05, 0.1, 0.12\}$. The homotopy based data assimilation method leads to significantly reduced RMSEs in all settings


## **7 Conclusions**

Devising alternative proposal densities has a long history in the context of sequential data assimilation and filtering. Here we have explored a computationally tractable approach which combines the concept of Schrödinger bridges with a rather straightforward homotopy approach. A further key ingredient is the approximate solution of the arising PDEs in terms of a constant gain approximation, which is also widely used within the EnKF community. Numerical examples indicate that the approach is viable and can overcome limitations of both standard sequential Monte Carlo as well as standard EnKF methods. This has been demonstrated for single assimilation steps as well as long-time data assimilation using the chaotic Lorenz-63 model with only the first component observed infrequently. It remains to be seen how the proposed methods behave for high dimensional stochastic processes.

**Acknowledgments** This research has been funded by the Deutsche Forschungsgemeinschaft (DFG)- Project-ID 318763901 - SFB1294. We thank Nikolai Zaki for earlier work on the topic of this paper.

## **Appendix 1: Derivation of Control Term Equation**

Given an evolution equation

$$\mathrm{d}X_t = f(X_t)\,\mathrm{d}t + \sqrt{2\sigma}\,\mathrm{d}W_t \tag{45}$$

we obtain the Fokker–Planck equation

$$
\partial_t \pi_t = -\nabla \cdot \left( \pi_t \left( f - \sigma \nabla \log \pi_t \right) \right) . \tag{46}
$$

Now we modify (45) by an additional drift term, i.e.

$$\mathrm{d}X_t^{\mathrm{h}} = f(X_t^{\mathrm{h}})\,\mathrm{d}t + \tilde{g}_t(X_t^{\mathrm{h}})\,\mathrm{d}t + \sqrt{2\sigma}\,\mathrm{d}W_t \tag{47}$$

with $\tilde{g}_t : \mathbb{R}^{d_x} \to \mathbb{R}^{d_x}$. In that case, the Fokker–Planck equation for the modified dynamics reads

$$\partial_t \pi_t^{\mathrm{h}} = -\nabla \cdot \left( \pi_t^{\mathrm{h}} \left( f + \tilde{g}_t - \sigma \nabla \log \pi_t^{\mathrm{h}} \right) \right) = -\nabla \cdot \left( \pi_t^{\mathrm{h}} \left( f - \sigma \nabla \log \pi_t^{\mathrm{h}} \right) \right) - \nabla \cdot (\pi_t^{\mathrm{h}} \tilde{g}_t).\tag{48}$$

We can find $\tilde{g}_t$ in terms of known quantities as follows: we begin by taking the derivative of $\pi_t^{\mathrm{h}}$ with respect to time:

$$
\partial_t \pi_t^{\mathrm{h}} = -\pi_t^{\mathrm{h}} \left( \frac{\dot{Z}_t}{Z_t} + \frac{L}{T} \right) + \frac{1}{Z_t} e^{-\frac{t}{T}L} \partial_t \pi_t. \tag{49}
$$

Next we substitute (46) for $\partial_t \pi_t$ and use $\pi_t = Z_t e^{\frac{t}{T}L} \pi_t^{\mathrm{h}}$:

$$\partial_t \pi_t^{\mathrm{h}} = -\pi_t^{\mathrm{h}} \left( \frac{\dot{Z}_t}{Z_t} + \frac{L}{T} \right) - Z_t^{-1} e^{-\frac{t}{T}L} \nabla \cdot \left( \pi_t \left( f - \sigma \nabla \log \pi_t \right) \right) \tag{50a}$$

$$= -\pi_t^{\mathrm{h}} \left( \frac{\dot{Z}_t}{Z_t} + \frac{L}{T} \right) - Z_t^{-1} e^{-\frac{t}{T}L} \nabla \cdot \left( Z_t e^{\frac{t}{T}L} \pi_t^{\mathrm{h}} \left( f - \sigma \nabla \log \left( Z_t e^{\frac{t}{T}L} \pi_t^{\mathrm{h}} \right) \right) \right) \tag{50b}$$

$$= -\pi_t^{\mathrm{h}} \left( \frac{\dot{Z}_t}{Z_t} + \frac{L}{T} \right) - e^{-\frac{t}{T}L} \nabla \cdot \left( e^{\frac{t}{T}L} \pi_t^{\mathrm{h}} \left( f - \sigma \frac{t}{T} \nabla L - \sigma \nabla \log \pi_t^{\mathrm{h}} \right) \right) \tag{50c}$$

$$= -\pi_t^{\mathrm{h}} \left( \frac{\dot{Z}_t}{Z_t} + \frac{L}{T} \right) - \nabla \cdot \left( \pi_t^{\mathrm{h}} \left( f - \sigma \frac{t}{T} \nabla L - \sigma \nabla \log \pi_t^{\mathrm{h}} \right) \right) \tag{50d}$$

$$- \frac{t}{T}\pi_t^{\mathrm{h}}\nabla L \cdot \left(f - \sigma\frac{t}{T}\nabla L - \sigma\nabla\log\pi_t^{\mathrm{h}}\right).\tag{50e}$$

Comparing with (48) it follows that we require

$$\nabla \cdot (\pi_t^{\mathrm{h}} \tilde{g}_t) = \frac{1}{T} \pi_t^{\mathrm{h}} \left( L + t \nabla L \cdot \left( f - \sigma \frac{t}{T} \nabla L - \sigma \nabla \log \pi_t^{\mathrm{h}} \right) \right) + \pi_t^{\mathrm{h}} \frac{\dot{Z}_t}{Z_t} - \nabla \cdot \left( \pi_t^{\mathrm{h}} \sigma \frac{t}{T} \nabla L \right). \tag{51}$$

For the $\dot{Z}_t$ term we have

$$\frac{\dot{Z}_t}{Z_t} = \frac{1}{Z_t} \int \partial_t \left( e^{-\frac{t}{T} L(x)} \pi_t(x) \right) \mathrm{d}x \tag{52a}$$

$$= \frac{1}{Z_t} \int \left( -\frac{L}{T} e^{-\frac{t}{T}L(x)} \pi_t(x) + e^{-\frac{t}{T}L(x)} \partial_t \pi_t(x) \right) \mathrm{d}x \tag{52b}$$

$$=-\frac{1}{T}\mathbb{E}L - \frac{1}{Z_t} \int e^{-\frac{t}{T}L(x)} \nabla \cdot \left(\pi_t \left(f - \sigma \nabla \log \pi_t\right)\right) \mathrm{d}x \tag{52c}$$

$$=-\frac{1}{T}\mathbb{E}L - \frac{1}{Z_t} \int \frac{t}{T} e^{-\frac{t}{T}L(x)} \pi_t \nabla L \cdot (f - \sigma \nabla \log \pi_t) \, \mathrm{d}x \tag{52d}$$

$$= -\frac{1}{T} \mathbb{E}L - \frac{t}{T} \mathbb{E}\nabla L \cdot (f - \sigma \nabla \log \pi_t) \tag{52e}$$

$$=-\frac{1}{T}\mathbb{E}L-\frac{t}{T}\mathbb{E}\nabla L \cdot \left(f-\sigma\nabla\log\pi_t^{\mathrm{h}}-\sigma\frac{t}{T}\nabla L\right),\tag{52f}$$

where the third equality follows from integration by parts and the expected value is taken with respect to $\pi_t^{\mathrm{h}}$. We finally note that

$$
\tilde{g}_t(x) = g_t(x) - \sigma \frac{t}{T} \nabla L(x).\tag{53}
$$

# **Appendix 2: Ensemble Kalman Filter Approximations**

We provide details on the derivation of the EnKF-like approximation (22) to the controlled mean field equation (10) and the various simplifications that arise from assuming a linear forward map.

We first recall that a continuous time formulation of the EnKF for a generic likelihood function

$$L(\mathbf{x}) = \frac{1}{2} (\hat{h}(\mathbf{x}) - \mathbf{y}\_T)^\top \hat{R}^{-1} (\hat{h}(\mathbf{x}) - \mathbf{y}\_T) \tag{54}$$

is provided by

$$\frac{\mathrm{d}}{\mathrm{d}t}X\_{t} = -\Sigma\_{t}^{x\hat{h}}\hat{R}^{-1}\left(\frac{1}{2}\left(\hat{h}(X\_{t}) + \pi\_{t}[\hat{h}]\right) - \mathbf{y}\_{T}\right). \tag{55}$$

Here *π*<sub>*t*</sub> denotes the law of *X*<sub>*t*</sub>, *π*<sub>*t*</sub>[*g*] the expectation value of a function *g* under *π*<sub>*t*</sub>, and

$$
\Sigma\_{t}^{x\hat{h}} = \pi\_{t} \left[ (\mathbf{x} - \pi\_{t}[\mathbf{x}]) (\hat{h} - \pi\_{t}[\hat{h}])^{\top} \right] \tag{56}
$$

is the covariance matrix between the state *x* and the forward map *ĥ*.
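As an illustration only, the right-hand side of (55) can be approximated with a finite ensemble standing in for *π*<sub>*t*</sub>. The following minimal Python sketch assumes a scalar state and a scalar forward map; the function name `enkf_rhs`, its arguments and the 1/*N* normalisation of the empirical covariance are choices made here and are not taken from the chapter.

```python
def enkf_rhs(ensemble, h_hat, y, r_hat):
    """Ensemble approximation of the right-hand side of (55), scalar case.

    ensemble : list of floats, samples of X_t (stand-in for the law pi_t)
    h_hat    : callable, scalar forward map
    y        : float, observation y_T
    r_hat    : float, scalar error variance (plays the role of R-hat)
    """
    n = len(ensemble)
    hx = [h_hat(x) for x in ensemble]
    x_mean = sum(ensemble) / n
    h_mean = sum(hx) / n
    # Empirical counterpart of (56): covariance between state and forward map.
    cov_xh = sum((x - x_mean) * (h - h_mean) for x, h in zip(ensemble, hx)) / n
    # The innovation averages the member prediction and the mean prediction.
    return [-cov_xh / r_hat * (0.5 * (h + h_mean) - y) for h in hx]
```

With the three-member ensemble {0, 1, 2}, the identity forward map, observation 1 and unit error variance, the two outer members are pulled towards the observation while the central member does not move.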

Formal application of this approach to the two contributions to the likelihood function (21) leads to (22). More precisely, the first term leads to *ĥ* = *h* and

$$
\hat{R} = \frac{t + \Delta t}{2\Delta t \, T} R \tag{57}
$$

while the second term results in *ĥ* = *h̃* and

$$
\hat{R} = -\frac{t}{\Delta t T} R.\tag{58}
$$

The EnKF makes use of statistical linearization

$$
\Sigma\_{t}^{xx} \pi\_{t}[\nabla h] = \Sigma\_{t}^{xh} \tag{59}
$$

which holds provided *π*<sub>*t*</sub> is Gaussian or *h* is linear; a result known as Stein's identity. The identity can also be used to approximate derivatives in a (weakly) non-Gaussian setting, giving rise to

$$
\nabla h(\mathbf{x}) \approx (\Sigma\_t^{\mathbf{x}\mathbf{x}})^{-1} \Sigma\_t^{\mathbf{x}h}.\tag{60}
$$
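For a scalar state, the approximation (60) reduces to the ratio of an empirical covariance to an empirical variance. The sketch below is an illustration under stated assumptions (scalar case, 1/*N* normalisation, hypothetical function name `statistical_slope`) and is not part of the original derivation.

```python
def statistical_slope(ensemble, h):
    """Estimate the derivative of h via (60) for a scalar state."""
    n = len(ensemble)
    hx = [h(x) for x in ensemble]
    x_mean = sum(ensemble) / n
    h_mean = sum(hx) / n
    var_x = sum((x - x_mean) ** 2 for x in ensemble) / n
    cov_xh = sum((x - x_mean) * (v - h_mean) for x, v in zip(ensemble, hx)) / n
    # Scalar version of (Sigma^xx)^{-1} Sigma^xh.
    return cov_xh / var_x
```

For a linear map the ratio returns the slope exactly. For *h*(*x*) = *x*² and an ensemble that is symmetric about its mean *m*, it returns 2*m*, i.e. the ensemble average of the derivative, as the identity (59) suggests.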

We also recall the robust time-stepping method

$$X\_{t\_{n+1}} - X\_{t\_{n}} = -\Delta t\, \Sigma\_{t\_{n}}^{x\hat{h}} \left(\Delta t\, \Sigma\_{t\_{n}}^{\hat{h}\hat{h}} + R\right)^{-1} \left(\frac{1}{2}\left(\hat{h}(X\_{t\_{n}}) + \pi\_{t\_{n}}[\hat{h}]\right) - \mathbf{y}\_{T}\right), \tag{61}$$

which again can be adjusted appropriately to (22).
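A scalar version of the update (61) may help to see why the scheme is robust: the gain Δ*t* Σ<sup>*xĥ*</sup> / (Δ*t* Σ<sup>*ĥĥ*</sup> + *R*) stays bounded however large Δ*t* is. The code is a sketch assuming a scalar state and observation and empirical moments with 1/*N* normalisation; the name `enkf_step` is hypothetical.

```python
def enkf_step(ensemble, h_hat, y, r, dt):
    """One step of the time-stepping (61) for a scalar ensemble."""
    n = len(ensemble)
    hx = [h_hat(x) for x in ensemble]
    x_mean = sum(ensemble) / n
    h_mean = sum(hx) / n
    cov_xh = sum((x - x_mean) * (h - h_mean) for x, h in zip(ensemble, hx)) / n
    var_h = sum((h - h_mean) ** 2 for h in hx) / n
    # The denominator grows with dt, so the gain tends to cov_xh / var_h
    # instead of blowing up for large steps.
    gain = dt * cov_xh / (dt * var_h + r)
    return [x - gain * (0.5 * (h + h_mean) - y) for x, h in zip(ensemble, hx)]
```

For the ensemble {0, 1, 2}, identity forward map, observation 1, *R* = 1 and Δ*t* = 0.5, the gain equals 0.25 and the ensemble contracts to {0.125, 1, 1.875}. Even for a very large Δ*t* the gain only approaches 1.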

We now assume a linear forward map, that is *h*(*x*) = *Hx*, and discuss the simplifications that result in the computation of (22). Note that

$$
\tilde{h}(\mathbf{x}) = H\mathbf{x} - \Delta t H f(\mathbf{x}) + \Delta t \frac{\sigma t}{T} H H^\top \mathbf{R}^{-1} H \mathbf{x}. \tag{62}
$$

Hence the covariance matrix Σ<sub>*t*</sub><sup>*xh̃*</sup> can be rewritten as

$$
\Sigma\_t^{x \tilde{h}} = \Sigma\_t^{xx} H^\top - \Delta t\, \Sigma\_t^{xf} H^\top + \Delta t \frac{\sigma t}{T} \Sigma\_t^{xx} H^\top R^{-1} H H^\top \tag{63}
$$

and (22) simplifies to

$$\hat{\mathbf{g}}\_{t}^{\mathrm{KF}}(\mathbf{x}) = -\frac{1}{T} \Sigma\_{t}^{xx} H^{\top} R^{-1} \left( \frac{1}{2} \left( H\mathbf{x} + H\mu\_{t}^{\mathrm{h}} \right) - \mathbf{y}\_{T} \right) \tag{64a}$$

$$-\frac{t}{T} \Sigma\_{t}^{xf} H^{\top} R^{-1} \left(\frac{1}{2} \left(H\mathbf{x} + H\mu\_{t}^{\mathrm{h}}\right) - \mathbf{y}\_{T}\right) \tag{64b}$$

$$+\frac{\sigma t^2}{T^2} \Sigma\_t^{xx} H^\top R^{-1} H H^\top R^{-1} \left(\frac{1}{2} \left(H\mathbf{x} + H\mu\_t^{\mathrm{h}}\right) - \mathbf{y}\_T\right) \tag{64c}$$

$$-\frac{t}{2T} \Sigma\_{t}^{xx} H^{\top} R^{-1} H \left(f(\mathbf{x}) + \pi\_{t}^{\mathrm{h}}[f]\right) \tag{64d}$$

$$+\frac{\sigma t^{2}}{T^{2}}\Sigma\_{t}^{xx}H^{\top}R^{-1}HH^{\top}R^{-1}\left(\frac{1}{2}\left(H\mathbf{x}+H\mu\_{t}^{\mathrm{h}}\right)-\mathbf{y}\_{T}\right)+\mathcal{O}(\Delta t)\tag{64e}$$

$$= -C\_{t} H^{\top} R^{-1} \left(\frac{1}{2} \left(H\mathbf{x} + H\mu\_{t}^{\mathrm{h}}\right) - \mathbf{y}\_{T}\right) \tag{64f}$$

$$-\frac{t}{2T}\Sigma\_{t}^{xx}H^{\top}R^{-1}H\left(f(\mathbf{x})+\pi\_{t}^{\mathrm{h}}[f]\right)+\mathcal{O}(\Delta t)\tag{64g}$$

with

$$C\_{t} = \frac{1}{T} \Sigma\_{t}^{xx} + \frac{t}{T} \Sigma\_{t}^{xf} - \frac{2\sigma t^2}{T^2} \Sigma\_{t}^{xx} H^\top R^{-1} H. \tag{65}$$

Upon dropping terms of order 𝒪(Δ*t*) and using Σ<sub>*t*</sub><sup>*xx*</sup> = Σ<sub>*t*</sub><sup>h</sup>, we obtain (25) for *f* = 0 and (30) for *σ* = 0 as special cases. The mean field equations (32) also follow easily from *f*(*x*) = *Fx* + *b*.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Constrained Random Diffeomorphisms for Data Assimilation**

**Valentin Resseguier, Yicun Zhen, and Bertrand Chapron**

# **1 Introduction**

For ensemble-based data assimilation, there is a definite need for relevant ensemble sampling tools. Indeed, the quality and spread of these ensembles strongly affect the quality of the data assimilation (Dufée et al 2022), and, until recently, the so-called covariance inflation tools have mostly relied on unsuitable linear Gaussian frameworks (Tandeo et al 2020, Resseguier et al 2020a). A promising alternative is the generation of ensembles through a stochastic remapping of the physical space.

Consider a random mapping *T*, acting at every infinitesimal time step, such that *T*<sub>*t*</sub>(*x*) − *x* is interpreted as a "location perturbation" expressed by

$$T\_t(\mathbf{x}) = \mathbf{x} + a(t, \mathbf{x})\Delta t + e\_i(t, \mathbf{x})\Delta \eta\_i(t),\tag{1.1}$$

where *a*(*t*, *x*), *e*<sub>*i*</sub>(*t*, *x*) ∈ ℝ<sup>*n*</sup> and summation over the repeated index *i* is implied. In Eq. (1.1), *a*(*t*, *x*) controls deterministic location shifts, and the increments Δ*η*<sub>*i*</sub>(*t*) ∼ 𝒩(0, Δ*t*) drive random ones. At every time step, this random mapping *T* induces a perturbation of any tensor field *θ*(*t*) (Zhen et al 2022). For instance, one can perturb a differential form *θ*(*t*) by applying *θ*(*t*) → *T*<sub>*t*</sub><sup>∗</sup>*θ*(*t*), with *T*<sub>*t*</sub><sup>∗</sup> the associated pull-back operator.
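To make Eq. (1.1) concrete, the following Python sketch draws one realisation of the mapping on a set of points in one space dimension. It is only a sketch under assumptions made here: the function name `remap`, the callables `drift` and `noise_fields`, and the restriction to one dimension are illustrative and do not come from the chapter.

```python
import math
import random

def remap(points, t, dt, drift, noise_fields, rng):
    """Return one realisation of T_t(x) from (1.1) for each x in `points` (1D).

    drift        : callable a(t, x)
    noise_fields : list of callables e_i(t, x)
    rng          : random.Random instance supplying the Gaussian increments
    """
    # One increment per noise field, shared by every point: the perturbation
    # is random in time but spatially coherent through the fields e_i.
    d_eta = [rng.gauss(0.0, math.sqrt(dt)) for _ in noise_fields]
    return [
        x + drift(t, x) * dt
        + sum(e(t, x) * de for e, de in zip(noise_fields, d_eta))
        for x in points
    ]
```

The design choice worth noting is that the increments Δ*η*<sub>*i*</sub> are drawn once per time step and reused for all points, so that a spatially uniform *e*<sub>*i*</sub> shifts every point by the same random amount.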

V. Resseguier INRAE, OPAALE, Rennes, France

LAB, SCALIAN DS, Rennes, France e-mail: valentin.resseguier@scalian.com

Y. Zhen Department of Oceanography, Hohai University, Nanjing, China

B. Chapron Laboratoire d'Océanographie Physique et Spatiale (LOPS), Ifremer, Plouzané, France

A rigorous mathematical definition and calculation of *T*<sub>*t*</sub> and *T*<sub>*t*</sub><sup>∗</sup> can be obtained in terms of stochastic flows of diffeomorphisms and their Lie derivatives (e.g., Bethencourt De Leon 2021). Yet, to rapidly assess *T*<sub>*t*</sub><sup>∗</sup>*θ*, a Taylor expansion and Itô's lemma can be used. Given coordinates (*x*<sup>1</sup>, ..., *x*<sup>*n*</sup>), when *θ* is a differential *k*-form, it can be written as

$$\theta = \sum\_{i\_1 < \ldots < i\_k} f^{i\_1, \ldots, i\_k} d\mathbf{x}^{i\_1} \wedge \cdots \wedge d\mathbf{x}^{i\_k},\tag{1.2}$$

with *f* a semimartingale smooth in space. Then

$$T\_t^\*\theta = \sum\_{i\_1 < \ldots < i\_k} f^{i\_1,\ldots,i\_k}(T\_t(\mathbf{x}))\, T\_t^\*(d\mathbf{x}^{i\_1} \wedge \cdots \wedge d\mathbf{x}^{i\_k}),\tag{1.3}$$

leading to a compact expression

$$T\_t^\*\theta = \theta + \mathcal{M}(t,\theta)\Delta t + \mathcal{N}\_i(t,\theta)\Delta\eta\_i(t),\tag{1.4}$$

with some differential *k*-forms ℳ(*t*, *θ*) and 𝒩<sub>*i*</sub>(*t*, *θ*). Appendix 4 provides definitions of ℳ and 𝒩 (see Zhen et al 2022, Appendix B for a full proof).

Hereafter, we present and discuss the potential of this random mapping scheme, in particular how *θ* and the parameters *a* and *e*<sub>*i*</sub> can be prescribed to ensure that certain quantities, e.g. mass, vorticity, helicity or energy, are conserved.

Several examples of *T*<sub>*t*</sub><sup>∗</sup>*θ* can indeed be considered. For instance, when *θ* = *f* is a function (differential 0-form),

$$T\_t^\*\theta = f + \underbrace{\left(a^p \partial\_{x^p} f + \frac{1}{2} e\_i^p e\_i^q \partial\_{x^p} \partial\_{x^q} f\right)}\_{=:\mathcal{M}} \Delta t + \underbrace{e\_i^p \partial\_{x^p} f}\_{=:\mathcal{N}\_i} \Delta \eta\_i. \tag{1.5}$$

And when *θ* = *f* *dx*<sup>1</sup> ∧ ··· ∧ *dx*<sup>*n*</sup> (differential *n*-form), it then follows

$$\begin{aligned} T\_t^\* \theta = \Big[ f &+ \Big( (\partial\_{x^p} a^p + \frac{1}{2} J\_i) f + (a^p + e\_i^p \partial\_{x^q} e\_i^q) \partial\_{x^p} f + \frac{1}{2} e\_i^p e\_i^q \partial\_{x^p} \partial\_{x^q} f \Big) \Delta t \\ &+ (\partial\_{x^p} e\_i^p \, f + e\_i^p \partial\_{x^p} f) \Delta \eta\_i \Big] \, dx^1 \wedge \dots \wedge dx^n. \end{aligned} \tag{1.6}$$

Finally, when *θ* = *f*<sup>*j*</sup>*dx*<sup>*j*</sup> = Σ<sub>*j*=1</sub><sup>*n*</sup> *f*<sup>*j*</sup>*dx*<sup>*j*</sup> is a differential 1-form, we have

$$\begin{aligned} T\_t^\* \theta = \Big[ f^j &+ \Big( a^p \partial\_{x^p} f^j + \frac{1}{2} e\_i^p e\_i^q \partial\_{x^p} \partial\_{x^q} f^j + \partial\_{x^j} a^p f^p + \partial\_{x^j} e\_i^p \, e\_i^q \partial\_{x^q} f^p \Big) \Delta t \\ &+ (e\_i^p \partial\_{x^p} f^j + \partial\_{x^j} e\_i^p f^p) \Delta \eta\_i \Big] \, dx^j. \end{aligned} \tag{1.7}$$
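The 0-form expansion (1.5) can be checked numerically in a simple setting. Taking the expectation of (1.5) removes the Δ*η*<sub>*i*</sub> term, so that E[*f*(*T*<sub>*t*</sub>(*x*))] should equal *f* + (*a f*′ + ½ *e*² *f*″)Δ*t* up to *o*(Δ*t*). The sketch below assumes one space dimension, a single noise term and constant *a* and *e*; the function names are hypothetical, and the expectation is computed with a three-point Gauss-Hermite rule, which is exact for polynomial *f* up to degree five.

```python
import math

def expected_pullback(f, x, a, e, dt):
    """E[f(T_t(x))] for T_t(x) = x + a*dt + e*d_eta with d_eta ~ N(0, dt)."""
    s = math.sqrt(3.0 * dt)      # Gauss-Hermite nodes are 0 and +/- sqrt(3*dt)
    drifted = x + a * dt
    # Weights 2/3, 1/6, 1/6 of the three-point rule for a Gaussian variable.
    return (2.0 * f(drifted) + 0.5 * f(drifted + e * s) + 0.5 * f(drifted - e * s)) / 3.0

def ito_prediction(f, df, d2f, x, a, e, dt):
    """Mean of the right-hand side of (1.5): f + (a f' + e^2 f'' / 2) dt."""
    return f(x) + (a * df(x) + 0.5 * e * e * d2f(x)) * dt
```

For *f*(*x*) = *x*², the two quantities differ by exactly *a*²Δ*t*², which is indeed *o*(Δ*t*).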

## **2 Induced Stochastic PDE**

From the expression of *T*<sub>*t*</sub><sup>∗</sup>*θ*, an SPDE is derived from an original PDE, when *θ* is a differential form. Suppose *S*<sup>*d*</sup> is the full state variable of the deterministic dynamical system:

$$\frac{\partial S^d}{\partial t} = g(S^d). \tag{2.1}$$

Let *f*<sup>*d*</sup> be a component or a collection of components of *S*<sup>*d*</sup>. We then associate *f*<sup>*d*</sup> with a differential form *θ*<sup>*d*</sup>, i.e. there is an invertible map ℱ from the space of *f*<sup>*d*</sup> to the space of *θ*<sup>*d*</sup>, such that ℱ(*f*<sup>*d*</sup>) = *θ*<sup>*d*</sup>. Typically, if *f*<sup>*d*</sup> is a tracer, it is often associated with the 0-form *θ*<sup>*d*</sup> = *f*<sup>*d*</sup>. If *f*<sup>*d*</sup> is the density *ρ*<sup>*d*</sup>, we might associate the *n*-form *θ*<sup>*d*</sup> = *ρ*<sup>*d*</sup> *dx*<sup>*i*<sub>1</sub></sup> ∧ ··· ∧ *dx*<sup>*i*<sub>*n*</sub></sup>. More generally, *θ*<sup>*d*</sup>, and thus ℱ, can be prescribed to ensure that certain quantities, such as mass, energy or circulation, are conserved (Zhen et al 2022, section 3.3). Consider the propagation equation for *f*<sup>*d*</sup>

$$\mathrm{d}f^d = g^f(S^d)\,\mathrm{d}t.\tag{2.2}$$

It implies a propagation equation for *θ*<sup>*d*</sup>:

$$\mathrm{d}\theta^d = g^{\theta}(S^d)\,\mathrm{d}t.\tag{2.3}$$

We will now stochastically perturb the above deterministic dynamics. Let *S*, *f* and *θ* denote the semimartingale solutions of these randomized dynamics. The proposed discrete-time perturbation at each time step consists of the following two steps:

$$\tilde{\theta}(t+\Delta t) = \theta(t) + g^{\theta}(S(t))\Delta t,\tag{2.4}$$

$$\theta(t+\Delta t) = T\_t^\* \tilde{\theta}(t+\Delta t), \tag{2.5}$$

with

$$T\_t^\* \tilde{\theta}(t+\Delta t) = \tilde{\theta}(t+\Delta t) + \mathcal{M}(t, \tilde{\theta}(t+\Delta t))\Delta t + \mathcal{N}\_i(t, \tilde{\theta}(t+\Delta t))\Delta \eta\_i(t) + o(\Delta t)$$

for the associated differential forms ℳ(*t*, *θ̃*) and 𝒩<sub>*i*</sub>(*t*, *θ̃*).

Step (2.4) is deterministic, so *θ̃*(*t* + Δ*t*) − *θ*(*t*) scales as *O*(Δ*t*): there is no noise term to induce a scaling in *O*(√Δ*t*). Therefore, it can be assumed that there exists *C* > 0 such that ‖ℳ(*t*, *θ̃*(*t* + Δ*t*)) − ℳ(*t*, *θ*(*t*))‖ < *C*Δ*t* and ‖𝒩<sub>*i*</sub>(*t*, *θ̃*(*t* + Δ*t*)) − 𝒩<sub>*i*</sub>(*t*, *θ*(*t*))‖ < *C*Δ*t*, for Δ*t* small enough. Accordingly,

$$\begin{aligned} T\_t^\* \tilde{\theta}(t + \Delta t) &= \tilde{\theta}(t + \Delta t) + \left( \mathcal{M}(t, \theta(t)) + \mathcal{O}(\Delta t) \right) \Delta t \\ &\quad + \left( \mathcal{N}\_i(t, \theta(t)) + \mathcal{O}(\Delta t) \right) \Delta \eta\_i(t) + o(\Delta t) \\ &= \tilde{\theta}(t + \Delta t) + \mathcal{M}(t, \theta(t)) \Delta t + \mathcal{N}\_i(t, \theta(t)) \Delta \eta\_i(t) + o(\Delta t). \end{aligned} \tag{2.6}$$

Therefore,

$$\theta(t+\Delta t) = \theta(t) + g^{\theta}(S(t))\Delta t + \mathcal{M}(t,\theta(t))\Delta t + \mathcal{N}\_i(t,\theta(t))\Delta \eta\_i + o(\Delta t). \tag{2.7}$$

It suggests the following stochastic propagation equation for *θ*:

$$\mathrm{d}\theta = g^{\theta}(S)\,\mathrm{d}t + \mathcal{M}(t,\theta)\,\mathrm{d}t + \mathcal{N}\_i(t,\theta)\,\mathrm{d}\eta\_i. \tag{2.8}$$

Since there is a one-to-one correspondence between *θ* and *f*, Eq. (2.8) also suggests a stochastic propagation equation for *f*, which can be written as

$$\mathrm{d}f = g^f(S)\,\mathrm{d}t + \mathcal{M}^f(f)\,\mathrm{d}t + \mathcal{N}\_i^f(f)\,\mathrm{d}\eta\_i. \tag{2.9}$$

We denote the additional terms in Eq. (2.9) by

$$\mathrm{d}\_{s}f := \mathcal{M}^{f}(f)\,\mathrm{d}t + \mathcal{N}\_{i}^{f}(f)\,\mathrm{d}\eta\_{i}.\tag{2.10}$$

Then Eq. (2.9) can be written as:

$$\mathrm{d}f = g^f(S)\,\mathrm{d}t + \mathrm{d}\_{s}f. \tag{2.11}$$
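The two-step construction (2.4)-(2.5) can be sketched for a 0-form on a periodic one-dimensional grid: an explicit Euler forecast, followed by the pull-back *f̃*(*T*<sub>*t*</sub>(*x*)) evaluated by interpolation. Everything specific in the code is an assumption made for illustration (uniform periodic grid, linear interpolation, time-independent *a* and *e*<sub>*i*</sub>, the names `interp_periodic` and `perturbed_step`); it is not the authors' implementation.

```python
import math
import random

def interp_periodic(values, x, length):
    """Linear interpolation of grid values at position x on a periodic domain."""
    n = len(values)
    s = (x % length) / length * n
    k = int(math.floor(s))
    w = s - k
    return (1.0 - w) * values[k % n] + w * values[(k + 1) % n]

def perturbed_step(f, g, a, e_list, dt, length, rng):
    """One step of (2.4)-(2.5) for a 0-form sampled on a uniform periodic grid.

    f      : list of grid values of the field
    g      : callable returning the deterministic tendency for the list f
    a      : callable a(x), drift of the remapping
    e_list : list of callables e_i(x), noise fields of the remapping
    """
    n = len(f)
    grid = [length * k / n for k in range(n)]
    f_tilde = [fk + gk * dt for fk, gk in zip(f, g(f))]         # forecast (2.4)
    d_eta = [rng.gauss(0.0, math.sqrt(dt)) for _ in e_list]
    out = []
    for x in grid:
        tx = x + a(x) * dt + sum(e(x) * de for e, de in zip(e_list, d_eta))  # (1.1)
        out.append(interp_periodic(f_tilde, tx, length))        # pull-back (2.5)
    return out
```

Two sanity checks follow directly from the construction: a constant field is left unchanged by any remapping, and a noise-free drift of exactly one grid cell shifts the field by one index.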

## **3 Comparison with Other Perturbation Schemes**

The perturbation d<sub>*s*</sub>*f* obtained above is completely determined by *T*<sub>*t*</sub><sup>∗</sup>*θ*, but is not directly related to the original dynamics, Eq. (2.2). Once the expression of *T* in Eq. (1.1) and the choice of the differential form *θ*<sup>*d*</sup> are determined, the perturbation term d<sub>*s*</sub>*f* is prescribed. However, the choice of *θ*<sup>*d*</sup> is up to the user, and may then be related to the original dynamics.

In the following, we thus demonstrate that both the stochastic advection by Lie transport (SALT) equation (Holm 2015) and the location uncertainty (LU) equation (Mémin 2014, Resseguier et al 2017; 2020b) can be properly recovered using the proposed perturbation scheme.

## *3.1 Comparison with the LU Equations*

The Reynolds transport theorem is central to the LU setting. It expresses an integral conservation equation for the transport of any conserved quantity within a fluid, connected to its corresponding differential equation. The link between the proposed perturbation approach and the LU formulation can therefore be expected to involve differential *n*-forms. But first, we consider a key ingredient of LU: the stochastic material derivative of functions (differential 0-forms).

#### **3.1.1 0-Forms in the LU Framework**

Dropping the forcing terms, the LU equation for compressible and incompressible flows reads (Resseguier et al 2017)

$$
\partial\_t f + \boldsymbol{w}^\star \cdot \nabla f = \nabla \cdot \left(\frac{1}{2} \boldsymbol{a} \nabla f\right) - \boldsymbol{\sigma} \dot{\boldsymbol{B}} \cdot \nabla f,\tag{3.1}
$$

$$\boldsymbol{w}^{\star} = \boldsymbol{w} - \frac{1}{2} (\nabla \cdot \boldsymbol{a})^{\top} + \boldsymbol{\sigma} \left(\nabla \cdot \boldsymbol{\sigma}\right)^{\top},\tag{3.2}$$

where ***a*** = ***σ***<sub>•*k*</sub>***σ***<sub>•*k*</sub><sup>*T*</sup> and *f* can be any quantity that is assumed to be transported by the flow, i.e. *Df*/*Dt* = 0 where *D*/*Dt* is the Itô material derivative. For instance, *f* could be the velocity (dropping forces in the SPDE), the temperature, or the buoyancy.

Separating the terms of the SPDE related to the deterministic dynamics from the terms associated with the stochastic scheme, we obtain

$$\mathrm{d}^{\mathrm{LU}}f = g^f(S)\,\mathrm{d}t + \mathrm{d}\_s^{\mathrm{LU}}f,\tag{3.3}$$

where

$$g^f(S) = -\boldsymbol{w}\cdot\nabla f,\tag{3.4}$$

$$\mathrm{d}\_{s}^{\mathrm{LU}}f = -\left(\boldsymbol{w}^{\star} - \boldsymbol{w}\right) \cdot \nabla f \, \mathrm{d}t - \boldsymbol{\sigma} \, \mathrm{d}\boldsymbol{B} \cdot \nabla f + \nabla \cdot \left(\frac{1}{2} \boldsymbol{a} \nabla f\right) \mathrm{d}t. \tag{3.5}$$

Besides, from our proposed scheme applied to a 0-form *θ* = *f* (Eq. (1.5)), we obtain:

$$\mathrm{d}\_{s}f = \left(a^{p}\partial\_{x^{p}}f + \frac{1}{2}e\_{i}^{p}e\_{i}^{q}\partial\_{x^{p}}\partial\_{x^{q}}f\right)\mathrm{d}t + e\_{i}^{p}\partial\_{x^{p}}f\,\mathrm{d}\eta\_{i}.\tag{3.6}$$

To physically interpret this equation, we rewrite:

$$\frac{\mathrm{d}\_{s}f}{\mathrm{d}t} = -V^{p}\partial\_{x^{p}}f + \partial\_{x^{p}}\left(\left(\frac{1}{2}e\_{i}^{p}e\_{i}^{q}\right)\partial\_{x^{q}}f\right),\tag{3.7}$$

where

$$V^p = -a^p + \frac{1}{2} \partial\_{x^q} (e\_i^p e\_i^q) - e\_i^p \frac{\mathrm{d}\eta\_i}{\mathrm{d}t}.\tag{3.8}$$

Advection and diffusion terms are recognized. The matrix ½ *e*<sub>*i*</sub>*e*<sub>*i*</sub><sup>*T*</sup> is symmetric non-negative and represents a diffusion matrix. The *p*-th component of the advecting velocity, *V*<sup>*p*</sup>, is composed of the drift −*a*<sup>*p*</sup>, a correction ½ ∂<sub>*x*<sup>*q*</sup></sub>(*e*<sub>*i*</sub><sup>*p*</sup>*e*<sub>*i*</sub><sup>*q*</sup>), and a stochastic advecting velocity −*e*<sub>*i*</sub><sup>*p*</sup> d*η*<sub>*i*</sub>/d*t*.

Direct calculation shows that Eq. (3.5) coincides with Eq. (3.7) when ***a*** = ***σ***<sub>•*k*</sub>***σ***<sub>•*k*</sub><sup>*T*</sup> = *e*<sub>*i*</sub>*e*<sub>*i*</sub><sup>*T*</sup>, ***σḂ*** = −*e*<sub>*i*</sub> d*η*<sub>*i*</sub>/d*t* and

$$T\_t(\mathbf{x}) = \mathbf{x} + e\_i^q \partial\_{x^q} e\_i \Delta t + e\_i \Delta \eta\_i = \mathbf{x} - \boldsymbol{w}\_S^c \Delta t + (-\boldsymbol{w}\_S^c \Delta t - \boldsymbol{\sigma} \Delta \boldsymbol{B}), \tag{3.9}$$

where

$$\boldsymbol{w}\_{S}^{c} = -\frac{1}{2} (\boldsymbol{\sigma}\_{\bullet k} \cdot \nabla) \boldsymbol{\sigma}\_{\bullet k} = -\frac{1}{2} (\nabla \cdot \boldsymbol{a})^{\top} + \frac{1}{2} \boldsymbol{\sigma} (\nabla \cdot \boldsymbol{\sigma})^{\top}. \tag{3.10}$$

The LU equation can thus be derived by choosing *θ* = *f* and *T*<sub>*t*</sub> given by Eq. (3.9). Note that the term (−***w***<sub>*S*</sub><sup>*c*</sup>Δ*t* − ***σ***Δ***B***) = (½ *e*<sub>*i*</sub><sup>*q*</sup>∂<sub>*x*<sup>*q*</sup></sub>*e*<sub>*i*</sub>Δ*t* + *e*<sub>*i*</sub>Δ*η*<sub>*i*</sub>) is the Itô noise plus its Itô-to-Stratonovich correction. Hence, it corresponds to the Stratonovich noise *e*<sub>*i*</sub> ∘ d*η*<sub>*i*</sub> of the flow associated with *T*<sub>*t*</sub>. The additional drift −***w***<sub>*S*</sub><sup>*c*</sup>Δ*t* is different in nature. It is related to the advection correction ***w***<sub>*S*</sub><sup>*c*</sup> · ∇*f* in the LU setting. Indeed, in the LU framework, the Itô drift, ***w***, is seen as the resolved large-scale velocity. That is why, in this framework, the deterministic dynamics (3.4) involve the Itô drift, ***w***. This is also the reason why, under the LU derivation, the advected velocity is assumed to be given by the Itô drift, ***w***. It differs from the Stratonovich drift ***w***<sub>*S*</sub> = ***w*** + ***w***<sub>*S*</sub><sup>*c*</sup>, used as advected velocity in the SALT approach or in Mikulevicius and Rozovskii (2004) (where the Stratonovich drift is denoted *u*). Interested readers are referred to Resseguier et al (2020b, Appendix A) for a discussion of these assumptions and for the complete table of SALT-LU notation correspondences. Note however that in all these approaches, the advecting velocity is always the Stratonovich drift. This can be seen, e.g., in the Stratonovich form of the LU equations, derived in (Resseguier 2017, Appendix 10.1) and (Resseguier et al 2020a, 6.1.3):

$$
\partial\_t f + \boldsymbol{w}\_S \cdot \nabla f = -\left(\boldsymbol{\sigma} \circ \dot{\boldsymbol{B}}\right) \cdot \nabla f,\tag{3.11}
$$

where ***σ*** ∘ ***Ḃ*** is the Stratonovich noise of the SPDE. Since the advecting velocity ***w***<sub>*S*</sub> and the resolved velocity ***w*** differ by a drift ***w***<sub>*S*</sub><sup>*c*</sup>, the term ***w***<sub>*S*</sub><sup>*c*</sup> · ∇*f* is interpreted as an advection correction, being part of the stochastic scheme (3.5). Accordingly, the remapping *T*<sub>*t*</sub> involves an additional drift −***w***<sub>*S*</sub><sup>*c*</sup>Δ*t*.

To also understand (3.9), the inverse flow can be considered:

$$T\_t^{-1}(\mathbf{x}) = \mathbf{x} - e\_i \Delta \eta\_i = \mathbf{x} + \boldsymbol{\sigma} \Delta \boldsymbol{B}. \tag{3.12}$$

If *T*<sub>*t*</sub> represents a necessary perturbation to match, at each time step, a true solution, *T*<sub>*t*</sub><sup>−1</sup> measures the difference, at each time step, between this true solution and a model forecast. Therefore, the LU equation can be derived using the proposed perturbation scheme, choosing *θ* = *f* and assuming that a true solution differs from a model forecast by a displacement prescribed by Eq. (3.12).
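The relation between (3.9) and (3.12) can be illustrated numerically in one dimension with a single noise field: composing the map of (3.9) with the approximate inverse of (3.12) should return the starting point up to *o*(Δ*t*). The sketch below makes several simplifying assumptions that are not in the chapter: one space dimension, one noise term, and the representative increment Δ*η* = √Δ*t*, for which Δ*η*² = Δ*t* holds exactly. All function names are hypothetical.

```python
import math

def forward_map(x, e, de, dt, d_eta):
    """T_t(x) = x + e(x) e'(x) dt + e(x) d_eta: Eq. (3.9) in one dimension."""
    return x + e(x) * de(x) * dt + e(x) * d_eta

def approximate_inverse(x, e, d_eta):
    """T_t^{-1}(x) ~ x - e(x) d_eta: Eq. (3.12) in one dimension."""
    return x - e(x) * d_eta

def round_trip_error(x, e, de, dt):
    """|T_t(T_t^{-1}(x)) - x| for the representative increment sqrt(dt)."""
    d_eta = math.sqrt(dt)
    y = approximate_inverse(x, e, d_eta)
    return abs(forward_map(y, e, de, dt, d_eta) - x)
```

For the linear noise field *e*(*x*) = 1 + *x*/2, the round-trip error scales as Δ*t*<sup>3/2</sup>: dividing Δ*t* by 100 divides the error by 1000. Without the drift *e e*′Δ*t* in the forward map, the error would only be of order Δ*t*.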

#### **3.1.2 n-Forms in the LU Framework**

The LU physical justification relies on a stochastic interpretation of fundamental conservation laws, typically conservation of extensive properties (i.e. integrals of functions over a spatial volume) like momentum, mass, matter and energy (Resseguier et al 2017). These extensive properties can be expressed by integrals of differential *n*-forms. For instance, the mass and the momentum are integrals of the differential *n*-forms *ρ* *dx*<sup>1</sup> ∧ ··· ∧ *dx*<sup>*n*</sup> and *ρ****w*** *dx*<sup>1</sup> ∧ ··· ∧ *dx*<sup>*n*</sup>, respectively. In the LU framework, a stochastic version of the Reynolds transport theorem (Resseguier et al 2017, Eq. (28)) is used to deal with these differential *n*-forms *θ* = *f* *dx*<sup>1</sup> ∧ ··· ∧ *dx*<sup>*n*</sup>. Assuming an integral conservation law (d/d*t*) ∫<sub>*V*(*t*)</sub> *f* = 0 on a spatial domain *V*(*t*) transported by the flow, this theorem leads to the following SPDE:

$$\frac{Df}{Dt} + \nabla \cdot (\boldsymbol{w}^{\star} + \boldsymbol{\sigma} \dot{\boldsymbol{B}}) f = \frac{d}{dt} \left\langle \int\_{0}^{t} D\_{t} f, \int\_{0}^{t} \nabla \cdot \boldsymbol{\sigma} \dot{\boldsymbol{B}} \right\rangle = (\nabla \cdot \boldsymbol{\sigma}\_{\bullet i}) (\nabla \cdot \boldsymbol{\sigma}\_{\bullet i})^{T} f,\tag{3.13}$$

where *D*/*Dt* denotes the Itô material derivative. Forcing terms are dropped for the sake of readability. This SPDE can be rewritten using the expression of that material derivative (Eq. (9) and (10) of Resseguier et al (2017)):

$$\partial\_t f + \nabla \cdot (\boldsymbol{w}\_S f) = \frac{1}{2} \nabla \cdot (\boldsymbol{a} \nabla f) + \frac{1}{2} \nabla \cdot (\boldsymbol{\sigma}\_{\bullet i} (\nabla \cdot \boldsymbol{\sigma}\_{\bullet i})^T f) - \nabla \cdot (\boldsymbol{\sigma} \dot{\boldsymbol{B}} f). \tag{3.14}$$

The original deterministic equation and stochastic perturbation correspond to

$$g^f(S) = -\nabla \cdot (\boldsymbol{w}f),\tag{3.15}$$

$$\mathrm{d}\_s^{\mathrm{LU}} f = \left(-\nabla \cdot (\boldsymbol{w}\_S^c f) + \frac{1}{2} \nabla \cdot (\boldsymbol{a} \nabla f) + \frac{1}{2} \nabla \cdot (\boldsymbol{\sigma}\_{\bullet i} (\nabla \cdot \boldsymbol{\sigma}\_{\bullet i})^T f)\right) \mathrm{d}t - \nabla \cdot (\boldsymbol{\sigma} \, \mathrm{d}\boldsymbol{B} f), \tag{3.16}$$

$$=-\nabla \cdot \left( \left(-\left(\frac{1}{2}\nabla \cdot \boldsymbol{a}\right)^T \mathrm{d}t + \boldsymbol{\sigma} \mathrm{d} \boldsymbol{B}\right)f \right) + \nabla \cdot \left(\frac{1}{2}\boldsymbol{a} \nabla f\right) \mathrm{d}t. \tag{3.17}$$

We can now compare these LU equations to our new stochastic scheme applied to the *n*-form *θ* = *f* *dx*<sup>1</sup> ∧ ··· ∧ *dx*<sup>*n*</sup> (Eq. (1.6)). This implies that

$$\begin{aligned} \mathrm{d}\_{s}f = {} & \left( (\partial\_{x^{p}}a^{p} + \frac{1}{2}J\_{i})f + (a^{p} + e\_{i}^{p}\partial\_{x^{q}}e\_{i}^{q})\partial\_{x^{p}}f + \frac{1}{2}e\_{i}^{p}e\_{i}^{q}\partial\_{x^{p}}\partial\_{x^{q}}f \right)\mathrm{d}t \\ & + (\partial\_{x^{p}}e\_{i}^{p}f + e\_{i}^{p}\partial\_{x^{p}}f)\,\mathrm{d}\eta\_{i}, \end{aligned} \tag{3.18}$$

where *J*<sub>*i*</sub> = ∂<sub>*x*<sup>*p*</sup></sub>*e*<sub>*i*</sub><sup>*p*</sup> ∂<sub>*x*<sup>*q*</sup></sub>*e*<sub>*i*</sub><sup>*q*</sup> − ∂<sub>*x*<sup>*p*</sup></sub>*e*<sub>*i*</sub><sup>*q*</sup> ∂<sub>*x*<sup>*q*</sup></sub>*e*<sub>*i*</sub><sup>*p*</sup>. Rewritten, it leads to:

$$\frac{\mathrm{d}\_{s}f}{\mathrm{d}t} = -\partial\_{x^{p}}\left(\tilde{V}^{p}f\right) + \partial\_{x^{p}}\left(\left(\frac{1}{2}e\_{i}^{p}e\_{i}^{q}\right)\partial\_{x^{q}}f\right),\tag{3.19}$$

where

$$\tilde{V}^{p} = V^{p} - (e\_{i}^{p}\partial\_{x^{q}}e\_{i}^{q}) = -a^{p} + \frac{1}{2}(\partial\_{x^{q}}e\_{i}^{p}\, e\_{i}^{q} - e\_{i}^{p}\partial\_{x^{q}}e\_{i}^{q}) - e\_{i}^{p}\frac{\mathrm{d}\eta\_{i}}{\mathrm{d}t}.\tag{3.20}$$

Again an advection-diffusion equation is recognized, but of a different nature. Indeed, as expected for an *n*-form, the PDE is similar to a density conservation equation. Moreover, the advecting drift is slightly different, to take into account the cross-correlations between *f*(*T*<sub>*t*</sub>(*x*)) and *T*<sub>*t*</sub><sup>∗</sup>(*dx*<sup>1</sup> ∧ ··· ∧ *dx*<sup>*n*</sup>).

Identifying ***a*** = ***σ***<sub>•*k*</sub>***σ***<sub>•*k*</sub><sup>*T*</sup> = *e*<sub>*i*</sub>*e*<sub>*i*</sub><sup>*T*</sup> and ***σḂ*** = −*e*<sub>*i*</sub> d*η*<sub>*i*</sub>/d*t*,

$$\tilde{V}^p = -a^p + \frac{1}{2} (\partial\_{x^q} e\_{i}^p\, e\_{i}^q - e\_{i}^p \partial\_{x^q} e\_{i}^q) - e\_{i}^p \frac{\mathrm{d} \eta\_i}{\mathrm{d}t} = \left[-\left(\frac{1}{2} \nabla \cdot \boldsymbol{a}\right)^T + \boldsymbol{\sigma} \dot{\boldsymbol{B}}\right]^p,\tag{3.21}$$

i.e.

$$a^p = \frac{1}{2}(\partial\_{x^q} e\_i^p\, e\_i^q - e\_i^p \partial\_{x^q} e\_i^q) + \frac{1}{2}\partial\_{x^q} (e\_i^p e\_i^q) = e\_i^q \partial\_{x^q} e\_i^p. \tag{3.22}$$

The remapping thus reads

$$T\_t(\mathbf{x}) = \mathbf{x} + e\_i^q \partial\_{x^q} e\_i \Delta t + e\_i \Delta \eta\_i = \mathbf{x} - \boldsymbol{w}\_S^c \Delta t + (-\boldsymbol{w}\_S^c \Delta t - \boldsymbol{\sigma} \Delta \boldsymbol{B}), \tag{3.23}$$

as already derived for differential 0-forms in the LU framework (Eq. (3.9)). Therefore, the proposed perturbation mapping also encompasses the LU framework for *n*-forms, and its capacity, given by the Reynolds transport theorem, to deal with extensive properties.

Moreover, for incompressible flows, the LU framework further imposes that

$$\begin{cases} \nabla \cdot \boldsymbol{\sigma} = 0, \\ \nabla \cdot \nabla \cdot \boldsymbol{a} = 0. \end{cases} \tag{3.24}$$

Translating it into our present notation, it reads as

$$\begin{cases} \partial\_{x^p} e\_i^p = 0, \text{ for each } i, \\ \partial\_{x^p} \partial\_{x^q} (e\_i^p e\_i^q) = 0. \end{cases} \tag{3.25}$$

A straightforward calculation shows that Eq. (3.24) is equivalent to *T*<sub>*t*</sub><sup>∗</sup>*θ* = *θ* for *θ* = *dx*<sup>1</sup> ∧ ··· ∧ *dx*<sup>*n*</sup>. Such a result is expected since the constraints of Eq. (3.24) are obtained from the LU density conservation.

## *3.2 The SALT Perturbation Scheme*

Holm (2015) derived the original SALT equation following a stochastically constrained variational principle *δS* = 0, for which

$$\begin{cases} S(u, q) = \int \ell(u, q)\, \mathrm{d}t, \\ \mathrm{d}q + \pounds\_{\mathrm{d}x\_t} q = 0, \end{cases} \tag{3.26}$$

where ℓ(*u*, *q*) is the Lagrangian of the system, £ is the Lie derivative, and *x*<sub>*t*</sub>(*x*) is defined by (using our notation)

$$\mathbf{x}\_{t}(\mathbf{x}) = \mathbf{x}\_{0}(\mathbf{x}) + \int\_{0}^{t} u(\mathbf{x}, s)\, \mathrm{d}s - \int\_{0}^{t} e\_{i}(\mathbf{x}) \circ \mathrm{d}\eta\_{i}(s),\tag{3.27}$$

in which *u* is the velocity vector field. The symbol ∘ means that the integral is defined in the Stratonovich sense, instead of the Itô sense. Hence, d*x*<sub>*t*</sub> = *u*(*x*, *t*)d*t* − *e*<sub>*i*</sub> ∘ d*η*<sub>*i*</sub> refers to an infinitesimal stochastic tangent field on the domain. We can express d*x*<sub>*t*</sub> = *T*<sub>*t*</sub>(*x*) − *x* + *u*d*t*. Note the difference between Itô's notation and Stratonovich's notation, i.e. *e*<sub>*i*</sub> ∘ d*η*<sub>*i*</sub> ≠ *e*<sub>*i*</sub>d*η*<sub>*i*</sub>. The initial expression of *T*<sub>*t*</sub> essentially follows Itô's notation. In this subsection, we therefore do not have *T*<sub>*t*</sub>(*x*) = *x* − *e*<sub>*i*</sub>Δ*η*<sub>*i*</sub>. Instead, it becomes *T*<sub>*t*</sub>(*x*) = *x* + ½ *e*<sub>*i*</sub><sup>*p*</sup>∂<sub>*x*<sup>*p*</sup></sub>*e*<sub>*i*</sub>Δ*t* − *e*<sub>*i*</sub>Δ*η*<sub>*i*</sub>.
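The gap between the two conventions can be seen on the simplest scalar example, the integral of *W* d*W* for a Brownian path *W*. In the sketch below (an illustration chosen here, not an example from the chapter; the function name is hypothetical), the Itô sum evaluates the integrand at the left end point of each step and the Stratonovich sum at the midpoint. Their difference is half the accumulated squared increments, i.e. a drift of order Δ*t* per step, which is the same mechanism that produces the correction term ½ *e*<sub>*i*</sub><sup>*p*</sup>∂<sub>*x*<sup>*p*</sup></sub>*e*<sub>*i*</sub>Δ*t* above.

```python
import math
import random

def ito_and_stratonovich_sums(n_steps, horizon, seed=0):
    """Discretise the integral of W dW over [0, horizon] in both conventions.

    Returns (ito, stratonovich, w_end, quadratic_variation).
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    w = 0.0
    ito = strat = quad_var = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        ito += w * dw                   # integrand at the left end point
        strat += (w + 0.5 * dw) * dw    # integrand at the midpoint
        quad_var += dw * dw
        w += dw
    return ito, strat, w, quad_var
```

By telescoping, the Stratonovich sum equals *W*<sub>*T*</sub>²/2 for every path, as ordinary calculus would suggest, while the Itô sum is smaller by half the quadratic variation, which is close to *T*/2 for a fine discretisation.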

In the second equation of Eq. (3.26), *q* is assumed to be a quantity advected by the flow. It can correspond to any differential form that is not uniquely determined by the velocity (since the SALT equation for the velocity is usually determined by the first equation of Eq. (3.26)). Holm (2015) evaluates the Lie derivative £<sub>d*x*<sub>*t*</sub></sub>*q* using Cartan's formula:

$$\pounds\_{\mathrm{d}x\_t} q = d(i\_{\mathrm{d}x\_t} q) + i\_{\mathrm{d}x\_t} dq. \tag{3.28}$$

This Lie derivative £<sub>d*x*<sub>*t*</sub></sub>*q* corresponds to *T*<sub>*t*</sub><sup>∗</sup>*q* − *q* + *f*<sup>*q*</sup>(*S*)d*t*, if we assume that the deterministic forecast of *q* is simply the advection of *q* by *u*. More generally, £<sub>d*x*<sub>*t*</sub>−*u*d*t*</sub>*q* = *T*<sub>*t*</sub><sup>∗</sup>*q* − *q*. Therefore, the SALT equation for *q* is the same as our perturbation for *q*. Note that Cartan's formula cannot be directly applied to calculate the Lie derivative if d*x*<sub>*t*</sub> is expressed in Itô's notation.

Within the SALT setting, the velocity *u* comes from the first equation of Eq. (3.26). In most cases, the velocity *u* is associated with the momentum, a differential 1-form m = *u*<sub>*j*</sub>*dx*<sup>*j*</sup> = *u*<sub>1</sub>*dx*<sup>1</sup> + ... + *u*<sub>*n*</sub>*dx*<sup>*n*</sup>. When the Lagrangian includes the kinetic energy, Holm (2015) observed that the stochastic noises contribute a term £<sub>d*x*<sub>*t*</sub></sub>*θ*, where *θ* is a differential 1-form related to the momentum 1-form. In particular, *θ* = m for the "Stratonovich stochastic Euler-Poincaré flow" example, and *θ* = m + *R*<sup>*j*</sup>*dx*<sup>*j*</sup> for the "Stochastic Euler-Boussinesq equations of a rotating stratified incompressible fluid".

As already pointed out, the operator £<sub>d*x*<sub>*t*</sub></sub> is closely related to *T*<sub>*t*</sub><sup>∗</sup>, and the SALT momentum equation can thus also be derived using our proposed perturbation scheme by properly choosing *θ*, without relying on Lagrangian mechanics.

Another way to appreciate the correspondence to SALT is by looking at the final SPDE. If we choose *θ* to be a differential 1-form representing the momentum *f*, i.e. *θ* = *f*<sup>*j*</sup>*dx*<sup>*j*</sup>, we obtain from Eq. (1.7) (more details in Zhen et al 2022):

$$\begin{aligned} \mathrm{d}\_{s}f^{j} = {} & \left(a^{p}\partial\_{x^{p}}f^{j} + \frac{1}{2}e\_{i}^{p}e\_{i}^{q}\partial\_{x^{p}}\partial\_{x^{q}}f^{j} + \partial\_{x^{j}}a^{p}f^{p} + \partial\_{x^{j}}e\_{i}^{p}\, e\_{i}^{q}\partial\_{x^{q}}f^{p}\right)\mathrm{d}t \\ & + (e\_{i}^{p}\partial\_{x^{p}}f^{j} + \partial\_{x^{j}}e\_{i}^{p}f^{p})\,\mathrm{d}\eta\_{i}. \end{aligned} \tag{3.29}$$

Regrouping the terms for physical interpretation, it reads:

$$\begin{aligned} \frac{\mathrm{d}\_{s}f^{j}}{\mathrm{d}t} = {} & -V^{p}\partial\_{x^{p}}f^{j} + \partial\_{x^{p}}\left(\left(\frac{1}{2}e\_{i}^{p}e\_{i}^{q}\right)\partial\_{x^{q}}f^{j}\right) \\ & + \partial\_{x^{j}}\left(a^{p} + e\_{i}^{p}\frac{\mathrm{d}\eta\_{i}}{\mathrm{d}t}\right)f^{p} + \partial\_{x^{j}}e\_{i}^{p}\, e\_{i}^{q}\partial\_{x^{q}}f^{p}. \end{aligned} \tag{3.30}$$

The last two terms of the right-hand side complete the advection-diffusion terms already appearing in (3.7). The first one, ∂<sub>*x*<sup>*j*</sup></sub>(*a*<sup>*p*</sup> + *e*<sub>*i*</sub><sup>*p*</sup> d*η*<sub>*i*</sub>/d*t*) *f*<sup>*p*</sup>, is reminiscent of the additional terms appearing in SALT momentum equations (Holm 2015, Resseguier et al 2020b). The second one, ∂<sub>*x*<sup>*j*</sup></sub>*e*<sub>*i*</sub><sup>*p*</sup> *e*<sub>*i*</sub><sup>*q*</sup>∂<sub>*x*<sup>*q*</sup></sub>*f*<sup>*p*</sup>, comes from cross-correlation in Itô notation.

## **4 Conclusion**

As demonstrated, both SALT and LU equations can be recovered using a prescribed definition of a random diffeomorphism *T*<sub>*t*</sub> used to perturb the physical space. However, compared with the SALT and LU settings, the proposed perturbation scheme does not directly rely on a particular physics. Hence, the random mapping is more flexible and can be applied to any PDE. Interestingly, similarities and differences can then be identified and studied between the proposed use of the random diffeomorphism and the existing stochastic physical SALT and LU settings. For instance, the proposed derivation provides an interesting interpretation of the operator £<sub>d*x*<sub>*t*</sub>−*u*d*t*</sub> appearing in the SALT equation. This term can indeed represent an infinitesimal forecast error at every forecast time step.

To apply the proposed perturbation scheme to any specific model, the diffeomorphism parameters *a* and *ei* must be determined specifically. Hence it is necessary to learn these parameters from existing data, experimental runs, or additional physical considerations. This framework naturally provides new perspectives to generate ensembles through constrained stochastic mappings applied in the physical space.

**Acknowledgments** This work is supported by the ERC project 856408-STUOD and SCALIAN DS.

#### **Appendix: Expression of** $T_t^*\theta$

Given coordinates $(x_1,\ldots,x_n)$ and a differential $k$-form $\theta$, Zhen et al (2022) (Appendix B) prove that:

$$\begin{split} T\_t^\dagger \theta &= f(T\_t(\mathbf{x})) T\_t^\dagger (dx^{l\_1} \wedge \dots \wedge dx^{l\_k}) \\ &= \theta + \left\{ (\nabla f, e\_i) + \frac{1}{2} e\_i^\top H\_f e\_i + I W) dx^{l\_1} \wedge \dots \wedge dx^{l\_k} \\ &+ \sum\_{s=1}^k f \partial\_{\lambda^j} d^{i\_1} dx^{l\_1} \wedge \dots \wedge dx^{l\_j} \wedge \dots \wedge dx^{l\_k} \\ &+ \left( \sum\_{s$$

where $IW$ is the additional term appearing in the Itô-Wentzell formula (Kunita 1997). Here, there is no noise in the original dynamics (2.3) (the first step (2.4) of the randomized dynamics) which could be correlated with the noise of the resulting stochastic scheme (2.5). That is why $IW = 0$ in the above Taylor expansion of $f$. Indeed, there is no additional cross-correlation term between $T_t^*$ and $\tilde{\theta}(t + \Delta t) = \theta(t) + g^{\theta}(S(t))\Delta t$. The final SPDE (2.8) makes clear the link between the solution $\theta$ and the Brownian motions $\eta_i$. But, at a given time step $t$, since (2.2) has no noise term, $\tilde{\theta}(t + \Delta t)$ is correlated with $t' \mapsto \eta_i(t')$ for $t' < t$ only, and is independent of the new Brownian increment $\Delta\eta_i(t)$ generating $T_t$. Therefore, there is no cross-correlation term between $T_t^*$ and $\tilde{\theta}(t + \Delta t)$.

To simplify Eq. (A.1), wedge algebra is applied and the high-order infinitesimal $o(\Delta t)$ is ignored. Accordingly, $T_t^*\theta$ is more compactly written as

$$T_t^*\theta = \theta + \mathcal{M}(\theta)\Delta t + \mathcal{N}_i(\theta)\Delta \eta_i,\tag{A.2}$$

for some differential $k$-forms $\mathcal{M}(\theta)$ and $\mathcal{N}_i(\theta)$.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Stochastic Compressible Navier–Stokes Equations Under Location Uncertainty**

**Gilles Tissot, Étienne Mémin, and Quentin Jamet**

**Abstract** The aim of this paper is to provide a stochastic version under location uncertainty of the compressible Navier–Stokes equations. To that end, some clarifications of the stochastic Reynolds transport theorem are given when stochastic source terms are present in the right-hand side. We apply this conservation theorem to density, momentum and total energy in order to obtain a transport equation of the primitive variables, i.e. density, velocity and temperature. We show that performing low Mach and Boussinesq approximations to this more general set of equations allows us to recover the known incompressible stochastic Navier–Stokes equations and the stochastic Boussinesq equations, respectively. Finally, we provide some research directions of using this general set of equations in the perspective of relaxing the Boussinesq and hydrostatic assumptions for ocean modelling.

# **1 Introduction**

Stochastic modelling under location uncertainty (LU) relies on the decomposition of the displacement of fluid particles into a time-differentiable velocity field, and a highly fluctuating component represented by a Brownian motion. It was proposed in [15] to apply this principle to fluid flows, leading to a stochastic version of the Navier–Stokes equations. On this basis, a similar derivation has been performed for various ocean models, such as Boussinesq models [16], quasi-geostrophic (QG) models [4, 14, 17], surface quasi-geostrophic models (SQG) [18] and shallow water equations [5].

G. Tissot (✉) · É. Mémin

INRIA Centre de l'Université de Rennes, IRMAR – UMR CNRS 6625, Rennes, France e-mail: gilles.tissot@inria.fr

Q. Jamet INRIA Centre de l'Université de Rennes, IRMAR – UMR CNRS 6625, Rennes, France LOPS, Ifremer, Plouzané, France

In the ocean, variations of density, temperature and salinity are of great importance. In the previously cited models, the Boussinesq assumption of small compressibility has been made from the start. This is a fair approximation, but it can become limiting, for instance when radiative transfers heat the ocean surface. Some research efforts have been made to account for compressibility in deterministic oceanic flow models [20, 9, 8] in order to obtain energetically consistent formulations. Another key aspect is that Boussinesq models cannot sustain acoustic waves, which are relevant for two major applications: (i) ocean acoustics and (ii) numerical simulations of non-Boussinesq models, where pseudo-compressibility strategies [7] are employed to compute the pressure with explicit schemes without having to solve an expensive 3D Poisson equation [3]. In addition, a rigorous development of Boussinesq systems requires performing the Boussinesq approximations on the compressible equations [22].

A compressible stochastic system cannot be derived from the incompressible stochastic system, since it is a generalisation of it. We propose in this paper to start from the classical physical conservation laws to derive a general stochastic compressible Navier–Stokes system. We verify that the resulting set of equations is consistent with the incompressible stochastic models previously developed. We moreover show theoretically some potential developments that enable a relaxation of the Boussinesq assumption. Such a procedure will allow us to propose stochastic systems of increasing complexity, lying between the Boussinesq hydrostatic system and fully compressible flow dynamics.

The paper is organised as follows. In Sect. 2, we briefly recall the LU formalism and provide a convenient form of the stochastic Reynolds transport theorem when the budget of conserved quantities is balanced by external source or flux terms of stochastic nature. In Sect. 3 we develop the stochastic compressible Navier–Stokes equations. In Sects. 4 and 5, the low Mach number and Boussinesq approximations are performed, respectively. We verify in these two sections the consistency with stochastic models previously derived from stochastic isochoric models [16]. This stochastic Boussinesq model is generalised by incorporating thermodynamic effects. In Sect. 6 these approximations are relaxed and we propose a model which can be integrated explicitly in time, similarly to [3]. In Sect. 7 some concluding remarks are given. In the appendix, technical calculation rules and important details needed to perform energy budgets are provided.

## **2 Stochastic Reynolds Transport Theorem**

The transport of conserved quantities subject to a stochastic transport is described by the stochastic Reynolds transport theorem (SRTT) introduced in [15]. When stochastic source terms are involved in the budget, additional covariation terms have to be taken into account [16]. These terms are usually defined in an implicit manner. In the present section, we briefly present the modelling under location uncertainty, and we rewrite the SRTT in a convenient form for further developments.

In the modelling under location uncertainty [15], the displacement *X(x,t)* of a particle is written in a differential form as

$$\mathrm{d}\boldsymbol{X}(\boldsymbol{x},t) = \boldsymbol{u}(\boldsymbol{x},t)\,\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t,\tag{1}$$

where $\boldsymbol{u} = (u, v, w)^T$ is a time-differentiable velocity component, and $\mathrm{d}\boldsymbol{B}_t$ is the increment of a Brownian motion, whose aim is to model unresolved time-decorrelated velocity contributions. The correlation operator $\boldsymbol{\sigma}_t$ is an integral operator which involves a spatial convolution in the domain $\Omega$ with a user-defined correlation kernel $\check{\boldsymbol{\sigma}}$, such that

$$\left(\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t\right)^{i}(\boldsymbol{x}) = \int_{\Omega} \check{\sigma}^{ij}\left(\boldsymbol{x}, \boldsymbol{x}', t\right) \mathrm{d}B_t^{j}\left(\boldsymbol{x}'\right) \mathrm{d}\boldsymbol{x}'.\tag{2}$$

Associated with $\boldsymbol{\sigma}_t$, we define the (matrix) variance tensor $\mathbf{a}$ (which corresponds to the one-point covariance tensor) such that

$$\mathbf{a}_{ij}(\boldsymbol{x})\,\mathrm{d}t = \mathbb{E}\left( (\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)^i(\boldsymbol{x})\, (\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)^j(\boldsymbol{x}) \right). \tag{3}$$
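As a finite-dimensional illustration of definition (3), one may replace the integral operator by a constant $d \times m$ matrix acting on an $m$-dimensional standard Brownian motion. This is an assumption made here for illustration only; the chapter itself works with a spatial convolution. Since the increments then satisfy $\mathbb{E}(\mathrm{d}\boldsymbol{B}_t\mathrm{d}\boldsymbol{B}_t^T) = \mathbb{I}\,\mathrm{d}t$, the variance tensor reduces to $\mathbf{a} = \boldsymbol{\sigma}\boldsymbol{\sigma}^T$. The helper name `variance_tensor` and the sample matrix are hypothetical.

```python
def variance_tensor(sigma):
    """Return a = sigma sigma^T for a d x m matrix given as nested lists.

    Finite-dimensional analogue of Eq. (3): with independent Brownian
    increments, E[dB dB^T] = I dt, hence E[(sigma dB)(sigma dB)^T] = a dt.
    """
    d = len(sigma)
    m = len(sigma[0])
    return [[sum(sigma[i][k] * sigma[j][k] for k in range(m))
             for j in range(d)]
            for i in range(d)]


# Hypothetical 2 x 2 noise matrix, chosen only for illustration.
sigma = [[1.0, 0.0],
         [1.0, 1.0]]
a = variance_tensor(sigma)
print(a)
```

By construction the result is symmetric and positive semi-definite, which is what makes the term $\frac{1}{2}\nabla\cdot(\mathbf{a}\nabla q)$ act as a diffusion.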

Within this framework, the stochastic transport operator of a scalar quantity *q* is defined by

$$\begin{split} \mathbb{D}_t q \triangleq{} & \mathrm{d}_t q + \Big( \underbrace{\big( \boldsymbol{u} - \tfrac{1}{2} \nabla \cdot \mathbf{a} + \boldsymbol{\sigma}_t (\nabla \cdot \boldsymbol{\sigma}_t) \big)}_{\boldsymbol{u}^\star} \cdot \nabla \Big) q \,\mathrm{d}t \\ &+ \left( \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t \cdot \nabla \right) q - \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla q) \, \mathrm{d}t, \end{split} \tag{4}$$

where $\boldsymbol{u}^\star$, called the *drift velocity*, is the resolved velocity corrected by terms accounting for the inhomogeneity of the variance tensor ($-\frac{1}{2}\nabla\cdot\mathbf{a}$) and for the divergence of the noise ($\boldsymbol{\sigma}_t(\nabla\cdot\boldsymbol{\sigma}_t)$), respectively. The physical relevance of the drift velocity and of the stochastic diffusion $\frac{1}{2}\nabla\cdot(\mathbf{a}\nabla q)\,\mathrm{d}t$ has been extensively highlighted in previous studies [e.g. 6, 4].

Variation of *q* integrated over a transported volume [16] can be written

$$\mathrm{d} \int_{\Omega(t)} q \, \mathrm{d}\boldsymbol{x} = \int_{\Omega(t)} \Big( \mathbb{D}_t q + q\, \nabla \cdot \left( \boldsymbol{u}^\star \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t \right) + \nabla \cdot \left( \boldsymbol{\sigma}_t \boldsymbol{h} \right) \mathrm{d}t \Big)\, \mathrm{d}\boldsymbol{x}, \tag{5}$$

with $\boldsymbol{h}$ defined as follows: when the stochastic transport operator is isolated on the left-hand side (LHS), $\boldsymbol{h}$ is associated with the martingale part of the remaining right-hand side (RHS):

$$\mathbb{D}_t q = f\,\mathrm{d}t + \boldsymbol{h} \cdot \mathrm{d}\boldsymbol{B}_t. \tag{6}$$

Starting from (5), we can assume that some source terms $Q_t\,\mathrm{d}t + \boldsymbol{Q}_\sigma\cdot\mathrm{d}\boldsymbol{B}_t$, with a time-differentiable and a martingale contribution respectively, balance the budget of $q$ in the control volume, such that

$$\mathrm{d} \int_{\Omega(t)} q \, \mathrm{d}\boldsymbol{x} = \int_{\Omega(t)} \left( Q_t \, \mathrm{d}t + \boldsymbol{Q}_\sigma \cdot \mathrm{d}\boldsymbol{B}_t \right) \mathrm{d}\boldsymbol{x}. \tag{7}$$

These RHS terms correspond to forces (resp. work) when this general expression is associated with the momentum (resp. energy) equation. Dropping the volume integral, we can now identify $\boldsymbol{h}$:

$$\boldsymbol{h} \cdot \mathrm{d}\boldsymbol{B}_t = -q\, \nabla \cdot (\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t) + \boldsymbol{Q}_\sigma \cdot \mathrm{d}\boldsymbol{B}_t. \tag{8}$$

This leads to the explicit expression of the stochastic Reynolds transport theorem

$$\mathrm{d}_t q + \nabla \cdot \Big( \big( (\boldsymbol{u} - \tfrac{1}{2} \nabla \cdot \mathbf{a}) \, \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t \big)\, q \Big) + \nabla \cdot (\boldsymbol{\sigma}_t \boldsymbol{Q}_\sigma) \, \mathrm{d}t - \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla q) \, \mathrm{d}t = Q_t \, \mathrm{d}t + \boldsymbol{Q}_\sigma \cdot \mathrm{d}\boldsymbol{B}_t. \tag{9}$$

The absence of the term $\boldsymbol{\sigma}_t(\nabla\cdot\boldsymbol{\sigma}_t)$ in the modified drift is an important feature of this expression. It has been cancelled (not neglected) by accounting for the term $-q\,\nabla\cdot(\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)$ in Eq. (8). As detailed further below, this term reappears when we transform the conservative form of the equations into their associated non-conservative form, i.e. when writing a transport equation for the primitive variables. As a consistency check, it is verified in appendix 7 that the same expression is obtained using a Stratonovich stochastic integral convention.
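The cancellation can be traced in a few lines. The following is a formal sketch that uses only definitions (4), (5) and (8), reading (8) as $\boldsymbol{h} = -q\,(\nabla\cdot\boldsymbol{\sigma}_t) + \boldsymbol{Q}_\sigma$. The noise term in the integrand of (5) contributes

$$\nabla\cdot(\boldsymbol{\sigma}_t\boldsymbol{h})\,\mathrm{d}t = -\nabla\cdot\big(\boldsymbol{\sigma}_t(\nabla\cdot\boldsymbol{\sigma}_t)\,q\big)\,\mathrm{d}t + \nabla\cdot(\boldsymbol{\sigma}_t\boldsymbol{Q}_\sigma)\,\mathrm{d}t,$$

while the transport operator combined with the compression term gives

$$\mathbb{D}_t q + q\,\nabla\cdot(\boldsymbol{u}^\star\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t) = \mathrm{d}_t q + \nabla\cdot\big((\boldsymbol{u}^\star\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)\,q\big) - \frac{1}{2}\nabla\cdot(\mathbf{a}\nabla q)\,\mathrm{d}t.$$

Since $\boldsymbol{u}^\star = \boldsymbol{u} - \frac{1}{2}\nabla\cdot\mathbf{a} + \boldsymbol{\sigma}_t(\nabla\cdot\boldsymbol{\sigma}_t)$, the contribution $\nabla\cdot(\boldsymbol{\sigma}_t(\nabla\cdot\boldsymbol{\sigma}_t)\,q)\,\mathrm{d}t$ carried by $\boldsymbol{u}^\star$ is exactly offset by the first term of $\nabla\cdot(\boldsymbol{\sigma}_t\boldsymbol{h})\,\mathrm{d}t$, which leaves the drift $\boldsymbol{u} - \frac{1}{2}\nabla\cdot\mathbf{a}$ in (9).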

## **3 Stochastic Compressible Navier–Stokes Equations**

To obtain the stochastic compressible Navier–Stokes equations we apply the SRTT equation (9) to the mass, momentum and total energy. This first requires properly defining the physical variables.

## *3.1 Non-dimensioning*

We consider the time $t$, the space coordinates $\boldsymbol{x} = (x, y, z)^T$ of $\Omega$, and $(\boldsymbol{e}_x, \boldsymbol{e}_y, \boldsymbol{e}_z)$ the associated canonical basis. Physical quantities are marked by $\bullet^\phi$ and the other quantities are non-dimensional. We adimensionalise by reference conditions (noted $\bullet_{\mathrm{ref}}$), and introduce a reference distance $L_{\mathrm{ref}}$, velocity $u_{\mathrm{ref}}$, density $\rho_{\mathrm{ref}}$, sound speed $c_{\mathrm{ref}}$ as well as viscosity $\mu_{\mathrm{ref}}$. We get

$$\begin{gathered}
\boldsymbol{x} = \frac{\boldsymbol{x}^\phi}{L_{\mathrm{ref}}};\quad t = \frac{t^\phi u_{\mathrm{ref}}}{L_{\mathrm{ref}}};\quad \boldsymbol{u} = \frac{\boldsymbol{u}^\phi}{u_{\mathrm{ref}}};\quad c = \frac{c^\phi}{u_{\mathrm{ref}}};\quad M = \frac{u_{\mathrm{ref}}}{c_{\mathrm{ref}}};\quad \rho = \frac{\rho^\phi}{\rho_{\mathrm{ref}}};\quad \mu = \frac{\mu^\phi}{\mu_{\mathrm{ref}}};\\
p = \frac{p^\phi}{\rho_{\mathrm{ref}}u_{\mathrm{ref}}^2};\quad T = \frac{T^\phi c_p^\phi}{u_{\mathrm{ref}}^2};\quad \gamma = \frac{c_p^\phi}{c_v^\phi};\quad e = \frac{e^\phi}{u_{\mathrm{ref}}^2} = \frac{T}{\gamma};\quad g = \frac{g^\phi L_{\mathrm{ref}}}{\rho_{\mathrm{ref}}u_{\mathrm{ref}}^2},
\end{gathered}\tag{10}$$

with $\boldsymbol{u}$ the velocity vector, $c$ the speed of sound, $M$ the Mach number (i.e. the ratio of typical particle speed to typical sound speed), $\rho$ the density, $p$ the pressure, $\mu$ the dynamic viscosity, $T$ the temperature, $\gamma$ the heat capacity ratio, $(c_p, c_v)$ the heat capacities at constant pressure/volume, $e$ the internal energy and $\boldsymbol{g} = -g\boldsymbol{e}_z$ the acceleration vector due to gravity. We introduce as well the Reynolds and Prandtl numbers

$$Re = \frac{\rho_{\mathrm{ref}}\, u_{\mathrm{ref}}\, L_{\mathrm{ref}}}{\mu_{\mathrm{ref}}} \quad ; \quad Pr = \frac{c_p^{\phi} \mu^{\phi}}{k_T^{\phi}} , \tag{11}$$

with $k_T^\phi$ the thermal conductivity.
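To get a feeling for the orders of magnitude involved, the dimensionless numbers of (10) and (11) can be evaluated directly. The sketch below is only illustrative: the function names are hypothetical, and the reference values are assumed round numbers of the order of those found for sea water, not values taken from the chapter.

```python
def reynolds_number(rho_ref, u_ref, L_ref, mu_ref):
    """Re = rho_ref * u_ref * L_ref / mu_ref, as in Eq. (11)."""
    return rho_ref * u_ref * L_ref / mu_ref


def mach_number(u_ref, c_ref):
    """M = u_ref / c_ref, as in Eq. (10)."""
    return u_ref / c_ref


def prandtl_number(c_p, mu, k_T):
    """Pr = c_p * mu / k_T, as in Eq. (11)."""
    return c_p * mu / k_T


# Assumed, illustrative reference values (SI units), not from the chapter.
rho_ref = 1000.0   # density, kg / m^3
u_ref = 0.1        # velocity, m / s
L_ref = 1000.0     # length, m
mu_ref = 1.0e-3    # dynamic viscosity, Pa s
c_ref = 1500.0     # sound speed, m / s
c_p = 4000.0       # heat capacity at constant pressure, J / (kg K)
k_T = 0.6          # thermal conductivity, W / (m K)

Re = reynolds_number(rho_ref, u_ref, L_ref, mu_ref)
M = mach_number(u_ref, c_ref)
Pr = prandtl_number(c_p, mu_ref, k_T)
print(f"Re = {Re:.3g}, M = {M:.3g}, Pr = {Pr:.3g}")
```

With such values the Mach number is several orders of magnitude below one, which is the regime addressed by the low Mach approximation of Sect. 4.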

## *3.2 Continuity*

Mass conservation ensues upon applying the SRTT on density, i.e. *q* = *ρ* with no mass source of any kind:

$$\mathrm{d}_t \rho + \nabla \cdot \Big( \big( (\boldsymbol{u} - \tfrac{1}{2} \nabla \cdot \mathbf{a}) \, \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t \big)\, \rho \Big) = \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla \rho) \, \mathrm{d}t. \tag{12}$$
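For later use, (12) can be rewritten in non-conservative form. The following is a sketch obtained by substituting $\boldsymbol{u} - \frac{1}{2}\nabla\cdot\mathbf{a} = \boldsymbol{u}^\star - \boldsymbol{\sigma}_t(\nabla\cdot\boldsymbol{\sigma}_t)$ into (12), expanding the divergences and using definition (4):

$$\mathbb{D}_t\rho + \rho\,\nabla\cdot(\boldsymbol{u}^\star\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t) = \nabla\cdot\big(\boldsymbol{\sigma}_t(\nabla\cdot\boldsymbol{\sigma}_t)\,\rho\big)\,\mathrm{d}t.$$

The term $\boldsymbol{\sigma}_t(\nabla\cdot\boldsymbol{\sigma}_t)$, absent from the conservative form, reappears here through the drift velocity. For a divergence-free noise the right-hand side vanishes and the density is transported by $\mathbb{D}_t$ up to the compression term $\rho\,\nabla\cdot\boldsymbol{u}^\star\mathrm{d}t$.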

## *3.3 Momentum*

We now apply the SRTT to the momentum $\rho u_i$ balanced by forces, with $u_i \in \{u, v, w\}$:

$$\begin{split} \mathrm{d}_t(\rho u_i) &+ \nabla \cdot \Big( \big( (\boldsymbol{u} - \tfrac{1}{2} \nabla \cdot \mathbf{a}) \, \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t \big)\, \rho u_i \Big) + \nabla \cdot \left( \boldsymbol{\sigma}_t \boldsymbol{F}^{\rho u_i}_{\sigma} \right) \mathrm{d}t \\ &= -\frac{\partial p}{\partial x_i} \, \mathrm{d}t - \frac{\partial\, \mathrm{d}p^{\sigma}_t}{\partial x_i} - \rho g\, \delta_{i,z}\,\mathrm{d}t \\ &\quad+ \frac{1}{Re} \frac{\partial \tau_{ij}(\boldsymbol{u})}{\partial x_j} \, \mathrm{d}t + \frac{1}{Re} \frac{\partial \tau_{ij}(\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t)}{\partial x_j} + \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla (\rho u_i)) \, \mathrm{d}t, \end{split} \tag{13}$$

with $\boldsymbol{F}^{\rho u_i}_{\sigma}\cdot\mathrm{d}\boldsymbol{B}_t = -\frac{\partial\, \mathrm{d}p^{\sigma}_t}{\partial x_i} + \frac{1}{Re}\nabla\cdot\left(\boldsymbol{\tau}_i(\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)\right)$. The forces involved here are caused by the pressure gradient, the viscous stresses $\boldsymbol{\tau}$ and gravity. The pressure is decomposed into a time-differentiable part $p\,\mathrm{d}t$ and a random component $\mathrm{d}p^{\sigma}_t$. For the sake of generality, we consider the molecular viscosity stress tensor:

$$\boldsymbol{\tau}(\boldsymbol{u}) = \mu \left(\nabla \boldsymbol{u} + (\nabla \boldsymbol{u})^T\right) + \left(\mu_b - \frac{2}{3}\mu\right) \nabla \cdot \boldsymbol{u} \,\mathbb{I},\tag{14}$$

with $\mu_b$ the bulk viscosity. Similarly to the pressure, there is a finite-variation friction contribution due to $\boldsymbol{u}\,\mathrm{d}t$ and a martingale contribution due to $\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t$.
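The stress tensor (14) is linear in the velocity gradient, which is why it can be evaluated separately on $\boldsymbol{u}$ and on $\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t$. A minimal sketch of its evaluation at a single point is given below, assuming the velocity gradient is supplied as a $3\times 3$ nested list; the function name `viscous_stress` and the test gradients are hypothetical.

```python
def viscous_stress(grad_u, mu, mu_b):
    """Eq. (14): tau = mu (G + G^T) + (mu_b - 2/3 mu) tr(G) I, with G = grad u."""
    n = len(grad_u)
    div_u = sum(grad_u[i][i] for i in range(n))   # trace of G = div u
    lam = mu_b - 2.0 * mu / 3.0                   # coefficient of the isotropic part
    return [[mu * (grad_u[i][j] + grad_u[j][i]) + (lam * div_u if i == j else 0.0)
             for j in range(n)]
            for i in range(n)]


# Pure shear: only one off-diagonal entry of the gradient is non-zero.
shear = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
tau_shear = viscous_stress(shear, mu=1.0, mu_b=0.0)

# Isotropic dilation: grad u = identity, zero bulk viscosity.
dilation = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
tau_dilation = viscous_stress(dilation, mu=1.0, mu_b=0.0)
print(tau_shear, tau_dilation)
```

The two cases illustrate the role of each coefficient: shear produces a symmetric off-diagonal stress proportional to $\mu$, whereas an isotropic dilation produces no viscous stress unless the bulk viscosity $\mu_b$ is non-zero.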

After some manipulations and using the stochastic distributivity rule (70) given in appendix 7, we obtain

$$\begin{split} &\rho\, \mathbb{D}_t u_i + \sum_{k} \mathrm{d}_t \left\langle \int_{0}^{t} \rho (\boldsymbol{\sigma}_s \mathrm{d}\boldsymbol{B}_s)^{k},\ \int_{0}^{t} \frac{\partial}{\partial x_k} \left( \frac{1}{\rho} \left( -\frac{\partial\, \mathrm{d}p_s^{\sigma}}{\partial x_i} + \frac{1}{Re} \frac{\partial \tau_{ij} (\boldsymbol{\sigma}_s \mathrm{d}\boldsymbol{B}_s)}{\partial x_j} \right) \right) \right\rangle \\ &\quad= -\frac{\partial p}{\partial x_i}\, \mathrm{d}t - \frac{\partial\, \mathrm{d}p_t^{\sigma}}{\partial x_i} + \frac{1}{Re} \frac{\partial \tau_{ij} (\boldsymbol{u})}{\partial x_j}\, \mathrm{d}t + \frac{1}{Re} \frac{\partial \tau_{ij} (\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t)}{\partial x_j} - \rho g\, \delta_{i,z}\,\mathrm{d}t. \end{split} \tag{15}$$

This expression is very similar to the momentum equation of the incompressible Navier–Stokes equations [15, eq. 41 with incompressibility assumption]. The only additional term is the covariation between (the martingale part of) the forces and the small-scale component $\rho\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t$. This term, usually difficult to evaluate analytically, is generally neglected through a slight variation of the expression of Newton's law in the LU framework, as for instance in [16, Appendix E].

## *3.4 Energy*

As in the deterministic framework [23, 2, 13], we now consider conservation of the total energy and deduce a transport equation for the temperature.

#### **General Formulation**

Work of forces and heat fluxes acting on a transported control volume induce variations of total energy *E* such that:

$$\begin{split} \mathrm{d}_t(\rho E) &+ \nabla \cdot \Big( \big( (\boldsymbol{u} - \tfrac{1}{2} \nabla \cdot \mathbf{a}) \, \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t \big)\, \rho E \Big) + \nabla \cdot (\boldsymbol{\sigma}_t \boldsymbol{F}_{\sigma}^{\rho E})\,\mathrm{d}t \\ &= \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla (\rho E)) \, \mathrm{d}t + \mathrm{d}W - \nabla \cdot (\mathrm{d}\boldsymbol{q}), \end{split} \tag{16}$$

with $\mathrm{d}W$ and $\mathrm{d}\boldsymbol{q}$ the elementary work of the forces and heat fluxes detailed later. The martingale part of these RHS terms is written $\boldsymbol{F}_{\sigma}^{\rho E}\cdot\mathrm{d}\boldsymbol{B}_t$.

Using (70) and the continuity equation (12), we obtain

$$\rho\, \mathbb{D}_t E + \sum_k \mathrm{d}_t \left\langle \int_0^t \rho (\boldsymbol{\sigma}_s \mathrm{d}\boldsymbol{B}_s)^k,\ \int_0^t \frac{\partial}{\partial x_k} \left( \frac{1}{\rho} \boldsymbol{F}_\sigma^{\rho E} \cdot \mathrm{d}\boldsymbol{B}_s \right) \right\rangle = \mathrm{d}W - \nabla \cdot (\mathrm{d}\boldsymbol{q}). \tag{17}$$

#### **Definition of the Energy**

At this point, the form of the total energy has to be specified. It is strongly related to the physical mechanisms at play. In the present study, we consider the total energy $\rho E = \rho\left(e + \frac{1}{2}\|\boldsymbol{u}\|^2 + gz\right)$, as the sum of internal energy $e = \frac{T}{\gamma}$, kinetic energy and potential energy due to gravity. We do not consider the energy of the Brownian motion since it is possibly infinite.

#### **Definition of the Work of Forces and Heat Fluxes**

The work of the time-differentiable pressure represents how the pressure works with the displacement of the control surface. The expression can be obtained by integrating the force multiplied by the surface displacement over a transported control volume and applying Green's formula. The procedure is similar to the deterministic framework, with the additional involvement of the drift velocity, as demonstrated in appendix 7. We have for the pressure work:

$$\begin{split} \int_{\Omega(t)} \mathrm{d}W_p \, \mathrm{d}\boldsymbol{x} &= \int_{\partial\Omega(t)} (-p \, \boldsymbol{n} \, \mathrm{d}S) \cdot (\boldsymbol{u}^\star \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t) \\ &= - \int_{\Omega(t)} \nabla \cdot \big(p \, (\boldsymbol{u}^\star \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t)\big) \, \mathrm{d}\boldsymbol{x} . \end{split} \tag{18}$$

The minus sign comes from the outward normal $\boldsymbol{n}$ convention. We can then identify

$$\mathrm{d}W_p = -\nabla \cdot \big(p \left(\boldsymbol{u}^\star \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t\right)\big).\tag{19}$$

In the same way, the work of the viscous stress of the resolved component can be written

$$\mathrm{d}W_{\tau} = \frac{1}{Re}\nabla \cdot \big(\boldsymbol{\tau}(\boldsymbol{u}) \left(\boldsymbol{u}^{\star}\,\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t\right)\big) . \tag{20}$$

Following Appendix 7, we also take into account the work of the random pressure:

$$\mathrm{d}W_{rp} = -\nabla \cdot \left(\boldsymbol{u}^\star \mathrm{d}p_t^\sigma\right),\tag{21}$$

and the work of the random viscous stress

$$\mathrm{d}W_{r\tau} = \frac{1}{Re}\nabla \cdot \big(\boldsymbol{\tau}(\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)\,\boldsymbol{u}^{\star}\big). \tag{22}$$

As rigorously detailed in appendix 7, we do not consider the work of random forces associated with $\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t$, since such a work would be highly irregular (in time) and should be in balance with variations of the kinetic energy of $\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t$, which is possibly infinite and not described by the present model.

There is no work contribution of gravity on total energy, since the gain in kinetic energy directly associated with the gravity force is compensated by the loss in potential energy.

Finally, heat conduction is accounted for by expressing the thermal fluxes through the Fourier law $\mathrm{d}\boldsymbol{q} = -\frac{1}{Re\,Pr}\nabla T\,\mathrm{d}t$.

#### **Transport Equation of Temperature**

By replacing the energy by the contributions of internal, kinetic and potential energy, and by subtracting the contribution of the kinetic energy using the momentum equation (15) and the distributivity rule (70), we obtain the transport equation for the temperature

$$\begin{split}
&\frac{\rho}{\gamma}\mathbb{D}_t T + \underbrace{\sum_k \mathrm{d}_t\left\langle\int_0^t \rho(\boldsymbol{\sigma}_s\mathrm{d}\boldsymbol{B}_s)^k,\ \int_0^t\frac{\partial}{\partial x_k}\left(\frac{\gamma}{\rho}\boldsymbol{F}_\sigma^T\cdot\boldsymbol{\sigma}_s\mathrm{d}\boldsymbol{B}_s\right)\right\rangle}_{Q_T} + \underbrace{\sum_i\frac{\rho}{2}\,\mathrm{d}_t\left\langle\int_0^t\boldsymbol{F}_\sigma^{u_i}\cdot\boldsymbol{\sigma}_s\mathrm{d}\boldsymbol{B}_s,\ \int_0^t\boldsymbol{F}_\sigma^{u_i}\cdot\boldsymbol{\sigma}_s\mathrm{d}\boldsymbol{B}_s\right\rangle}_{Q_u}\\
&= \underbrace{-p\,\nabla\cdot(\boldsymbol{u}^\star\mathrm{d}t+\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)}_{P_t}\ \underbrace{-\,\mathrm{d}p_t^\sigma\,\nabla\cdot\boldsymbol{u}^\star}_{P_\sigma} + \underbrace{\frac{1}{Re}\boldsymbol{\tau}(\boldsymbol{u}):\nabla(\boldsymbol{u}^\star\mathrm{d}t+\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)}_{V_t} + \underbrace{\frac{1}{Re}\boldsymbol{\tau}(\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t):\nabla\boldsymbol{u}^\star}_{V_\sigma}\\
&\quad+\underbrace{\big((\boldsymbol{u}^\star-\boldsymbol{u})\,\mathrm{d}t+\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t\big)\cdot\Big(-\nabla p+\frac{1}{Re}\nabla\cdot\boldsymbol{\tau}(\boldsymbol{u})+\rho\boldsymbol{g}\Big)}_{D_t} + \underbrace{(\boldsymbol{u}^\star-\boldsymbol{u})\cdot\Big(-\nabla\mathrm{d}p_t^\sigma+\frac{1}{Re}\nabla\cdot\boldsymbol{\tau}(\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t)\Big)}_{D_\sigma}\\
&\quad+\frac{1}{Re\,Pr}\nabla\cdot(\nabla T)\,\mathrm{d}t,
\end{split}\tag{23}$$

with $\boldsymbol{F}_\sigma^{u_i} = \frac{1}{\rho}\boldsymbol{F}_\sigma^{\rho u_i}$, and $\frac{\gamma}{\rho}\boldsymbol{F}_\sigma^{T}\cdot\mathrm{d}\boldsymbol{B}_t$ the sum of all martingale terms of the RHS of Eq. (23). In Eq. (23), we recover the terms present in the deterministic framework, but with the stochastic transport operator instead of the deterministic one. Nevertheless, some covariation terms now arise. In particular the term $Q_T$ is induced by the random work of the forces, and the term $Q_u$ is induced by the increase of kinetic energy through covariations of the forces in the momentum equation. On the RHS, we remark that the drift velocity is involved in the work of the time-differentiable pressure (in $P_t$) and of the random pressure (in $P_\sigma$), consistently with Appendix 7. The terms $V_t$ and $V_\sigma$ are the works of the smooth-in-time and random viscous stresses, respectively. In addition, the terms $D_t$ and $D_\sigma$ correspond to works caused by the alignment of the drift and random velocities with the pressure gradient and viscous forces. We call them *drift works*. Focusing on $-(\boldsymbol{u}^\star - \boldsymbol{u})\cdot\nabla p$, we interpret this drift work as being related to the baropycnal work [1], present in the compressible large-eddy simulation (LES) framework. Indeed, in standard compressible LES, baropycnal work corresponds to a contribution caused by the alignment between the large-scale pressure gradient and the Reynolds stresses induced by the product of the small-scale contributions of $\rho$ and $\boldsymbol{u}$ (i.e. $\frac{1}{\overline{\rho}}\nabla\overline{p}\cdot\overline{\rho'\boldsymbol{u}'}$, with $\bullet'$ denoting here small-scale components and $\overline{\bullet}$ large-scale filtering). This Reynolds stress $\frac{1}{\overline{\rho}}\overline{\rho'\boldsymbol{u}'}$ has the dimension of a velocity. In our case, the effective displacement associated with this work is directly the drift velocity over $\mathrm{d}t$. Similar interpretations can be made for the other drift work terms, associated with viscous stresses and random variables.

The presence of gravity in the drift work $D_t$ shows that in the vertical direction the time-differentiable drift work is of the form $(w^\star - w)\left(-\frac{\partial p}{\partial z} - \rho g\right)$, in which we recognise the vertical small-scale mass flux times the buoyancy, $(w^\star - w)(\rho_0 b)$, plus non-hydrostatic pressure effects. It can be noticed that for a divergence-free homogeneous noise (for which the variance tensor is constant in space) the drift work is null, as $\boldsymbol{u}^\star - \boldsymbol{u}$ vanishes.

## *3.5 Equation of State*

In order to close the system, we have to specify the equation of state. We keep generality and write the equation of state formally as follows

$$p = f(\rho, T). \tag{24}$$

As in the deterministic framework, since we have an evolution equation for density and temperature, the pressure can be determined explicitly, at the price of a Courant–Friedrichs–Lewy (CFL) condition constrained by the speed of sound.

The random pressure can be identified by differentiating the equation of state. Indeed, we have an explicit evolution equation of the pressure, which can be expressed through Itô's formula (the equation of state $f$ being deterministic, i.e. the state map does not depend on the random events) as


$$\begin{split} \mathrm{d}_t p &= \frac{\partial f}{\partial \rho} \mathrm{d}_t\rho + \frac{\partial f}{\partial T} \mathrm{d}_t T + \frac{1}{2} \frac{\partial^{2} f}{\partial \rho^{2}} \mathrm{d}_t \langle \rho, \rho \rangle + \frac{1}{2} \frac{\partial^{2} f}{\partial T^{2}} \mathrm{d}_t \langle T, T \rangle + \frac{\partial^{2} f}{\partial \rho\, \partial T} \mathrm{d}_t \langle \rho, T \rangle \\ &= \frac{\partial \widetilde{p}}{\partial t}\, \mathrm{d}t + \frac{\mathrm{d}p_t^{\sigma}}{\tau}, \end{split} \tag{25}$$

where $\widetilde{p}$ is the time-differentiable part of the pressure which contains, among other things, all covariation terms. The martingale part of $\mathrm{d}_t p$ is $\mathrm{d}p_t^{\sigma}/\tau$, with $\tau$ a decorrelation time. This decorrelation time represents the typical time during which the random pressure acts in a coherent manner to produce a change of momentum. It is assumed to be the same decorrelation time as the one classically introduced [e.g. 12, 4] to relate in practice the definition of the variance tensor to the variance of the velocity fluctuations (i.e. $\mathbf{a} = \tau\,\mathbb{E}\left(\boldsymbol{u}'\boldsymbol{u}'^{T}\right)$). The term $\mathrm{d}p_t^{\sigma}$ is identified from (25) to be the random pressure acting on the momentum equation.

If we assume that the random pressure ensues from an isentropic process, i.e. of acoustic nature, we can write

$$\mathrm{d}_t p = \left.\frac{\partial p}{\partial \rho}\right|_{s} \mathrm{d}_t\rho = c^{2}\, \mathrm{d}_t\rho,\tag{26}$$

with $c$ the speed of sound and $s$ the entropy. We can then identify from (12)

$$\mathrm{d}p_t^{\sigma} = -\tau c^2\, \nabla \cdot (\rho \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t). \tag{27}$$

It can be remarked that this expression is consistent with Eq. (25) under the isentropic transformation assumption.
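The identification leading to (27) can be spelled out as follows; this is a sketch restricted to the martingale parts of the equations involved. The only martingale term of the continuity equation (12) comes from the transport by the noise, so that

$$\left[\mathrm{d}_t\rho\right]_{\text{martingale}} = -\nabla\cdot(\rho\,\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t).$$

Under the isentropic assumption (26), the martingale part of $\mathrm{d}_t p$ is $c^2$ times this quantity, whereas by (25) it is also equal to $\mathrm{d}p_t^{\sigma}/\tau$. Equating the two expressions gives

$$\frac{\mathrm{d}p_t^{\sigma}}{\tau} = -c^2\,\nabla\cdot(\rho\,\boldsymbol{\sigma}_t\mathrm{d}\boldsymbol{B}_t),$$

which is (27) after multiplication by $\tau$.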

For oceanic flows, the equation of state is often expressed in terms of density rather than pressure. A specific treatment adapted to oceanic flows is detailed in Sect. 6.

## **4 Low Mach Approximation**

To perform the low Mach approximation, we follow the same steps as [11], but applied to the compressible stochastic Navier–Stokes equations. With our nondimensioning, we have at infinity for isentropic transformations,

$$\left.\frac{\partial p}{\partial \rho}\right|\_{s} = c\_{\text{ref}}^{2} = \frac{1}{M^{2}}.\tag{28}$$

This suggests for small *M* the following asymptotic expansion

$$\begin{aligned} \rho &= \rho_0 + M^2 \rho_1 + o(M^2), \\ \boldsymbol{u} &= \boldsymbol{u}_0 + o(1), \\ T &= \frac{1}{M^2} T_0 + T_1 + o(1), \end{aligned} \tag{29}$$

and $p = \mathcal{O}\left(\frac{\rho_1}{M^2}\right) = \mathcal{O}(1)$. Similarly, the random pressure $\mathrm{d}p_t^{\sigma}$ follows the same scaling as the time-differentiable pressure.

Collecting the $\mathcal{O}(1)$ and $\mathcal{O}(M^2)$ terms in the continuity equation, we obtain respectively

$$\nabla \cdot (\boldsymbol{u}_0^\star \, \mathrm{d}t + \boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t) = 0 \quad ; \quad \mathbb{D}_t \rho_1 = 0. \tag{30}$$

In the momentum equation (15), the order of magnitude of the covariation term can be determined by integrating over the domain, using distributivity of the divergence and performing an integration by parts:


where suitable boundary conditions at $\partial\Omega$ (*e.g.* Dirichlet boundary conditions (no random inflow velocity) or zero normal stress (outflow boundary conditions)) have been applied to ensure that the first surface term vanishes.

By neglecting the terms of order $\mathcal{O}(M^2)$, we finally obtain the incompressible Navier–Stokes equations presented in [15] under the incompressibility assumption

$$\begin{aligned} &\rho_0\, \mathbb{D}_t \boldsymbol{u} = -\nabla p \, \mathrm{d}t - \nabla \mathrm{d}p_t^\sigma + \frac{1}{Re} \nabla \cdot (\boldsymbol{\tau}(\boldsymbol{u})) \, \mathrm{d}t + \frac{1}{Re} \nabla \cdot (\boldsymbol{\tau}(\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t)) + \rho\, \boldsymbol{g}\,\mathrm{d}t, \\ &\nabla \cdot \boldsymbol{u}^\star = 0 \quad ; \quad \nabla \cdot (\boldsymbol{\sigma}_t \mathrm{d}\boldsymbol{B}_t) = 0. \end{aligned} \tag{32}$$

# **5 Boussinesq-Hydrostatic Approximation**

In this section, starting from the stochastic compressible Navier–Stokes equations, we perform the Boussinesq approximation by considering small density fluctuations. These fluctuations are neglected except when they are multiplied by gravity $g$, which leads to the classical definition of the buoyancy. We also perform the hydrostatic approximation through the classical aspect ratio scaling $D = H/L_{\mathrm{ref}} \ll 1$, with $H$ the water depth. For simplicity, we do not consider a rotating frame. A Coriolis correction could be straightforwardly considered as in [21]. The vertical coordinate $z \in [-H, \eta]$ is bounded by the bottom and the free surface.

#### **Density**

The density is decomposed through the following asymptotic expansion

$$
\rho = \rho_0 + \epsilon \rho_1(z) + \epsilon \rho_2(x, y, z, t) + o(\epsilon), \tag{33}
$$

with $\rho_1(z)$ the time-averaged stratification term and $\epsilon \ll 1$; we do not need to assume that $\rho_1 > \rho_2$. We hence obtain

$$
\nabla \cdot \mathbf{u}^\star = 0 \quad ; \quad \nabla \cdot \sigma_t \mathrm{d} \mathbf{B}_t = 0 \quad ; \quad \mathbb{D}_t \, (\rho_1 + \rho_2) = 0. \tag{34}
$$

The drift velocity and the noise are divergence free. Density perturbations undergo a stochastic transport by the flow. We remark that since $\nabla \cdot \sigma_t \mathrm{d}\mathbf{B}_t = 0$, the transport operator $\mathbb{D}_t(\cdot)$ can be directly used.

The terms of order $\epsilon$ of Eq. (34) can be expressed in terms of buoyancy $b = -g\rho_2/\rho_0$:

$$\frac{\rho_0}{g}\mathbb{D}_t b = \left(w^\star \,\mathrm{d}t + (\sigma_t \mathrm{d}\mathbf{B}_t)_z\right) \frac{\partial \rho_1}{\partial z} - \frac{1}{2} \nabla \cdot \left(\mathbf{a}_{\bullet z} \frac{\partial \rho_1}{\partial z}\right) \mathrm{d}t,\tag{35}$$

with

$$\mathbf{a} = \begin{pmatrix} \mathbf{a}_{HH^T} & \mathbf{a}_{Hz} \\ \mathbf{a}_{zH^T} & a_{zz} \end{pmatrix},\tag{36}$$

and $\mathbf{a}_{zH^T} = \mathbf{a}_{Hz}^T$, for $H = (x \; y)^T$.

#### **Thermodynamic Effects**

Equation (35) is part of the stochastic version of what is often referred to as the *simple Boussinesq* equations. In the ocean, thermodynamic effects can be important, and we propose to incorporate these effects by combining the buoyancy and the energy equation, following the steps of [22]. Assuming a linear equation of state for sea water, we have

$$
\rho = \rho_0 \left( 1 - \beta_T (T - T_0) + \beta_p p \right), \tag{37}
$$

with $\beta_p = 1/(\rho_0 c^2)$ and $\beta_T = -\frac{1}{\rho_0} \frac{\partial \rho}{\partial T}$ the coefficients of the Taylor expansion. For the sake of simplicity, we do not take into account salinity effects, and we apply the stochastic transport operator to Eq. (37). We obtain

$$\begin{split} \mathbb{D}_t \rho &= -\rho_0 \beta_T \mathbb{D}_t T + \frac{1}{c^2}\mathbb{D}_t p \\ \mathbb{D}_t \left(\rho - \frac{1}{c^2} p\right) &= -\frac{\beta_T}{\gamma}\left(\mathrm{d}W + \mathrm{d}Q\right). \end{split} \tag{38}$$

With no viscosity, divergence-free velocity and neglecting the quadratic variations (with the same argument as in (31)) together with the hydrostatic assumption on the leading term $p_0$, we can assume that the main dilatations are caused by radiative effects to which the potential buoyancy $b_\phi$ is directly sensitive:

$$b_\phi \stackrel{\triangle}{=} -\frac{g}{\rho_0} \left( \delta \rho + \frac{\rho_0 g z}{c^2} \right) = b_{st} + b - g \frac{z}{H_p},\tag{39}$$

with $H_p = c^2/g$ and $b_{st} = -g\rho_1/\rho_0$. Upon applying the transport operator (with forcing), the following evolution equation of the potential buoyancy is obtained

$$\mathbb{D}_t b_\phi = \frac{g \beta_T}{\gamma \rho_0} (\mathrm{d}Q + \mathrm{d}W_d),\tag{40}$$

where $\mathrm{d}Q = \mathrm{d}Q_{\text{rad}} + \frac{1}{Re\,Pr} \nabla \cdot (\nabla T) \, \mathrm{d}t$, with $\mathrm{d}Q_{\text{rad}}$ the radiative heat fluxes, and

$$\mathrm{d}W_d = \left( (\mathbf{u}^\star - \mathbf{u}) \, \mathrm{d}t + \sigma_t \, \mathrm{d}\mathbf{B}_t \right) \cdot (-\nabla p + \rho \mathbf{g}) - \left( \mathbf{u}^\star - \mathbf{u} \right) \cdot \nabla \mathrm{d}p_t^\sigma$$

the drift works. The drift work on the vertical velocity component can be interpreted (with a linear equation of state) as an alternative to the so-called eddy diffusivity mass flux (EDMF) scheme proposed recently for the parameterization of atmospheric and oceanic penetrative convection (see for instance [19, 10] and references therein). Indeed, in EDMF, the subgrid stress in the transport equation of temperature is modelled as a mass flux induced by a given number of plumes (corresponding here possibly to $\rho \sigma_t \mathrm{d}\mathbf{B}_t$), multiplied by the difference of temperature between the plume and the ambient flow, which is here proportional to a buoyancy anomaly. Interestingly, the pressure work provides a natural non-local (horizontal and vertical) forcing term, while the other term is a local upward/downward vertical statistical forcing. EDMF schemes are obtained by specifying the noise in terms of velocity fluctuations between the mean velocity and non-convective environment, upward plumes and downward plumes. Such an interpretation needs to be tested with numerical simulations, and will be the focus of a future dedicated study.

By defining the buoyancy frequency

$$N^2(z) \stackrel{\triangle}{=} \frac{\partial}{\partial z} \left( -\epsilon g \frac{\rho_1}{\rho_0} - g \frac{z}{H_p} \right) = -\epsilon \frac{g}{\rho_0} \frac{\partial \rho_1}{\partial z} - \frac{g^2}{c^2},\tag{41}$$

stratification and radiative effects can be introduced explicitly in the buoyancy equation

$$\mathbb{D}_t b + \left(w^\star \,\mathrm{d}t + (\sigma_t \mathrm{d}\mathbf{B}_t)_z\right) N^2 - \frac{1}{2}\nabla \cdot \left(\mathbf{a}_{\bullet z} N^2\right)\mathrm{d}t = \frac{g \beta_T}{\gamma \rho_0}(\mathrm{d}Q + \mathrm{d}W_d).\tag{42}$$
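
As an illustration of Eq. (41), the following minimal sketch evaluates $N^2$ on a discrete depth axis. The function name, the dimensional values (reference density, gravity, sound speed) and the linear background profile are assumptions chosen for illustration only and are not taken from the present study; $\epsilon$ is set to one so that `rho1` is read directly as a density anomaly.

```python
import numpy as np


def buoyancy_frequency_squared(z, rho1, rho0=1025.0, g=9.81, c=1500.0, eps=1.0):
    """Evaluate N^2(z) = -eps * (g / rho0) * d(rho1)/dz - g^2 / c^2 (form of Eq. (41)).

    z    : 1D array of depths (m), increasing upward
    rho1 : 1D array, background stratification sampled on z (kg/m^3)
    All default values are illustrative assumptions.
    """
    drho1_dz = np.gradient(rho1, z)  # finite-difference vertical derivative
    return -eps * (g / rho0) * drho1_dz - g**2 / c**2


# Hypothetical column: 1000 m deep, density increasing linearly with depth.
z = np.linspace(-1000.0, 0.0, 101)
rho1 = -1.0e-2 * z
n2 = buoyancy_frequency_squared(z, rho1)

# The compressibility correction g^2 / c^2 lowers N^2 uniformly.
print(n2[0], 9.81**2 / 1500.0**2)
```

With these assumed values the stratification term dominates the $g^2/c^2$ correction, so $N^2$ stays positive; a weaker background gradient would make the correction the leading contribution.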

#### **Momentum**

Concerning the momentum equation, we neglect here the viscous terms. In this framework, $g$ is assumed to be $\mathcal{O}(1/\epsilon)$. We decompose as well the pressure field as follows

$$p = \underbrace{p_0(z)}_{\mathcal{O}(\frac{1}{\epsilon})} + p_1 + p_2 + \mathcal{O}(\epsilon),\tag{43}$$

where $p_0$ and $p_1$ are in hydrostatic balance:

$$\frac{\partial p_0}{\partial z} = -g\rho_0 \quad \text{and} \quad \frac{\partial p_1}{\partial z} = -\epsilon g\rho_1. \tag{44}$$

The momentum equation (15) becomes

$$\begin{split} & \left( \rho_0 + \epsilon (\rho_1 + \rho_2) \right) \mathbb{D}_t u_i - \sum_k \mathrm{d}_t \left\langle \int_0^t \rho (\sigma_s \mathrm{d} \mathbf{B}_s)^k, \int_0^t \frac{\partial}{\partial x_k} \left( \frac{1}{\rho} \frac{\partial \mathrm{d} p_s^\sigma}{\partial x_i} \right) \right\rangle \\ &= -\frac{\partial p_2}{\partial x_i} \mathrm{d} t - \frac{\partial \mathrm{d} p_t^\sigma}{\partial x_i} - \epsilon \rho_2 g \delta_{iz} \mathrm{d} t. \end{split} \tag{45}$$

As in Sect. 4, the covariation term is $\mathcal{O}(\epsilon)$. By neglecting $\mathcal{O}(\epsilon)$ terms, we obtain

$$\mathbb{D}_t u_i = -\frac{1}{\rho_0}\frac{\partial p_2}{\partial x_i}\,\mathrm{d}t - \frac{1}{\rho_0}\frac{\partial \mathrm{d}p_t^{\sigma}}{\partial x_i} + \underbrace{\left(-\epsilon\frac{\rho_2}{\rho_0} g\right)}_{b} \delta_{iz}\,\mathrm{d}t. \tag{46}$$

Finally, $p_2$ and the random pressure are determined through a generalisation of the hydrostatic balance, accounting for a part of non-hydrostatic effects by balancing in the vertical momentum equation the vertical pressure gradient with buoyancy, stochastic diffusion, corrective drift and stochastic advection of $w$. We consider a regime where the hydrostatic approximation in the deterministic framework is only roughly valid (in other words, at the limit of validity), such that a noise with a strong amplitude can break this assumption; or, changing viewpoint, the regime is intermediate and we aim at modelling some weak non-hydrostatic effects through stochastic modelling. By scaling analysis (weak aspect ratio and noise with strong amplitude), $\mathrm{d}_t w$ and $(\mathbf{u} \cdot \nabla)\, w$ are neglected while terms associated with the noise are kept. Indeed, denoting $L_\sigma$ the scale amplitude of $\sigma_t \mathrm{d}\mathbf{B}_t$, and $\tau$ the decorrelation time, the advection of $\sigma_t \mathrm{d}\mathbf{B}_t$ cannot be neglected if<sup>1</sup> $L_\sigma / L_{\text{ref}} \sim 1/(Fr\, D)^2$, and stochastic diffusion and drift velocity are important if $(L_\sigma / L_{\text{ref}})^2 \tau / T_{\text{ref}} \sim 1/(Fr\, D)^2$, with the Froude number $Fr = u_{\text{ref}}/(N H)$. Since $\mathrm{d}_t w$ is neglected, martingale and time-differentiable terms can then be safely separated, such that the remaining pressure term can be determined by vertical integration through a scheme similar to the one applied in the classical hydrostatic regime:

$$\begin{split} p_2 &= \rho_0 \int_{z}^{\eta} \left( \left( \frac{1}{2} \nabla \cdot \mathbf{a} \cdot \nabla \right) w + \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla w) - b \right) \mathrm{d}z \\ \mathrm{d}p_t^{\sigma} &= -\rho_0 \int_{z}^{\eta} \left( \sigma_t \mathrm{d} \mathbf{B}_t \cdot \nabla \right) w \, \mathrm{d}z. \end{split} \tag{47}$$

Here, we have neglected $\mathrm{d}_t w$, but random vertical transport could generate some random vertical acceleration instead of only random pressure fluctuations. An intermediary assumption could be to consider that the time-differentiable part of $\mathrm{d}_t w$ is negligible (classical hydrostatic balance), but that its martingale part is not. It could be obtained by diagnosing the vertical velocity time increment, thus bringing an additional correction to $\mathrm{d}p_t^\sigma$ in Eq. (47).

#### **Summary**

By collecting the Eqs. (42), (46), and (47), we obtain the following stochastic Boussinesq system with thermodynamic forcing

$$\begin{cases} \mathbb{D}_t u_i = -\dfrac{1}{\rho_0} \dfrac{\partial p_2}{\partial x_i} \, \mathrm{d}t - \dfrac{1}{\rho_0} \dfrac{\partial \mathrm{d}p_t^\sigma}{\partial x_i} \quad \text{for } i = \{u, v\} \\[2mm] w = \dfrac{1}{2} (\nabla \cdot \mathbf{a})_z - \displaystyle\int_{-H}^{z} \left( \dfrac{\partial u^\star}{\partial x} + \dfrac{\partial v^\star}{\partial y} \right) \mathrm{d}z \\[2mm] \nabla \cdot (\sigma_t \mathrm{d}\mathbf{B}_t) = 0 \\[2mm] p_2 = \rho_0 \displaystyle\int_{z}^{\eta} \left( \left( \dfrac{1}{2} \nabla \cdot \mathbf{a} \cdot \nabla \right) w + \dfrac{1}{2} \nabla \cdot (\mathbf{a} \nabla w) - b \right) \mathrm{d}z \\[2mm] \mathrm{d}p_t^\sigma = -\rho_0 \displaystyle\int_{z}^{\eta} (\sigma_t \mathrm{d}\mathbf{B}_t \cdot \nabla) \, w \, \mathrm{d}z \\[2mm] \mathbb{D}_t b + \left( w^\star \, \mathrm{d}t + (\sigma_t \mathrm{d}\mathbf{B}_t)_z \right) N^2 - \dfrac{1}{2} \nabla \cdot \left( \mathbf{a}_{\bullet z} N^2 \right) \mathrm{d}t = \dfrac{g \beta_T}{\gamma \rho_0} (\mathrm{d}Q + \mathrm{d}W_d) \\[2mm] \mathrm{d}Q = \mathrm{d}Q_{\text{rad}} + \dfrac{1}{Re\,Pr} \nabla \cdot (\nabla T) \, \mathrm{d}t \\[2mm] \mathrm{d}W_d = \left( \dfrac{1}{2} \nabla \cdot \mathbf{a} \, \mathrm{d}t - \sigma_t \mathrm{d}\mathbf{B}_t \right) \cdot (\nabla p_2 + \rho_0 b \, \mathbf{e}_z) + \dfrac{1}{2} (\nabla \cdot \mathbf{a}) \cdot \nabla \mathrm{d}p_t^\sigma. \end{cases} \tag{48}$$

<sup>1</sup> If the frame rotation is taken into account, the ratio $Ro/Bu$ of the Rossby number to the Burger number is additionally involved, but this does not change the existence of an intermediary regime.

In system (48), the pressure is obtained through a relaxed hydrostatic balance, and the vertical velocity is deduced kinematically from the divergence-free condition of the drift velocity. Neglecting the thermodynamic effects, together with a strong hydrostatic balance assumption (weak to moderate noise regime), we recover the simple Boussinesq system presented in [16], without the Coriolis correction. Obviously, this latter could be added without any major difficulty.
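
The two vertical integrations at the heart of system (48) can be sketched on a single water column as follows. This is only a minimal illustration under simplifying assumptions that are ours: the noise-related terms and the $\frac{1}{2}(\nabla \cdot \mathbf{a})_z$ correction are dropped, the horizontal divergence of the drift velocity and the buoyancy are prescribed constants, and the trapezoidal rule and function names are arbitrary choices.

```python
import numpy as np


def cumulative_from_bottom(z, f):
    """Trapezoidal approximation of int_{-H}^{z} f dz' on the grid z (z[0] = -H)."""
    increments = 0.5 * (f[1:] + f[:-1]) * np.diff(z)
    return np.concatenate(([0.0], np.cumsum(increments)))


def vertical_drift_velocity(z, div_h):
    """Kinematic w*(z) = -int_{-H}^{z} (du*/dx + dv*/dy) dz', with w*(-H) = 0."""
    return -cumulative_from_bottom(z, div_h)


def integrate_to_surface(z, f):
    """Trapezoidal approximation of int_{z}^{eta} f dz', with z[-1] the free surface."""
    from_bottom = cumulative_from_bottom(z, f)
    return from_bottom[-1] - from_bottom


# Hypothetical column: 100 m deep, uniform horizontal divergence and buoyancy.
z = np.linspace(-100.0, 0.0, 201)
div_h = np.full_like(z, 1.0e-6)   # assumed horizontal divergence (1/s)
b = np.full_like(z, -0.01)        # assumed buoyancy (m/s^2)

w_star = vertical_drift_velocity(z, div_h)
p2_over_rho0 = integrate_to_surface(z, -b)  # buoyancy-only part of p_2 / rho_0
```

In a full implementation the integrand of `integrate_to_surface` would also carry the stochastic diffusion and corrective drift terms acting on $w$, and the same routine would be reused at each time step for the martingale pressure increment.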

In some applications, a more accurate evaluation of the buoyancy is required. It can be obtained through an equation of state $\rho_{\text{BQ}}(T, p)$ (salinity is not taken into account here and is left for future works) associated with a transport equation of temperature (and salinity when considered). Under the aforementioned assumptions, the transport equation of temperature (23) is simplified, and the full system can be written

$$\begin{cases} \mathbb{D}_t u_i = -\dfrac{1}{\rho_0} \dfrac{\partial p_2}{\partial x_i} \, \mathrm{d}t - \dfrac{1}{\rho_0} \dfrac{\partial \mathrm{d}p_t^\sigma}{\partial x_i} \quad \text{for } i = \{u, v\} \\[2mm] w = \dfrac{1}{2} (\nabla \cdot \mathbf{a})_z - \displaystyle\int_{-H}^{z} \left( \dfrac{\partial u^\star}{\partial x} + \dfrac{\partial v^\star}{\partial y} \right) \mathrm{d}z \\[2mm] \nabla \cdot (\sigma_t \mathrm{d}\mathbf{B}_t) = 0 \\[2mm] p_2 = \rho_0 \displaystyle\int_{z}^{\eta} \left( \left( \dfrac{1}{2} \nabla \cdot \mathbf{a} \cdot \nabla \right) w + \dfrac{1}{2} \nabla \cdot (\mathbf{a} \nabla w) - b \right) \mathrm{d}z \\[2mm] \mathrm{d}p_t^\sigma = -\rho_0 \displaystyle\int_{z}^{\eta} (\sigma_t \mathrm{d}\mathbf{B}_t \cdot \nabla) \, w \, \mathrm{d}z \\[2mm] \dfrac{\rho}{\gamma} \mathbb{D}_t T = \left( \dfrac{1}{2} \nabla \cdot \mathbf{a} \, \mathrm{d}t - \sigma_t \mathrm{d}\mathbf{B}_t \right) \cdot (\nabla p_2 + \rho_0 b \, \mathbf{e}_z) + \dfrac{1}{2} (\nabla \cdot \mathbf{a}) \cdot \nabla \mathrm{d}p_t^\sigma + \dfrac{1}{Re\,Pr} \nabla \cdot (\nabla T) \, \mathrm{d}t + \mathrm{d}Q_{\text{rad}} \\[2mm] b = -\dfrac{g}{\rho_0} \rho_{\text{BQ}}(T, -\rho_0 g z). \end{cases} \tag{49}$$

Usually, a simple stochastic advection-diffusion equation is considered for the transport of temperature, but in the system (49) the drift works remain. These source/sink terms in the temperature evolution equation are one of the principal outcomes of this study. As outlined above, these additional terms provide a way of parameterising departures from hydrostatic physics and the primitive equations. In the next section, we explore systems at finer resolution.

## **6 Extension to Non-Boussinesq**

The aim of this section is to propose a formulation that relaxes the Boussinesq assumption in the LU stochastic framework while avoiding the resolution of a 3D Poisson equation. We now consider an intermediate model between the fully non-Boussinesq non-hydrostatic formulation and the system (49). For sea water, the equation of state is formulated in terms of $\rho_{\text{BQ}}(T, p)$ instead of $p(\rho, T)$ as in gas dynamics. To take this aspect into consideration, we follow [3] in order to obtain an explicit expression of the pressure. This comes at the cost of resolving sound waves in time, or pseudo-compressibility information propagating at the velocity $c$. The density is decomposed as

$$\rho = \rho_{\text{BQ}}(T, p) + \underbrace{\frac{\partial \rho}{\partial p} \delta p}_{\delta \rho} + \mathcal{O}(\delta p^2), \tag{50}$$

with $\rho_{\text{BQ}}(T, -\rho_0 g z)$ the Boussinesq density determined by the equation of state under a hydrostatic balance condition. The deviation from this density is then assumed to ensue from an isentropic transformation, i.e. to be of acoustic nature. The term $\frac{\partial \rho}{\partial p} = \frac{1}{c^2}$ is then directly related to the sound speed (or more precisely to the fastest wave considered in the model). We now determine a transport equation for $\delta\rho$.

#### **Continuity**

We start from the continuity equation of the stochastic compressible Navier–Stokes equations

$$\mathrm{d}_t \rho + \nabla \cdot \left( \left( (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a}) \, \mathrm{d}t + \sigma_t \, \mathrm{d}\mathbf{B}_t \right) \rho \right) = \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla \rho) \, \mathrm{d}t. \tag{51}$$

A transport equation for $\delta\rho$ can be deduced as

$$\begin{split} \mathrm{d}_t(\delta\rho) &= -\,\mathrm{d}_t(\rho_{\text{BQ}}) - \nabla \cdot \left( \left( (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a}) \, \mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t \right) (\rho_{\text{BQ}} + \delta\rho) \right) \\ &+ \frac{1}{2} \nabla \cdot \left(\mathbf{a} \nabla (\rho_{\text{BQ}} + \delta\rho)\right) \mathrm{d}t. \end{split} \tag{52}$$

At the scales considered here, we assume that the unresolved contribution is of hydrodynamic nature, associated with a divergence-free noise $\nabla \cdot (\sigma_t \mathrm{d}\mathbf{B}_t) = 0$.

#### **Momentum**

The pressure can be decomposed as well as

$$p = p_{\text{atm}} + \int_{z}^{\eta} \rho_{\text{BQ}}(z') g \, \mathrm{d}z' + p_{\text{NH}} + c^2 \delta \rho,\tag{53}$$

where $p_{\text{atm}}$ is the atmospheric pressure, which will be neglected later for simplicity. The second term is the hydrostatic pressure associated with the Boussinesq density. The third term, $p_{\text{NH}}$, is associated with Boussinesq non-hydrostatic effects balancing vertical advection, and finally $c^2 \delta\rho$ corresponds to a non-Boussinesq component of acoustic nature.

For the martingale random pressure, two components are considered: a Boussinesq non-hydrostatic term as in Eq. (48), and a non-Boussinesq component of acoustic nature as in Eq. (27)

$$\mathrm{d}p_t^{\sigma} = -\rho_0 \int_{z}^{\eta} (\sigma_t \mathrm{d}\mathbf{B}_t \cdot \nabla) \, w \, \mathrm{d}z' - \tau c^2 \left( \sigma_t \mathrm{d}\mathbf{B}_t \cdot \nabla \right) \rho. \tag{54}$$

Neglecting the viscous terms, we obtain for the momentum equation

$$\rho \, \mathbb{D}_t u_i - \sum_k \mathrm{d}_t \left\langle \int_0^t \rho (\sigma_s \mathrm{d} \mathbf{B}_s)^k, \int_0^t \frac{\partial}{\partial x_k} \left( \frac{1}{\rho} \frac{\partial \mathrm{d} p_s^\sigma}{\partial x_i} \right) \right\rangle = -\frac{\partial p}{\partial x_i} \, \mathrm{d}t - \frac{\partial \mathrm{d} p_t^\sigma}{\partial x_i} + \rho g_i \, \mathrm{d}t. \tag{55}$$

Assuming that $\rho \, \mathbb{D}_t u_i \approx \rho_{\text{BQ}} \, \mathbb{D}_t u_i$, and following the same arguments as in Sect. 4 to neglect the quadratic variation term, one finally gets

$$\begin{split} \rho_{\text{BQ}} \mathbb{D}_t u &= -\frac{\partial p}{\partial x} \, \mathrm{d}t - \frac{\partial \mathrm{d}p_t^{\sigma}}{\partial x} \\ \rho_{\text{BQ}} \mathbb{D}_t v &= -\frac{\partial p}{\partial y} \, \mathrm{d}t - \frac{\partial \mathrm{d}p_t^{\sigma}}{\partial y} \\ \rho_{\text{BQ}} \mathbb{D}_t w &= -\frac{\partial p}{\partial z} \, \mathrm{d}t - \frac{\partial \mathrm{d}p_t^{\sigma}}{\partial z} + (\rho_{\text{BQ}} + \delta \rho) \, g \, \mathrm{d}t \\ p &= \int_{z}^{\eta} \left( (\rho_{\text{BQ}}(z') + \delta \rho) g + \rho_{\text{BQ}} \left( \left( \frac{1}{2} \nabla \cdot \mathbf{a} \cdot \nabla \right) w + \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla w) \right) \right) \mathrm{d}z' + c^{2} \delta \rho \\ \mathrm{d} p_t^{\sigma} &= -\rho_0 \int_{z}^{\eta} \left( \sigma_t \mathrm{d} \mathbf{B}_t \cdot \nabla \right) w \, \mathrm{d}z' - \tau c^{2} \left( \sigma_t \mathrm{d} \mathbf{B}_t \cdot \nabla \right) \rho . \end{split} \tag{56}$$

The system (52)–(56) can be solved explicitly and does not require the expensive resolution of a 3D Poisson equation. Whereas system (49) proposes a deviation from the hydrostatic hypothesis through the martingale random pressure, the system (56) is a non-hydrostatic model that fully accounts for stochastic vertical accelerations while relaxing the effect of fast-wave truncation through the martingale pressure term. This system remains restricted by a CFL condition depending on the propagation speed of pseudo-compressibility information. We believe this modelling strategy opens new research directions on the role of unresolved small scales in non-hydrostatic and non-Boussinesq effects in oceanic flows.
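
To give an order of magnitude for this restriction, the sketch below evaluates a time-step bound of the usual advective-acoustic form $\Delta t \le \mathrm{CFL} \, \Delta x / (|u|_{\max} + c)$. This specific form, the function name and the numerical values are assumptions made for illustration; the text above only states that the bound depends on the propagation speed $c$.

```python
def max_stable_time_step(dx, u_max, c, cfl=0.5):
    """Time-step bound dt <= cfl * dx / (u_max + c) for an explicit scheme.

    dx    : grid spacing (m)
    u_max : largest resolved flow speed (m/s)
    c     : propagation speed of the fastest wave kept in the model (m/s)
    cfl   : target Courant number (assumed value)
    """
    return cfl * dx / (u_max + c)


# Hypothetical 1 km grid with 1 m/s currents.
dt_sound = max_stable_time_step(dx=1000.0, u_max=1.0, c=1500.0)   # physical sound speed
dt_pseudo = max_stable_time_step(dx=1000.0, u_max=1.0, c=50.0)    # reduced pseudo-sound speed
print(dt_sound, dt_pseudo)
```

Under these assumed values, lowering $c$ from the physical sound speed to a pseudo-compressibility speed relaxes the admissible time step by more than an order of magnitude, which is the practical motivation for treating $c$ as the fastest wave considered in the model.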

# **7 Conclusion**

This paper proposes a stochastic representation under location uncertainty of the compressible Navier–Stokes equations. It has been obtained from the conservation of density, momentum and total energy, undergoing a stochastic transport. The structure of the equations remains similar to the compressible deterministic case. Nevertheless, because of the specificities related to stochastic transport, we have identified additional terms such as the work induced by the alignment between the time-differentiable pressure gradient and the drift velocity. This small-scale-induced work is akin to the baropycnal work known in compressible large eddy simulations, and also includes terms reminiscent of the mass flux parameterisation of atmospheric and oceanic penetrative convection phenomena. These terms are obtained by means of a rigorous derivation from the conservation laws coupled with the stochastic calculus rules associated with stochastic transport, instead of phenomenological arguments.

We have verified that applying low-Mach and Boussinesq approximations to the stochastic compressible system enables us to recover the known incompressible and simple Boussinesq stochastic systems, respectively. The general set of stochastic compressible equations allowed us to incorporate thermodynamic effects into the Boussinesq system. Finally, this formulation has led us to propose a way to relax the Boussinesq and hydrostatic assumptions. This study opens new research directions for exploiting the potential of stochastic modelling in the numerical simulation of oceanic flows, which we will pursue in future works.

**Acknowledgments** The authors acknowledge the support of the ERC EU project 856408-STUOD and warmly thank Valentin Resseguier for very fruitful and stimulating discussions on this work.

# **Appendix A: Stochastic Reynolds Transport Theorem from Stratonovich to Itô**

The aim of this section is to rewrite the stochastic Reynolds transport theorem with the Stratonovich convention, and to verify that we get the same relation as the one obtained in Sect. 2 in the Itô setting. To that end, we follow the steps of [16, Appendix D] but use the Stratonovich convention. Then, we pass from the Stratonovich to the Itô form. These forms are equivalent for regular enough processes, thus allowing us to verify the consistency of Eq. (9).

To pass from Itô to Stratonovich, we use the following relation

$$X_t \circ \mathrm{d}Y_t = X_t \mathrm{d}Y_t + \frac{1}{2} \mathrm{d}_t \left\langle \int_0^t \mathrm{d}X_s, \int_0^t \mathrm{d}Y_s \right\rangle. \tag{57}$$
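
Relation (57) can be illustrated numerically in the scalar case $X = Y = B$, a standard Brownian motion. The following minimal sketch approximates the Itô integral by a left-point sum and the Stratonovich integral by a midpoint sum; the discretisation, sample size and random seed are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 200_000
dt = T / n

# Brownian increments and the associated path, with B_0 = 0.
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate(([0.0], np.cumsum(dB)))

ito = np.sum(B[:-1] * dB)                             # left-point rule: int B dB (Ito)
stratonovich = np.sum(0.5 * (B[:-1] + B[1:]) * dB)    # midpoint rule: int B o dB
half_quadratic_variation = 0.5 * np.sum(dB**2)        # (1/2) <B, B>_T, close to T / 2

print(stratonovich - ito, half_quadratic_variation)
```

The difference between the two sums equals the discrete half quadratic variation exactly, and the latter approaches $T/2$ as the time step is refined, which is the content of the covariation correction in (57).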

We define the characteristic function *φ(x, t)* transported by the flow, such that

$$
\phi(X_t(\mathbf{x}_0)) = g(\mathbf{x}_0),
\tag{58}
$$

with a compact spatial support $\mathcal{V}(t)$ of non-zero values that does not include points on the domain boundary. We can then write

$$\begin{split} \mathrm{d} \int_{\mathcal{V}(t)} (q\phi)(\mathbf{x}, t) \, \mathrm{d}\mathbf{x} &= \mathrm{d} \int_{\Omega} (q\phi)(\mathbf{x}, t) \, \mathrm{d}\mathbf{x} \\ &= \int_{\Omega} \left( \mathrm{d}_t \circ q \, \phi + q \, \mathrm{d}_t \circ \phi \right) \mathrm{d}\mathbf{x} . \end{split} \tag{59}$$

Since *φ* is transported, we have

$$\begin{aligned} \mathrm{d}\phi(X_t, t) &= \mathrm{d}_t \circ \phi + \nabla \phi \cdot \mathrm{d}X_t = 0 \\ \mathrm{d}_t \circ \phi + (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a} + \frac{1}{2} \sigma_t (\nabla \cdot \sigma_t)) \cdot \nabla \phi \, \mathrm{d}t + (\nabla \phi \cdot \sigma_t) \circ \mathrm{d}\mathbf{B}_t &= 0. \end{aligned} \tag{60}$$

We have then

$$\begin{split} & \mathrm{d} \int_{\mathcal{V}(t)} (q\phi)(\mathbf{x}, t) \, \mathrm{d}\mathbf{x} \\ &= \int_{\Omega} \left[ \mathrm{d}_t \circ q \, \phi - q \left( (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a} + \frac{1}{2} \sigma_t (\nabla \cdot \sigma_t)) \cdot \nabla \phi \, \mathrm{d}t + (\nabla \phi \cdot \sigma_t) \circ \mathrm{d}\mathbf{B}_t \right) \right] \mathrm{d}\mathbf{x} \\ &= \int_{\Omega} \left[ \mathrm{d}_t \circ q + \nabla \cdot \left( q \left( (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a} + \frac{1}{2} \sigma_t (\nabla \cdot \sigma_t)) \, \mathrm{d}t + \sigma_t \circ \mathrm{d}\mathbf{B}_t \right) \right) \right] \phi \, \mathrm{d}\mathbf{x} . \end{split} \tag{61}$$

We add now a force and obtain the SRTT in Stratonovich form:

$$\mathrm{d}_t \circ q + \nabla \cdot \left( q \left( (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a} + \frac{1}{2} \sigma_t (\nabla \cdot \sigma_t)) \, \mathrm{d}t + \sigma_t \circ \mathrm{d}\mathbf{B}_t \right) \right) = Q_t \, \mathrm{d}t + Q_\sigma \circ \mathrm{d}\mathbf{B}_t. \tag{62}$$

Let us now write this in Itô form:

$$\begin{split} \frac{\partial}{\partial x_i} \left( q \sigma_{t,ij} \circ \mathrm{d} B_t^{j} \right) &= \frac{\partial}{\partial x_i} \left( q \sigma_{t,ij} \mathrm{d} B_t^{j} \right) \\ &+ \frac{1}{2} \underbrace{\mathrm{d}_t \left\langle \int_0^t \mathrm{d}_t \left( \frac{\partial}{\partial x_i} \left( q \sigma_{s,ij} \right) \right), \int_0^t \mathrm{d} B_s^{j} \right\rangle}_{J} . \end{split} \tag{63}$$

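Since $\sigma_t$ is time differentiable in the Eulerian grid, we have

$$\mathrm{d}_t \left\langle \int_0^t \mathrm{d}_t \sigma_{s,ij} , \int_0^t \mathrm{d} B_s^j \right\rangle = 0.$$

Then,

$$\begin{split} J &= \mathrm{d}_t \left\langle \int_0^t \frac{\partial}{\partial x_i} \left( \mathrm{d}_t q \, \sigma_{s,ij} \right), \int_0^t \mathrm{d}B_s^j \right\rangle \\ &= \mathrm{d}_t \left\langle \int_0^t \frac{\partial}{\partial x_i} \left( \left( Q_\sigma \cdot \mathrm{d}\mathbf{B}_s - \frac{\partial}{\partial x_l} \left( q \sigma_{s,lm} \mathrm{d}B_s^m \right) \right) \sigma_{s,ij} \right), \int_0^t \mathrm{d}B_s^j \right\rangle \\ &= \frac{\partial}{\partial x_i} \left( Q_\sigma^j \sigma_{t,ij} \right) \mathrm{d}t - \frac{\partial}{\partial x_i} \left( \frac{\partial}{\partial x_l} \left( q \sigma_{t,lj} \right) \sigma_{t,ij} \right) \mathrm{d}t \\ &= \frac{\partial}{\partial x_i} \left( Q_\sigma^j \sigma_{t,ij} \right) \mathrm{d}t - \frac{\partial}{\partial x_i} \left( \frac{\partial q}{\partial x_l} \sigma_{t,lj} \sigma_{t,ij} \right) \mathrm{d}t - \frac{\partial}{\partial x_i} \left( q \frac{\partial \sigma_{t,lj}}{\partial x_l} \sigma_{t,ij} \right) \mathrm{d}t \\ &= \nabla \cdot (\sigma_t Q_\sigma) \, \mathrm{d}t - \nabla \cdot (\mathbf{a} \nabla q) \, \mathrm{d}t - \nabla \cdot (q \sigma_t (\nabla \cdot \sigma_t)) \, \mathrm{d}t. \end{split} \tag{64}$$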

In addition, we make the hypothesis that $\mathrm{d}Q_\sigma$ is time-differentiable in the Lagrangian frame, such that we have

$$\begin{split} \mathrm{d} \int_{\mathcal{V}(t)} Q^j_{\sigma} \, \mathrm{d}\mathbf{x} &= \int_{\mathcal{V}(t)} \left( \mathrm{d}_t Q^j_{\sigma} + \nabla \cdot (Q^j_{\sigma} (\mathbf{u}^\star \mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t)) - \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla Q^j_{\sigma}) \, \mathrm{d}t \right) \mathrm{d}\mathbf{x} \\ &= \int_{\mathcal{V}(t)} F \, \mathrm{d}t \, \mathrm{d}\mathbf{x}. \end{split} \tag{65}$$

We can then write


Assembling everything and dropping the space integral, we obtain

$$\begin{split} & \mathrm{d}_t \circ q + \nabla \cdot \left( q \left( (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a} + \frac{1}{2} \sigma_t (\nabla \cdot \sigma_t)) \, \mathrm{d}t + \sigma_t \circ \mathrm{d}\mathbf{B}_t \right) \right) \\ & - Q_t \, \mathrm{d}t - Q_\sigma \circ \mathrm{d}\mathbf{B}_t \\ & = \mathrm{d}_t q + \nabla \cdot \left( q \left( (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a} + \frac{1}{2} \sigma_t (\nabla \cdot \sigma_t)) \, \mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t \right) \right) \\ & + \frac{1}{2} \left[ \nabla \cdot (\sigma_t Q_\sigma) \, \mathrm{d}t - \nabla \cdot (\mathbf{a} \nabla q) \, \mathrm{d}t - \nabla \cdot (q \sigma_t (\nabla \cdot \sigma_t)) \, \mathrm{d}t \right] \\ & - Q_t \, \mathrm{d}t - Q_\sigma \mathrm{d}\mathbf{B}_t + \frac{1}{2} \nabla \cdot (\sigma_t Q_\sigma) \, \mathrm{d}t . \end{split} \tag{67}$$

We note that $\mathrm{d}_t \circ q = \mathrm{d}_t q$ since $\frac{1}{2} \mathrm{d}_t \left\langle \int_0^t \mathrm{d}s, \int_0^t q \right\rangle = 0$. After simplification, we obtain

$$\begin{split} \mathrm{d}_t q + \nabla \cdot \left( q \left( (\mathbf{u} - \frac{1}{2} \nabla \cdot \mathbf{a}) \, \mathrm{d}t + \sigma_t \, \mathrm{d}\mathbf{B}_t \right) \right) + \nabla \cdot (\sigma_t Q_\sigma) \, \mathrm{d}t \\ = \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla q) \, \mathrm{d}t + Q_t \, \mathrm{d}t + Q_\sigma \mathrm{d}\mathbf{B}_t, \end{split} \tag{68}$$

which is exactly Eq. (9).

To obtain Eq. (68), we had to assume that $\mathrm{d}Q_\sigma$ is time-differentiable in the Lagrangian frame, which renders the demonstration slightly more restrictive concerning the shape of the forces. No such assumption is needed in the Itô form, and we consider the present appendix as a sanity check of Eq. (9).

## **Appendix B: Calculation Rules**

## *Distributivity of the Stochastic Transport Operator*

The distributivity of the stochastic transport operator is detailed in this section, in the case where the stochastic transport operator is balanced by a random right-hand side. If the evolution of two variables $f$ and $g$ is given by

$$\begin{aligned} \mathbb{D}_t f &= F_t \,\mathrm{d}t + \mathbf{F}_\sigma \cdot \mathrm{d}\mathbf{B}_t \\ \mathbb{D}_t g &= G_t \,\mathrm{d}t + \mathbf{G}_\sigma \cdot \mathrm{d}\mathbf{B}_t, \end{aligned} \tag{69}$$

we have then

$$\mathbb{D}_t(fg) = f \, \mathbb{D}_t g + g \, \mathbb{D}_t f + \mathbf{F}_\sigma \cdot \mathbf{G}_\sigma \, \mathrm{d}t - (\sigma_t(\mathbf{F}_\sigma) \cdot \nabla) \, g \, \mathrm{d}t - (\sigma_t(\mathbf{G}_\sigma) \cdot \nabla) \, f \, \mathrm{d}t,\tag{70}$$

or less formally:

$$\begin{split} \mathbb{D}_t(fg) &= f \, \mathbb{D}_t g + g \, \mathbb{D}_t f \\ &\quad + \mathrm{d}_t \left\langle \int_0^t \mathbf{F}_\sigma \cdot \mathrm{d}\mathbf{B}_s , \int_0^t \mathbf{G}_\sigma \cdot \mathrm{d}\mathbf{B}_s \right\rangle \\ &\quad - \mathrm{d}_t \left\langle \int_0^t \mathbf{F}_\sigma \cdot \mathrm{d}\mathbf{B}_s , \int_0^t \left(\sigma_s \mathrm{d}\mathbf{B}_s \cdot \nabla\right) g \right\rangle \\ &\quad - \mathrm{d}_t \left\langle \int_0^t \mathbf{G}_\sigma \cdot \mathrm{d}\mathbf{B}_s , \int_0^t \left(\sigma_s \mathrm{d}\mathbf{B}_s \cdot \nabla\right) f \right\rangle. \end{split} \tag{71}$$

This relation is useful to transform the conservative form of the Navier-Stokes equations into non-conservative form.
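
The role of the covariation term $\mathbf{F}_\sigma \cdot \mathbf{G}_\sigma \, \mathrm{d}t$ in Eq. (70) can be checked numerically in the simplest setting, where $f$ and $g$ do not depend on space, so that the transport operator reduces to the time increment and the advection-related terms vanish. The sketch below relies on assumptions of ours: constant coefficient vectors `F` and `G`, a two-dimensional Brownian motion, and an arbitrary seed and time step.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 1.0, 200_000
dt = T / n

# Two independent Brownian components and constant noise coefficients (assumed values).
dB = rng.normal(0.0, np.sqrt(dt), size=(n, 2))
F = np.array([1.0, 2.0])
G = np.array([0.5, -1.0])

# Increments df = F . dB and dg = G . dB, and the paths f, g started from zero.
df = dB @ F
dg = dB @ G
f = np.concatenate(([0.0], np.cumsum(df)))
g = np.concatenate(([0.0], np.cumsum(dg)))

product_increment = f[-1] * g[-1]                   # (fg)(T) - (fg)(0)
ito_terms = np.sum(f[:-1] * dg + g[:-1] * df)       # int f dg + int g df (Ito sums)
covariation = np.sum(df * dg)                       # discrete <f, g>_T, close to F . G * T

print(product_increment - ito_terms, covariation, F @ G * T)
```

The product increment differs from the sum of the two Itô integrals by the covariation, which converges to $\mathbf{F}_\sigma \cdot \mathbf{G}_\sigma \, T$; the remaining two terms of Eq. (70) only appear once $f$ and $g$ are advected by the noise.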

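#### *Proof*

$$\begin{split} \mathbb{D}_t(fg) &= \mathrm{d}_t(fg) + \left( (\mathbf{u}^\star \mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t) \cdot \nabla \right) (fg) - \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla (fg)) \, \mathrm{d}t \\ &= f \, \mathrm{d}_t g + g \, \mathrm{d}_t f + \mathrm{d}_t \langle f, g \rangle + f \left( (\mathbf{u}^\star \mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t) \cdot \nabla \right) g + g \left( (\mathbf{u}^\star \mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t) \cdot \nabla \right) f \\ &\quad - \frac{1}{2} \nabla \cdot (f \mathbf{a} \nabla g + g \mathbf{a} \nabla f) \, \mathrm{d}t \\ &= f \, \mathrm{d}_t g + g \, \mathrm{d}_t f + \mathrm{d}_t \left\langle \int_0^t - (\sigma_s \mathrm{d}\mathbf{B}_s \cdot \nabla) f + \mathbf{F}_\sigma \cdot \mathrm{d}\mathbf{B}_s, \int_0^t - (\sigma_s \mathrm{d}\mathbf{B}_s \cdot \nabla) g + \mathbf{G}_\sigma \cdot \mathrm{d}\mathbf{B}_s \right\rangle \\ &\quad + f \left( (\mathbf{u}^\star \mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t) \cdot \nabla \right) g + g \left( (\mathbf{u}^\star \mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t) \cdot \nabla \right) f \\ &\quad - \frac{1}{2} \left[ f \nabla \cdot (\mathbf{a} \nabla g) + ((\mathbf{a} \nabla g) \cdot \nabla) f + g \nabla \cdot (\mathbf{a} \nabla f) + ((\mathbf{a} \nabla f) \cdot \nabla) g \right] \mathrm{d}t. \end{split} \tag{72}$$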

Developing only the covariation term:

$$\begin{split} & \mathbf{d}\_{l} \Big\langle \int\_{0}^{l} - \left( \boldsymbol{\sigma}\_{s} \mathbf{d} \mathbf{B}\_{s} \cdot \nabla \right) f + \mathbf{F}\_{\sigma} \cdot \mathbf{d} \mathbf{B}\_{s}, \int\_{0}^{l} - \left( \boldsymbol{\sigma}\_{s} \mathbf{d} \mathbf{B}\_{s} \cdot \nabla \right) g + \mathbf{G}\_{\sigma} \cdot \mathbf{d} \mathbf{B}\_{s} \Big\rangle = \\ & \mathbf{d}\_{l} (\mathbf{(a} \nabla f) \cdot \nabla) g \, \mathbf{d} \mathbf{r} \\ & \quad + \mathbf{d}\_{l} \left\langle \int\_{0}^{l} \mathbf{F}\_{\sigma} \cdot \mathbf{d} \mathbf{B}\_{s}, \int\_{0}^{l} \mathbf{G}\_{\sigma} \cdot \mathbf{d} \mathbf{B}\_{s} \right\rangle \\ & \quad - \mathbf{d}\_{l} \left\langle \int\_{0}^{l} \mathbf{F}\_{\sigma} \cdot \mathbf{d} \mathbf{B}\_{s}, \int\_{0}^{l} \left( \boldsymbol{\sigma}\_{s} \mathbf{d} \mathbf{B}\_{s} \cdot \nabla \right) \mathbf{g} \right\rangle \\ & \quad - \mathbf{d}\_{l} \left\langle \int\_{0}^{l} \mathbf{G}\_{\sigma} \cdot \mathbf{d} \mathbf{B}\_{s}, \int\_{0}^{l} \left( \boldsymbol{\sigma}\_{s} \mathbf{d} \mathbf{B}\_{s} \cdot \nabla \right) f \right\rangle, \end{split} \tag{73}$$

whose right hand side is written formally

$$((\mathbf{a} \nabla f) \cdot \nabla)\, g \,\mathrm{d}t + \boldsymbol{F}_{\sigma} \cdot \boldsymbol{G}_{\sigma} \,\mathrm{d}t - (\sigma_t(\boldsymbol{F}_{\sigma}) \cdot \nabla)\, g \,\mathrm{d}t - (\sigma_t(\boldsymbol{G}_{\sigma}) \cdot \nabla)\, f \,\mathrm{d}t. \tag{74}$$

Substituting (74) into (72), we obtain Eq. (70).

## *Work of Random Forces*

For the sake of clarity, we first detail the calculation rules in the context of point mechanics in order to define the work of random forces. We consider a random force whose impulse is a martingale d*F<sup>σ</sup>* . We write its elementary work in a weak sense, for any differentiable function *φ(t)* such that *φ(*0*)* = *φ(T )* = 0:

$$\int\_{0}^{T} \phi(t) \mathrm{d}W\_{\sigma} = \int\_{0}^{T} \phi(t) \left(\frac{\partial}{\partial t} \int\_{0}^{t} \mathrm{d}F\_{\sigma}\right) \cdot \mathrm{d}X. \tag{75}$$

The random force $\frac{\partial}{\partial t}\int_0^t \mathrm{d}F_{\sigma}$ is written here formally, since d*F<sup>σ</sup>* is a martingale and cannot be differentiated in time. The work in Eq. (75) can be split into two contributions: d*Wσ,u* associated with the displacement *u* d*t*, and d*Wσ,σ* associated with the displacement *σt*d*B<sup>t</sup>* . We treat them separately.

$$\begin{split} \int\_{0}^{T} \phi(t) \mathrm{d}W\_{\sigma,u} &= \int\_{0}^{T} \phi(t) \left( \frac{\partial}{\partial t} \int\_{0}^{t} \mathrm{d}F\_{\sigma} \right) \cdot \mathbf{u}(\mathbf{x},t) \, \mathrm{d}t. \\ &= - \int\_{0}^{T} \left( \int\_{0}^{t} \mathrm{d}F\_{\sigma} \right) \cdot (\phi(t)\mathbf{u}(\mathbf{x},t))' \, \mathrm{d}t. \end{split} \tag{76}$$

The last expression is well defined and is a proper way to write this term. We can remark that $\int_0^t \mathrm{d}F_{\sigma}$ is homogeneous to a Brownian motion. If we expand the following expression for a time-differentiable function *ψ(x, t)*

$$\begin{aligned} \int_0^T \mathrm{d}\left(\boldsymbol{\psi}(\mathbf{x},t) \cdot \int_0^t \mathrm{d}F_{\sigma}\right) &= \int_0^T \left(\int_0^t \mathrm{d}F_{\sigma}\right) \cdot \mathrm{d}\boldsymbol{\psi}(\mathbf{x},t) + \int_0^T \boldsymbol{\psi}(\mathbf{x},t)\, \mathrm{d}F_{\sigma} \\ &= \int_0^T \left(\int_0^t \mathrm{d}F_{\sigma}\right) \cdot \boldsymbol{\psi}'(\mathbf{x},t) \,\mathrm{d}t + \int_0^T \boldsymbol{\psi}(\mathbf{x},t)\, \mathrm{d}F_{\sigma}, \end{aligned} \tag{77}$$

by taking *ψ(x, t)* = *φ(t)u(x, t)*, we obtain

$$\begin{aligned} \int_0^T \phi(t)\, \mathrm{d}W_{\sigma,u} &= \int_0^T \phi(t)\, \mathbf{u}(\mathbf{x},t)\, \mathrm{d}F_{\sigma} - \int_0^T \mathrm{d}\left(\phi(t)\, \mathbf{u}(\mathbf{x},t) \cdot \int_0^t \mathrm{d}F_{\sigma}\right) \\ &= \int_0^T \phi(t)\, \mathbf{u}(\mathbf{x},t)\, \mathrm{d}F_{\sigma} - \phi(T)\, \mathbf{u}(\mathbf{x},T) \cdot F_{\sigma}(\mathbf{x},T) \\ &= \int_0^T \phi(t)\, \mathbf{u}(\mathbf{x},t)\, \mathrm{d}F_{\sigma}. \end{aligned} \tag{78}$$

We can identify

$$\mathrm{d}W_{\sigma,u} = \mathbf{u}(\mathbf{x},t)\, \mathrm{d}F_{\sigma}.\tag{79}$$

The second term d*Wσ,σ* is not well defined, even in weak form. Informally, it should balance the kinetic energy of *σt*d*B<sup>t</sup>* , which is itself not well defined (possibly infinite) and which has not been considered in the definition of total energy. Discarding this term is consistent with the derivation of the momentum equation in [15], where the highly irregular acceleration associated with *σt*d*B<sup>t</sup>* is assumed to be balanced by force components that are equally irregular. As a consequence, in our model there is no work of the random forces associated with the Brownian displacement of the control surface.

## **Appendix C: Displacement of a Transported Control Surface**

Let us apply the SRTT to a characteristic function (*q* = 1 in *Ω(t)*, *q* = 0 outside) transported by the flow [16], and use the divergence theorem. We obtain the volume variation associated with a control surface transported by the stochastic flow.

$$\begin{split} \mathrm{d}V(t) = \mathrm{d} \int_{\Omega(t)} 1 \,\mathrm{d}\boldsymbol{x} &= \int_{\Omega(t)} \nabla \cdot (\boldsymbol{u}^{\star} \,\mathrm{d}t + \sigma_t \mathrm{d}\boldsymbol{B}_t) \,\mathrm{d}\boldsymbol{x} \\ &= \int_{\partial\Omega(t)} \underbrace{(\boldsymbol{u}^{\star} \,\mathrm{d}t + \sigma_t \mathrm{d}\boldsymbol{B}_t)}_{\mathrm{d}\boldsymbol{X}_{d,t}} \cdot \boldsymbol{n} \,\mathrm{d}S. \end{split} \tag{80}$$

Hence, the normal displacement of the control surface is $\mathrm{d}\boldsymbol{X}_{d,t} \cdot \boldsymbol{n}$, which involves the modified advection velocity. As a consequence, the modified advection velocity has to be considered in the definitions of elementary works based on surface integrals.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Data Driven Stochastic Primitive Equations with Dynamic Modes Decomposition**

**Francesco L. Tucciarone, Etienne Mémin, and Long Li**

**Abstract** As planetary flows are characterised by the interaction of phenomena over a huge range of scales, it is unaffordable today to resolve numerically the complete ocean dynamics. In this work, a stochastic version of the primitive equations is implemented in the NEMO community ocean model to assess the capability of the so-called Location Uncertainty framework to represent the small scales of ocean flows.

**Keywords** Stochastic parametrization · Ocean modelling

# **1 Introduction**

Numerical resolution of planetary flows is nowadays a key tool to investigate possible future climates. The Ocean is a key actor of climate regulation and its evolution is for that reason of major interest. High resolution simulations are however extremely expensive and their usage remains limited to small domains. Large-scale simulations remain the primary tool to investigate future states of the Ocean (and of the Atmosphere as well). These simulations however do not resolve the complex interdependence of mesoscale and sub-mesoscale dynamics that characterises the global circulation, and thus great care must be put in the choice of the parametrization of all the scales that are too small to be efficiently resolved. Recent approaches incorporate noise terms into the dynamics of the flow with the goal of modelling the unresolved (and parametrised) processes, including small-scale turbulence, boundary value and scale coarsening uncertainty, as well as discretization and numerical errors. Rigorously justified methodologies have been introduced by Mémin [1] and Holm [2], providing a theoretically consistent stochastic large-scale representation of the Navier-Stokes equations [3] conserving either energy or circulation, respectively. Both models rely on a stochastic decomposition of the Lagrangian trajectory into a smooth-in-time component induced by the large-scale velocity and a random fast-evolving uncorrelated part. From such models, a large-scale representation with a stochastic definition of the small-scale effect emerges naturally. Moreover, compared to classical large-scale deterministic modelling, the additional degree of freedom brought by the stochastic component allows us to devise new intermediate models [4, 5, 6, 7, 8]. The Location Uncertainty (LU) approach [1] has been tested within the barotropic quasi-geostrophic model, the rotating shallow water model and the surface quasi-geostrophic model, where it has proven to be more accurate in structuring the large-scale flow [4], reproducing long-term statistics [9] and providing a good trade-off between model error representation and ensemble spread [10, 11]. This work investigates the benefits of such a model in the hydrostatic primitive equations, following the work of [12] in which noise based on Empirical Orthogonal Functions (EOF) was proposed. Here, a more elaborate noise defined from a Dynamic Mode Decomposition (DMD) strategy is proposed.

F. L. Tucciarone (✉) · E. Mémin · L. Li

INRIA Centre de l'Université de Rennes, UMR CNRS 6625, Rennes, France e-mail: francesco.tucciarone@inria.fr

B. Chapron et al. (eds.), *Stochastic Transport in Upper Ocean Dynamics II*, Mathematics of Planet Earth 11, https://doi.org/10.1007/978-3-031-40094-0\_15

## **2 Location Uncertainty (LU)**

The Location Uncertainty principle consists in describing the trajectory **X***<sup>t</sup>* of a fluid particle with a stochastic decomposition of the Lagrangian trajectory, represented with the following stochastic differential equation (SDE):

$$\mathrm{d}\mathbf{X}_t = \mathbf{v}\left(\mathbf{X}_t, t\right)\,\mathrm{d}t + \sigma\left(\mathbf{X}_t, t\right)\,\mathrm{d}\mathbf{B}_t,\tag{1}$$

where $\mathbf{X}: \Omega \times \mathbb{R}^{+} \to \Omega$ is the fluid flow map, i.e. the trajectory followed by fluid particles starting at the initial map $\mathbf{X}|_{t=0} = \mathbf{x}_0$ of the bounded domain $\Omega \subset \mathbb{R}^3$. The first component, $\mathbf{v}(\mathbf{X}_t, t)$, acts as the smooth-in-time component of the (Lagrangian) velocity of the flow, which is correlated both in space and time and associated with the integration of the equations of motion. The second component, $\sigma(\mathbf{X}_t, t)\,\mathrm{d}\mathbf{B}_t$, is a stochastic contribution (referred to as noise) that accounts for the processes that cannot be resolved at a given resolution or that have been neglected through a given numerical or physical modelling approximation. To completely define this last component, let $H = L^2(\mathcal{S}, \mathbb{R}^d)$ be the Hilbert space of square-integrable functions over $\mathcal{S}$ with values in $\mathbb{R}^d$, with the inner product $\langle \mathbf{f}, \mathbf{g} \rangle_H = \int_{\mathcal{S}} (\mathbf{f}^{\dagger}\mathbf{g})\,\mathrm{d}\mathbf{x}$ and induced norm $\|\mathbf{f}\|_H = \langle \mathbf{f}, \mathbf{f} \rangle_H^{1/2}$, and let $T$ be a finite time, $T < +\infty$. In this framework $\{\mathbf{B}_t\}_{0 \le t \le T}$ is defined as an $H$-valued (cylindrical) Brownian motion [13]:

$$\mathbf{B}_t = \sum_{i \in \mathbb{N}} \hat{\beta}^{i} \mathbf{e}_i, \tag{2}$$

where $(\mathbf{e}_i)_{i \in \mathbb{N}}$ is a Hilbertian orthonormal basis of $H$ and $(\hat{\beta}^{i})_{i \in \mathbb{N}}$ is a sequence of independent standard Brownian motions on a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in [0,T]}, \mathbb{P})$. The noise is then properly defined as the application of a Hilbert-Schmidt symmetric integral kernel, $\sigma_t \mathbf{f}(\mathbf{x}) = \int_{\mathcal{S}} \breve{\sigma}(\mathbf{x}, \mathbf{y}, t)\, \mathbf{f}(\mathbf{y})\,\mathrm{d}\mathbf{y}$, to the $H$-valued cylindrical Wiener process $\mathbf{B}$ as

$$(\sigma_t \,\mathrm{d}\mathbf{B}_t)^{i}\,(\mathbf{x}) = \int_{\mathcal{S}} \breve{\sigma}_{ik}\,(\mathbf{x}, \mathbf{y}, t) \,\mathrm{d}\mathbf{B}_t^{k}\,(\mathbf{y}) \,\mathrm{d}\mathbf{y},\tag{3}$$

where the Einstein summation notation is adopted. The role of the integrable kernel *σ*˘ is to impose a fast/small scale spatial correlation. It leads to the covariance tensor *Q*

$$\begin{aligned} Q_{ij} \left( \mathbf{x}, \mathbf{y}, t, s \right) &= \mathbb{E} \left[ \left( \sigma_t \mathrm{d}\mathbf{B}_t \left( \mathbf{x} \right) \right)^{i} \left( \sigma_s \mathrm{d}\mathbf{B}_s \left( \mathbf{y} \right) \right)^{j} \right] \\ &= \delta \left( t - s \right) \mathrm{d}t \int_{\mathcal{S}} \breve{\sigma}_{ik} \left( \mathbf{x}, \mathbf{z}, t \right) \breve{\sigma}_{kj} \left( \mathbf{z}, \mathbf{y}, s \right) \mathrm{d}\mathbf{z}, \end{aligned}$$

of the centred Gaussian process $\sigma_t \mathrm{d}\mathbf{B}_t \sim \mathcal{N}(0, \mathbf{Q})$. The diagonal components of the covariance tensor per unit of time, defined as $\mathbf{a}(\mathbf{x}, t)\,\delta(t - t')\,\mathrm{d}t = \mathbf{Q}(\mathbf{x}, \mathbf{x}, t, t')$, are referred to as the variance tensor. This tensor provides a measure of the strength of the noise. Notably, the variance tensor has the dimension of a viscosity (m<sup>2</sup> s<sup>−1</sup>) and is symmetric and positive definite. The operator $\mathbf{a}$ is a compact self-adjoint positive definite operator on $H$, and hence admits eigenfunctions $\boldsymbol{\xi}_n(\cdot, t)$ with (strictly) positive eigenvalues $\lambda_n(t)$ satisfying $\sum_{n \in \mathbb{N}} \lambda_n(t) < +\infty$. As a consequence, the noise and the variance tensor $\mathbf{a}$ can be expressed (with another sequence of independent standard Brownian motions) through the spectral representation

$$\sigma_t \mathrm{d}\mathbf{B}_t(\mathbf{x}) = \sum_{n \in \mathbb{N}} \lambda_n^{1/2}(t) \,\boldsymbol{\xi}_n(\mathbf{x}, t) \,\mathrm{d}\beta_n, \tag{4}$$

$$\mathbf{a}\left(\mathbf{x},t\right) = \sum_{n \in \mathbb{N}} \lambda_n\left(t\right) \boldsymbol{\xi}_n\left(\mathbf{x},t\right) \boldsymbol{\xi}_n^{\dagger}\left(\mathbf{x},t\right). \tag{5}$$
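The spectral representation (4)-(5) can be illustrated with a truncated sum evaluated at a single point. The sketch below is only indicative: the two real-valued modes and eigenvalues are made-up numbers, the function names are hypothetical, and the Brownian increments are drawn as independent Gaussians of variance d*t*. It builds the variance tensor from Eq. (5) and checks by Monte Carlo that the covariance of the increments of Eq. (4), divided by d*t*, approaches it.

```python
import math
import random


def variance_tensor(eigvals, modes):
    """Return a = sum_n lambda_n xi_n xi_n^T at one point (Eq. 5, truncated)."""
    d = len(modes[0])
    a = [[0.0] * d for _ in range(d)]
    for lam, xi in zip(eigvals, modes):
        for i in range(d):
            for j in range(d):
                a[i][j] += lam * xi[i] * xi[j]
    return a


def noise_increment(eigvals, modes, dt, rng):
    """Draw sum_n sqrt(lambda_n) xi_n dbeta_n with dbeta_n ~ N(0, dt) (Eq. 4)."""
    d = len(modes[0])
    inc = [0.0] * d
    for lam, xi in zip(eigvals, modes):
        dbeta = rng.gauss(0.0, math.sqrt(dt))
        for i in range(d):
            inc[i] += math.sqrt(lam) * xi[i] * dbeta
    return inc


def empirical_variance_tensor(eigvals, modes, dt, n_samples, seed=0):
    """Monte Carlo estimate of E[(sigma dB)(sigma dB)^T] / dt."""
    rng = random.Random(seed)
    d = len(modes[0])
    acc = [[0.0] * d for _ in range(d)]
    for _ in range(n_samples):
        inc = noise_increment(eigvals, modes, dt, rng)
        for i in range(d):
            for j in range(d):
                acc[i][j] += inc[i] * inc[j]
    return [[c / (n_samples * dt) for c in row] for row in acc]


# Hypothetical values: two modes of a 2-D field evaluated at one location.
EIGVALS = [2.0, 0.5]                  # lambda_n, in m^2 s^-1
MODES = [[1.0, 0.0], [0.6, 0.8]]      # xi_n(x), dimensionless

exact = variance_tensor(EIGVALS, MODES)
estimate = empirical_variance_tensor(EIGVALS, MODES, dt=0.01, n_samples=20000)
```

By construction the tensor is symmetric and positive semi-definite, and its trace equals the sum of the eigenvalues weighted by the squared mode amplitudes at that point.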

This noise term is centred; however, a modification can be applied in order to consider a Lagrangian displacement of the form

$$\mathrm{d}\mathbf{X}_t = \left[\mathbf{v}\left(\mathbf{X}_t, t\right) - \sigma_t\mathbf{Y}_t\left(\mathbf{x}\right)\right]\mathrm{d}t + \sigma_t\mathrm{d}\mathbf{B}_t\left(\mathbf{X}_t\right). \tag{6}$$

In contrast with (1), this decomposition involves a centred Wiener process $\mathbf{B}_t$ that is drifted by a correlated component $\sigma_t\mathbf{Y}_t(\mathbf{x})$. This statement can be justified with the Girsanov theorem: a new probability measure $\widetilde{\mathbb{P}}$ can be built in such a way that a non-centred Wiener process of the form

$$
\widetilde{\mathbf{B}}_t = \mathbf{B}_t + \int_0^t \mathbf{Y}_s \,\mathrm{d}s,\tag{7}
$$

where $\{\mathbf{Y}_t\}_t$ is a random process shifting the process $\mathbf{B}_t$, remains centred on $\{\Omega, \mathcal{F}, \widetilde{\mathbb{P}}, \{\mathcal{F}_t\}_{0 \le t \le T}\}$. The new definition of the noise then reads

$$
\sigma_t \mathrm{d}\mathbf{B}_t \left( \mathbf{x} \right) = \sigma_t \mathrm{d}\widetilde{\mathbf{B}}_t \left( \mathbf{x} \right) - \sigma_t \mathbf{Y}_t \left( \mathbf{x} \right) \mathrm{d}t, \tag{8}
$$

with $\{\widetilde{\mathbf{B}}_t\}_t$ a $\widetilde{\mathbb{P}}$-Wiener process. All the arguments provided in the following hold for this process under $\widetilde{\mathbb{P}}$, but the usage of a drifted noise is of paramount importance when the phenomenon to be modelled displays a non-zero time average and the physical processes involved cannot be regarded as completely uncorrelated, as in the case of ocean eddies and gyres. In the following, Eq. (6) will define our Lagrangian trajectory and the tilde notation will be dropped for simplicity.

## **3 Stochastic Boussinesq Equations**

Within the Location Uncertainty formalism, the evolution of a random tracer *q* transported along the stochastic flow is described by the stochastic Reynolds transport theorem, introduced in [1]. The rate of change of a scalar *q*, integrated within the volume $V_t$, is described by

$$\mathrm{d} \int_{V_t} q\left(\mathbf{x}, t\right) \,\mathrm{d}\mathbf{x} = \int_{V_t} \left\{ \mathrm{D}_t q + q \nabla \cdot \left[ \mathbf{v}^{\star} \,\mathrm{d}t + \sigma_t \mathrm{d}\mathbf{B}_t \right] \right\}(\mathbf{x}, t) \,\mathrm{d}\mathbf{x},\tag{9}$$

and summarised by the operator

$$\mathrm{D}_t q = \mathrm{d}_t q + \left[\mathbf{v}^{\star} \,\mathrm{d}t + \sigma_t \,\mathrm{d}\mathbf{B}_t\right] \cdot \nabla q - \frac{1}{2} \nabla \cdot (\mathbf{a} \nabla q) \,\mathrm{d}t. \tag{10}$$

In this formula, the first term on the right-hand side is the *increment in time* at a fixed location of the process $q$, that is $\mathrm{d}_t q = q(\mathbf{X}_t, t + \mathrm{d}t) - q(\mathbf{X}_t, t)$, playing the role of a time derivative for a non-differentiable process. The square brackets enclose the *stochastic advection displacement*, composed of a time-correlated modified advection velocity $\mathbf{v}^{\star}$ and a fast-evolving, time-uncorrelated noise $\sigma_t \mathrm{d}\mathbf{B}_t$, both advecting the scalar $q$. Under the probability measure $\widetilde{\mathbb{P}}$ the velocity $\mathbf{v}^{\star}$ is defined as

$$\mathbf{v}^{\star} = \mathbf{v} - \frac{1}{2}\nabla \cdot \mathbf{a} + \sigma_t^{\ast}(\nabla \cdot \sigma_t) - \sigma_t\mathbf{Y}_t,\tag{11}$$

where $\mathbf{v}$ is the resolved component of the velocity, $\mathbf{v}^{s} = \frac{1}{2}\nabla \cdot \mathbf{a}$ is the effective transport velocity resulting from the noise inhomogeneities and the last term is the Girsanov drift due to a non-centred noise. With this operator it is possible to formulate the Boussinesq equations under location uncertainty as done in [12] and reported below, split into horizontal and vertical equations using the convention $\mathbf{v} = (\mathbf{u}, w)$ and with the buoyancy defined as $b = -g\,\frac{\rho - \rho_0}{\rho_0}$.

Horizontal momentum:

$$\mathrm{D}_t\mathbf{u} + f\mathbf{e}_3 \times \left(\mathbf{u}\,\mathrm{d}t + \frac{1}{2}\sigma_t\mathrm{d}\mathbf{B}_t^{\mathrm{H}}\right) = \nabla_{\mathrm{H}}\left(-p' + \frac{\nu}{3}\nabla \cdot \mathbf{v}\right)\,\mathrm{d}t - \nabla_{\mathrm{H}}\mathrm{d}p_t^{\sigma}, \tag{12}$$

Vertical momentum:

$$\mathrm{D}_t w = \frac{\partial}{\partial z}\left(-p' + \frac{\nu}{3}\nabla \cdot \mathbf{v}\right)\,\mathrm{d}t - \frac{\partial}{\partial z}\mathrm{d}p_t^{\sigma} + b\,\mathrm{d}t, \tag{13}$$

Temperature and salinity:

$$\mathrm{D}_t T = \kappa_T \Delta T \,\mathrm{d}t,\tag{14}$$

$$\mathrm{D}_t S = \kappa_S \Delta S \,\mathrm{d}t,\tag{15}$$

Incompressibility:

$$
\nabla \cdot \left[\mathbf{v} - \mathbf{v}^{s}\right] = 0, \qquad \nabla \cdot \sigma_t \mathrm{d}\mathbf{B}_t = 0, \qquad \nabla \cdot \sigma_t \mathbf{Y}_t = 0, \tag{16}
$$

Equation of state:

$$b = b \left( T, S, z \right). \tag{17}$$

In this formulation temperature *T* and salinity *S* are introduced as active tracers and transported along the stochastic flow, thus impacting the stochastic transport of momentum through the equation of state. The term $\mathrm{d}p_t^{\sigma}$ in Eqs. (12) and (13) is a martingale term representing (under the measure P) a zero-mean turbulent pressure related to the noise, termed *stochastic pressure*. From this starting point, the primitive equations can be obtained through a hydrostatic hypothesis on the vertical acceleration equation, resulting in the two conditions

$$-\frac{\partial p'}{\partial z} + b = 0 \quad \text{and} \quad \frac{\partial\, \mathrm{d}p_t^{\sigma}}{\partial z} = 0,\tag{18}$$

the first one being the usual hydrostatic balance, the second being the result of the uniqueness of the semi-martingale decomposition. In this work and within the scaling used in [12], the stochastic pressure is constant along depth and is supposed to be in a pure geostrophic balance [10, 14], thus not impacting the rate of change of momentum. In this hydrostatic setting the vertical component of momentum is a diagnostic variable computed through integration of the incompressibility condition (16), and the large scale pressure *p* is obtained through vertical integration of the buoyancy term.
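As a minimal illustration of the diagnostic step mentioned above, the hydrostatic balance in Eq. (18) can be integrated downward from the surface on a discrete column. The sketch assumes a trapezoidal rule, a prescribed surface value and a kinematic pressure (divided by the reference density); the function name and the sample column are hypothetical and do not reflect how NEMO performs this integration.

```python
def hydrostatic_pressure(z, b, p_surface=0.0):
    """Integrate dp'/dz = b from the surface downward (trapezoidal rule).

    z : depths in metres, ordered from the surface (z[0]) to the bottom,
        i.e. decreasing values such as [0, -10, -20].
    b : buoyancy at the same levels, in m s^-2.
    Returns p' at each level, in m^2 s^-2 (pressure divided by rho_0).
    """
    if len(z) != len(b):
        raise ValueError("z and b must have the same length")
    p = [p_surface]
    for k in range(1, len(z)):
        dz = z[k] - z[k - 1]              # negative when moving downward
        p.append(p[-1] + 0.5 * (b[k] + b[k - 1]) * dz)
    return p


# Hypothetical column: uniform negative buoyancy over 20 m.
levels = [0.0, -10.0, -20.0]
buoyancy = [-0.01, -0.01, -0.01]
pressure = hydrostatic_pressure(levels, buoyancy)
```

With a uniform negative buoyancy the pressure anomaly grows linearly with depth, which is the expected behaviour for water denser than the reference state.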

# **4 Methods**

The proposed method is implemented in the level-coordinate free-surface primitive equation model NEMO [15] in a double-gyre configuration consisting of a 45° rotated beta plane centred at ∼30°N, 3180 km long, 2120 km wide and 4 km deep, bounded by vertical walls and with a flat bottom. Seasonal winds and buoyancy changes are imposed as external forcings to induce the creation of a strong jet that separates a cold sub-polar gyre from a warm sub-tropical gyre. The complete details of this configuration are given in [16, 17], and the parameters of the simulation were chosen according to the reference papers. The only change is in the values of eddy viscosity and diffusivity, increased by a factor of five to suppress aliasing in the salinity field observed for smaller values (see Table 1 for an overview of their values). In order to assess the benefits brought by this stochastic approach, two purely deterministic simulations at different resolutions, 1/27° (R27d) and 1/3° (R3d), are compared to a stochastic simulation at 1/3° (R3LU). Each simulation consists of 10 years of data, collected every 5 days and averaged over the 5 days. The R27d simulation has been spun up for 100 years before collecting data for the LU framework. Similarly, an additional 1/9° deterministic simulation has been spun up for 100 years in similar conditions in order to construct an initial state for the deterministic and stochastic 1/3° simulations.

## *4.1 High Resolution Data Filtering*

The high resolution data used to force the low resolution stochastic model need to be filtered before being used, in order to avoid injecting energy at scales that can jeopardise the stability of the simulation. The low resolution velocity fluctuations are obtained through spatial filtering of high resolution temporal fluctuations. First, a time average is defined on the high resolution fields as

$$
\overline{\mathbf{u}}_{\rm HR}^{t}\left(\mathbf{x}\right) = \frac{1}{T} \int_{T} \mathbf{u}_{\rm HR}\left(\mathbf{x}, t\right) \,\mathrm{d}t,\tag{19}
$$

so as to obtain, with a Reynolds decomposition, the high resolution fluctuations:

$$\mathbf{u}'_{\rm HR}\left(\mathbf{x},t\right) = \mathbf{u}_{\rm HR}\left(\mathbf{x},t\right) - \overline{\mathbf{u}}^{t}_{\rm HR}\left(\mathbf{x}\right). \tag{20}$$

**Table 1** Parameters of the model experiments

The corresponding low resolution fluctuations are obtained through a band-pass filter as $\mathbf{u}'_{\rm LR} = (\mathcal{G}_1 - \mathcal{G}_2)\,\mathbf{u}'_{\rm HR}$, where $\mathcal{G}$ represents a Gaussian filter. These filtered fields have a smaller amount of energy compared to the original snapshots, and they are re-scaled to this amount as

$$\mathbf{u}'\_{\rm LR} = \frac{\left\| \mathbf{u}'\_{\rm HR} \right\|\_2}{\left\| \left[ \left( \mathcal{G}\_1 - \mathcal{G}\_2 \right) \mathbf{u}'\_{\rm HR} \right]\_{\rm LR}^{\downarrow} \right\|\_2} \left[ \left( \mathcal{G}\_1 - \mathcal{G}\_2 \right) \mathbf{u}'\_{\rm HR} \right]\_{\rm LR}^{\downarrow},\tag{21}$$

where the downward arrow represents downscaling towards low resolution. As a result of this procedure, the velocity fluctuations keep the spatial structure of the filtered field while recovering the energy level of the original high resolution fluctuations.
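The filtering and rescaling steps of Eq. (21) can be sketched on a one-dimensional periodic signal. Everything specific below is an assumption made for illustration: the filters are truncated discrete Gaussians with widths given in grid points, the downscaling is a plain subsampling, and the norm is a root mean square so that it does not depend on the number of grid points. The function names are hypothetical.

```python
import math


def gaussian_filter_periodic(u, sigma):
    """Smooth a periodic 1-D signal with a truncated, normalised Gaussian."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    weights = [w / total for w in weights]
    n = len(u)
    return [
        sum(weights[k + radius] * u[(i + k) % n] for k in range(-radius, radius + 1))
        for i in range(n)
    ]


def rms(u):
    """Root mean square of a sequence (grid-size independent norm)."""
    return math.sqrt(sum(x * x for x in u) / len(u))


def lowres_fluctuations(u_hr, sigma_1, sigma_2, stride):
    """Band-pass (G1 - G2), subsample, then rescale to the energy of u_hr."""
    g1 = gaussian_filter_periodic(u_hr, sigma_1)
    g2 = gaussian_filter_periodic(u_hr, sigma_2)
    band = [a - b for a, b in zip(g1, g2)]
    coarse = band[::stride]               # crude stand-in for the downscaling
    norm = rms(coarse)
    if norm == 0.0:
        raise ValueError("band-passed field vanishes on the coarse grid")
    scale = rms(u_hr) / norm
    return [scale * x for x in coarse]


# Hypothetical fluctuation field on 90 points, coarsened by a factor of 9.
N_HR = 90
u_hr = [
    math.sin(2 * math.pi * i / N_HR) + 0.3 * math.sin(2 * math.pi * 5 * i / N_HR)
    for i in range(N_HR)
]
u_lr = lowres_fluctuations(u_hr, sigma_1=2.0, sigma_2=6.0, stride=9)
```

The factor of 9 mirrors the ratio between the 1/27° and 1/3° grids; in practice the filter widths would be chosen from the two grid spacings.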

## *4.2 Off-Line Noise Modelling Through DMD*

Dynamic Mode Decomposition (DMD) is a methodology [19] for constructing a proxy linear dynamical system that describes an unknown non-linear dynamics. In this paper DMD is applied to the evolution in time of the velocity fluctuations, which is thus approximated as

$$\mathbf{u}'(\mathbf{x}, t_{i+1}) \sim A\, \mathbf{u}'(\mathbf{x}, t_i) \,. \tag{22}$$

Such a (finite dimensional) linear dynamical system is known to have the general solution

$$\mathbf{u}'(\mathbf{x},t) = \sum_{m=1}^{N} b_m \exp\left(\mu_m t\right) \boldsymbol{\phi}_m\left(\mathbf{x}\right),\tag{23}$$

where $\boldsymbol{\phi}_m(\mathbf{x}) \in \mathbb{C}^d$ are the eigenvectors of $A$ associated with the eigenvalues $\mu_m \in \mathbb{C}$ and $b_m \in \mathbb{C}$ are amplitudes. In particular, writing $\mu_m = \sigma_m + \mathrm{i}\omega_m$, the real part $\sigma_m$ is the growth rate of mode $m$ and $\omega_m$ is its frequency. Since the initial data are real-valued fields, the eigenvectors, eigenvalues and amplitudes come in complex-conjugate pairs, that is $\boldsymbol{\phi}_{2p} = \overline{\boldsymbol{\phi}_{2p+1}}$. Following the successful proposal of [18], we split the DMD modes into correlated and uncorrelated modes in order to define the Girsanov drift through the slow component of the dynamics and the random noise through the fast component. The two sets of modes, $\mathcal{M}^{u}$ for the uncorrelated noise and $\mathcal{M}^{c}$ for the correlated part, are defined as

$$\mathcal{M}^{u} = \left\{ m \in [1, N] \; : \; |\mu_m| \sim 1, \; |\omega_m| > \frac{\pi}{\tau_c}, \; |b_m| \ge C^{u} \right\}, \tag{24}$$

$$\mathcal{M}^{c} = \left\{ m \in [1, N] \; : \; |\mu_m| \sim 1, \; |\omega_m| \le \frac{\pi}{\tau_c}, \; |b_m| \ge C^{c} \right\}, \tag{25}$$

where $\tau_c$ is a temporal separation scale between correlated and uncorrelated modes (usually set to a value for which a spectral gap is observed, and fixed here to twenty-five days) and $C^{u}$, $C^{c}$ are empirical amplitude cut-offs. A visual representation of the aforementioned procedure is given in Fig. 1.

**Fig. 1** Illustration of the selection of the DMD modes. On the left, frequencies of the modes are plotted on the unit circle; they are coloured differently to represent their characteristic physical time scale. At this point, a threshold $\tau_c = 25$ d is chosen to differentiate the correlated from the uncorrelated modes. On the right, the correlated modes are plotted over a violet background and the uncorrelated modes over an orange background. The amplitude threshold for the correlated modes, $C^{c}$, is set to zero, while for the uncorrelated modes $C^{u}$ is set to 2. The grey dots represent the set of uncorrelated modes below this threshold, which are thus discarded

As the DMD modes are not orthogonal, a scaling is applied to avoid spurious effects and to make sure that the reconstructed data correspond to an orthogonal projection onto the subspaces spanned by the sets of modes contained in $\mathcal{M}^{u}$ and $\mathcal{M}^{c}$. The procedure reads as follows:


This procedure is applied to $\mathcal{M} = \mathcal{M}^{u}$ and $\mathcal{M} = \mathcal{M}^{c}$ separately.
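The selection rule of Eqs. (24) and (25) can be sketched as a simple classification loop. The sketch relies on assumptions that are not spelled out in the text: the eigenvalues are taken to be those of the discrete-time operator $A$, so that the frequency is recovered as $\omega_m = \arg(\mu_m)/\Delta t$ with $\Delta t$ the snapshot interval, and the condition $|\mu_m| \sim 1$ is implemented with an arbitrary tolerance `tol`. The function name and the sample values are hypothetical.

```python
import cmath
import math


def split_modes(mu, b, dt, tau_c, c_u, c_c, tol=0.05):
    """Return (uncorrelated, correlated) mode indices following Eqs. (24)-(25).

    mu    : discrete-time DMD eigenvalues (assumption, see lead-in)
    b     : mode amplitudes
    dt    : snapshot interval, same time unit as tau_c
    tau_c : separation time scale between correlated and uncorrelated modes
    c_u   : amplitude cut-off for uncorrelated modes
    c_c   : amplitude cut-off for correlated modes
    tol   : tolerance used for |mu_m| ~ 1 (hypothetical choice)
    """
    uncorrelated, correlated = [], []
    threshold = math.pi / tau_c
    for m, (mu_m, b_m) in enumerate(zip(mu, b)):
        if abs(abs(mu_m) - 1.0) > tol:
            continue                      # growing or decaying mode: discarded
        omega_m = cmath.phase(mu_m) / dt
        if abs(omega_m) > threshold:
            if abs(b_m) >= c_u:
                uncorrelated.append(m)
        elif abs(b_m) >= c_c:
            correlated.append(m)
    return uncorrelated, correlated


# Hypothetical spectrum: snapshots every 5 days, tau_c = 25 days.
eigenvalues = [
    cmath.exp(0.5j), cmath.exp(-0.5j),    # slow conjugate pair
    cmath.exp(1.5j), cmath.exp(-1.5j),    # fast conjugate pair
    0.5 + 0.0j,                           # strongly damped mode
]
amplitudes = [0.1, 0.1, 3.0, 1.0, 5.0]
fast, slow = split_modes(eigenvalues, amplitudes, dt=5.0, tau_c=25.0, c_u=2.0, c_c=0.0)
```

In an actual spectrum the two members of a conjugate pair share the same amplitude modulus, so they are kept or discarded together; the unequal amplitudes above only serve to exercise the cut-off.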

## *4.3 On-Line Noise Reconstruction*

Inside the NEMO core, during the simulation, the noise and Girsanov drift are defined as:

$$\sigma_{t,\theta}\, \mathrm{d}\mathbf{B}_t = \sqrt{\tau_\theta} \sum_{m \in \mathcal{M}^{u}} \exp\left(\mathrm{i}\omega_m t\right) \boldsymbol{\varphi}_m \left(\mathbf{x}\right) \mathrm{d}\beta_m,\tag{26}$$

$$\sigma_t \mathbf{Y}_t = \overline{\mathbf{u}}^{t} + \sum_{m \in \mathcal{M}^{c}} \exp\left(\mathrm{i}\omega_m t\right) \boldsymbol{\varphi}_m \left(\mathbf{x}\right), \tag{27}$$

with $\boldsymbol{\xi}_m = \sqrt{\tau_\theta}\, \boldsymbol{\varphi}_m$. In Eqs. (26) and (27) $\tau_\theta$ is the process decorrelation time. It is supposed to be different for each component evolving in the system, as momentum, temperature and salinity do not diffuse with the same decorrelation time. The subscript $\theta$ can thus indicate momentum $\mathbf{u}$, temperature $T$ or salinity $S$, and the corresponding noise and variance are denoted $\sigma_{t,\theta}\,\mathrm{d}\mathbf{B}_t(\mathbf{x})$ and $\mathbf{a}_\theta(\mathbf{x}, t)$. The different decorrelation times are difficult to characterise precisely (as they vary in space), but their ratio can be justified by physical reasoning. The decorrelation times are chosen in such a way that $\tau_{\mathbf{u}} = \Delta t$, $\tau_T = \frac{\kappa_T}{\kappa_{\mathbf{u}}} \Delta t$ and $\tau_S = \frac{\kappa_S}{\kappa_{\mathbf{u}}} \Delta t$, where $\Delta t$ is the simulation time step and $\kappa$ denotes the molecular diffusion coefficients. The (eigen)frequencies $\omega_m$ come in pairs, and the complex Brownian motions of each pair are conjugates of each other. The real and imaginary parts of the Brownian motion are independent. As such, both the noise and the Girsanov drift are real-valued fields. The variance tensor of such noise remains stationary:

$$\mathbf{a}_{\theta} \left( \mathbf{x} \right) = \tau_{\theta} \sum_{m \in \mathcal{M}^{u}} \boldsymbol{\varphi}_m \left( \mathbf{x} \right) \boldsymbol{\varphi}_m^{\dagger} \left( \mathbf{x} \right) \,. \tag{28}$$

After construction with the offline data through Eqs. (26) and (27), the noise $\sigma_{t,\theta}\,\mathrm{d}\mathbf{B}_t$ is constrained to live on the tangent space of the isopycnal surfaces. This procedure is operationally implemented as the application of an isopycnal projection operator $\mathbf{P}^{\rho}$

$$\mathbf{P}^{\rho} = I - \frac{\nabla \rho \ (\nabla \rho)^{\mathbf{T}}}{|\nabla \rho|^{2}} \tag{29}$$

to the noise. Since the density is a function of temperature, salinity and depth, $\rho = \rho(T, S, z)$, the isopycnal projection operator carries information about the current state of the simulation. The projected noise $\sigma_t\mathrm{d}\mathbf{B}_t^{\rho}(\mathbf{x}) = \mathbf{P}^{\rho} \sigma_t\mathrm{d}\mathbf{B}_t(\mathbf{x})$ is thus strongly tied to the evolution of the flow density.
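The projection of Eq. (29) can be written out pointwise in a few lines. The sketch below works on a single grid point with plain lists; the function names and the sample density gradient are hypothetical, and how the gradient is discretised in the model is not addressed.

```python
def isopycnal_projection(grad_rho):
    """Return P = I - grad_rho grad_rho^T / |grad_rho|^2 as a nested list (Eq. 29)."""
    norm2 = sum(g * g for g in grad_rho)
    if norm2 == 0.0:
        raise ValueError("density gradient vanishes: projection undefined")
    n = len(grad_rho)
    return [
        [(1.0 if i == j else 0.0) - grad_rho[i] * grad_rho[j] / norm2 for j in range(n)]
        for i in range(n)
    ]


def apply_operator(matrix, vector):
    """Matrix-vector product for nested-list matrices."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


# Hypothetical values at one grid point: tilted isopycnal, arbitrary noise draw.
grad_rho = [1.0, 0.0, -1.0]
noise = [0.3, 0.1, 0.2]
projector = isopycnal_projection(grad_rho)
projected_noise = apply_operator(projector, noise)
```

The projected vector is orthogonal to the density gradient, and applying the operator a second time leaves it unchanged, as expected from an orthogonal projector.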

# **5 Results**

In this work we analyse the result of a single realisation. Qualitatively speaking, a coarser resolution simulation is far from being as representative as the fine resolution simulation, as can be seen in both Figs. 2 and 3, where the leftmost panel contains results from the R27 deterministic simulation, the centre panel contains the R3 deterministic simulation and the rightmost panel contains the R3 stochastic simulation. The effect of the forcing is manifested in the R27 simulation in a jet current roughly aligned with latitude (tilted by a 45 degrees angle in the domain frame). The presence of this structure and of a secondary and smaller jet stream, is visible in the reference papers [16, 17]. This feature is absent in the R3 deterministic simulation. Figure <sup>2</sup> depicts the average relative vorticity *<sup>ζ</sup>* 10Y <sup>=</sup> - *∂x v* − *∂yu /f* 10Y and shows primarily the difference between a high resolution simulation and a coarse resolution simulation. In the centre panel, representing the deterministic low resolution simulation R3d, there is no sign of the characteristic jet that can be seen in the left panel as a strong contraposition of opposite sign vorticity. Viscosity on the boundary creates a sequence of alternating bands of opposite vorticity, related to jet separation problem in low resolution simulations [20]. The stochastic R3LU simulation conversely presents a much better representation of the dynamics, as a jet can now be identified clearly. The time averaged sea surface height, represented in Fig. 3 shows for the high resolution simulation the characteristic geostrophic properties of the jet stream: the northern, cold sub-polar gyre is characterised by a smaller height than the southern, warm sub-tropical gyre. This characteristic is not visible in the deterministic low resolution simulation R3d, where one can only find the effects of the boundary, while the stochastic R3LU simulation presents

**Fig. 2** 10-years averaged relative vorticity *ζ* = - *∂x v* − *∂yu /f* at the surface layer of the model for deterministic high-resolution (1/27◦, left), for deterministic low resolution (1/3◦, middle) and for stochastic low resolution (1/3◦, right)

**Fig. 3** 10-years averaged sea surface height for deterministic high-resolution (1/27◦, left), for deterministic low resolution (1/3◦, middle) and for stochastic low resolution (1/3◦, right)

**Fig. 4** 10-years averaged kinetic energy for deterministic high-resolution (1/27◦, left), for deterministic low resolution (1/3◦, middle) and for stochastic low resolution (1/3◦, right)

a much more faded picture of this process, which is not as intense as in the R27d but definitely present. Viscosity on the left boundary provides a strong constraint to the dynamics, biasing the representation of the jet stream in both lowresolution cases. Figure <sup>4</sup> shows the ten years average of kinetic energy, KE10Y <sup>=</sup> - *u*<sup>2</sup> + *v*<sup>2</sup> */*2 10Y . From this picture is clear that, while the stochastic simulation includes much more features when compared to its deterministic counterpart, it is still suffering the influence of the boundary, affecting the position of the jet stream separation. Figure 5 shows the vertical profiles of horizontally averaged temperature at different times for the three simulation, *<sup>T</sup> x,y (z, t)* <sup>=</sup> <sup>1</sup> *A <sup>A</sup> T (x, y, z, t)* d*x*d*y*. The deterministic high resolution profile is plotted in green, the deterministic low resolution profile is plotted in orange and the stochastic low resolution simulation is plotted in dark red. From this plot it is difficult to assess the benefits of the stochastic

**Fig. 5** Vertical profile of temperature at 5 equidistant points in time. In red, the stochastic simulation, in orange the deterministic simulation, in green the reference R27 simulation

formalism for the temperature equation, as the stochastic and deterministic low-resolution simulations are close. However, no spurious vertical mixing is observed. This is an improvement with respect to [12], brought about by the new methodology for noise generation detailed in the previous section. In particular, constraining the noise to the tangent planes of the isopycnal surfaces considerably reduces the spurious mixing in POD as well. Furthermore, the projection operator has the effect of localising the effects of the noise in the jet region. From a quantitative point of view, the simulations are compared using the Root Mean Square

$$\text{RMS}(f\_{\text{M}}) = \sqrt{\mathbb{E}\left[ (f\_{\text{M}})^2 \right]},\tag{30}$$

providing a measure of the energy content of the variable $f_{\text{M}}$, and the Root Mean Square Error

$$\text{RMSE}(f\_{\mathsf{M}}) = \sqrt{\mathbb{E}\left[ (\left[ f\_{\mathsf{R}27} \right]\_{\mathsf{M}}^{\downarrow} - f\_{\mathsf{M}})^2 \right]},\tag{31}$$


Boldface is used to highlight the best performance according to the metric.

R3LU **2.92 1.46 7.49**

which in turn describes the energy content of the errors. In all previous equations the expected value is taken over the volume, i.e. $\mathbb{E}[f] = \frac{1}{V} \int_V f\, \mathrm{d}V$. Finally, defining the time average $\overline{f}^t = \frac{1}{T} \int_T f\, \mathrm{d}t$ and the time variance $\sigma_f^2 = \frac{1}{T} \int_T (f - \overline{f}^t)^2\, \mathrm{d}t$, the Gaussian Relative Entropy (GRE) [21] at a single point

$$\text{GRE} = \frac{1}{2} \left[ \frac{(\overline{f}\_{\text{R27}}^t - \overline{f}\_{\text{M}}^t)^2}{\sigma\_{f,\text{M}}^2} + \frac{\sigma\_{f,\text{R27}}^2}{\sigma\_{f,\text{M}}^2} - 1 - \ln \left( \frac{\sigma\_{f,\text{R27}}^2}{\sigma\_{f,\text{M}}^2} \right) \right], \tag{32}$$

measures with a single criterion both the mean and variance reconstructions. The first term on the right-hand side of (32) represents the error in the mean, weighted by the variance of the model. The remaining terms measure the error in model variability and are referred to as "dispersion". The lower this criterion, the better the reconstruction. It can be observed from (32) that this criterion is minimal if, at every point, the mean is perfectly reconstructed and the variance of the reference equals that of the coarse model tested. These quantitative measures have been evaluated for the mean component of three quantities, namely vorticity $\zeta$, the horizontal energy of the flow $\frac{1}{2}(u^2 + v^2)$ and temperature $T$. The values of the proposed metrics for each simulation are given in Table 2. Figure 6 provides a visual representation of the behaviour of the fluctuations around the mean states of vorticity and energy, where the fluctuations are computed in time in a Reynolds splitting fashion, $f' = f - \overline{f}^t$. The first 2 years can be considered as the time required for the adjustment of the filtered and downsampled 1/9° initial condition. It can be noted that, while the deterministic simulation cannot sustain the initial level of variability, its stochastic counterpart shows the opposite behaviour, maintaining a higher variability. Concerning the benefits of the stochastic model for the dynamical quantities $\zeta$ and $\frac{1}{2}(u^2 + v^2)$, all the presented results show that the stochastic model outperforms the deterministic simulation.
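
The split of (32) into a mean term and a dispersion term can be made concrete with a minimal sketch. It assumes equally spaced samples in time, so that the time integrals reduce to plain averages; the function name `gaussian_relative_entropy` and the synthetic series are illustrative assumptions and are not taken from the simulations discussed here.

```python
import math
from statistics import fmean, pvariance


def gaussian_relative_entropy(reference, model):
    """Evaluate (32) at one point from two equally sampled time series.

    Returns (gre, mean_term, dispersion). Hypothetical helper for illustration.
    """
    mean_ref, mean_mod = fmean(reference), fmean(model)
    # pvariance divides by the number of samples, matching (1/T) * integral
    var_ref = pvariance(reference, mu=mean_ref)
    var_mod = pvariance(model, mu=mean_mod)
    if var_ref <= 0.0 or var_mod <= 0.0:
        raise ValueError("both series need a strictly positive variance")
    ratio = var_ref / var_mod
    # Error in the mean, weighted by the variance of the model
    mean_term = 0.5 * (mean_ref - mean_mod) ** 2 / var_mod
    # Remaining terms of (32): error in variability ("dispersion")
    dispersion = 0.5 * (ratio - 1.0 - math.log(ratio))
    return mean_term + dispersion, mean_term, dispersion


# Synthetic example: the "model" has a biased mean and half the amplitude
reference = [math.sin(2.0 * math.pi * k / 100) for k in range(100)]
model = [0.5 * x + 0.1 for x in reference]
gre, mean_term, dispersion = gaussian_relative_entropy(reference, model)
print(f"GRE = {gre:.4f} (mean term {mean_term:.4f}, dispersion {dispersion:.4f})")
```

In this toy case the dispersion term dominates, because the model series carries only a quarter of the reference variance. Both terms vanish only when the mean and the variance of the two series coincide.
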
The mean flow contains much more energy than its deterministic counterpart, as indicated by the higher values of $\mathrm{RMS}(\overline{f}^t)$, and the $\mathrm{RMSE}(\overline{f}^t)$ of the average fields seems to be reduced, with the exception of vorticity, for which a systematic bias in the positioning of the jet stream jeopardises the computation of the RMSE. In other words, the $L^2$ norm of the error is lower for a null vorticity field than for a vorticity field clearly exhibiting a meaningful jet

**Fig. 6** Comparison between the Root Mean Square (in space) of the fluctuations over 10 years. In blue, the fluctuation RMS for the deterministic high-resolution simulation (R27d). In green, the low-resolution stochastic simulation (R3LU); in orange, the low-resolution deterministic simulation (R3d). The solid lines show the values of vorticity for the three cases, with the corresponding scale on the left; the dotted lines show the values of energy for the three cases, with the corresponding scale on the right

but with bias. The corresponding fluctuations show a higher RMS in time during the whole simulation; thus the stochastic simulation is energetically closer to the high-resolution simulation. The distance between the stochastic simulation and the high-resolution simulation, as measured by the GRE, is lower for both vorticity and energy than that of the deterministic simulation. The effect of the stochastic parametrisation on the distribution of temperature $T$ is more difficult to assess, as the metrics show very similar behaviours. The distance measured by the GRE is lower than for the stochastic case, as is the corresponding RMSE. However, the RMS values of the mean temperature are very similar for the deterministic and stochastic simulations. A comparison of the temperature fluctuation RMS in time (similar to that of Fig. 6 and not presented in this paper) confirms this similarity by showing no appreciable difference between the simulations.
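
The remark on the vorticity RMSE can be illustrated with a small one-dimensional sketch of the metrics (30) and (31). It assumes that the reference has already been filtered and downsampled onto the coarse grid and that the volume expectation is approximated by a cell-weighted mean; the Gaussian "jet" profile, the grid and all function names are hypothetical and chosen only for illustration.

```python
import math


def expectation(values, weights):
    """Discrete form of (1/V) * integral of f dV, with cell volumes as weights."""
    total = math.fsum(weights)
    return math.fsum(v * w for v, w in zip(values, weights)) / total


def rms(field, weights):
    """Root Mean Square, as in (30)."""
    return math.sqrt(expectation([f * f for f in field], weights))


def rmse(reference, field, weights):
    """Root Mean Square Error, as in (31); reference already on the model grid."""
    squared = [(r - f) ** 2 for r, f in zip(reference, field)]
    return math.sqrt(expectation(squared, weights))


# Uniform 1-D grid across a hypothetical jet axis (illustrative values only)
n = 201
y = [-10.0 + 0.1 * i for i in range(n)]
weights = [1.0] * n


def jet(centre, width=0.5):
    """Gaussian bump standing in for a jet signature."""
    return [math.exp(-((yi - centre) / width) ** 2) for yi in y]


reference = jet(0.0)      # jet in the reference position
displaced = jet(3.0)      # same jet, wrongly positioned
null_field = [0.0] * n    # no jet at all

print("RMS  reference :", rms(reference, weights))
print("RMS  displaced :", rms(displaced, weights))
print("RMSE displaced :", rmse(reference, displaced, weights))
print("RMSE null field:", rmse(reference, null_field, weights))
```

In this sketch the displaced jet has the same RMS as the reference, yet its RMSE exceeds that of the null field, since the error is counted once where the jet is missing and once where it is misplaced. This is the sense in which a positional bias penalises the RMSE of a field that is energetically realistic.
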

## **6 Conclusions**

The stochastic model considered here has been implemented in the ocean model NEMO. A noise based on dynamic mode decomposition was considered and has been shown to improve the variability of the coarse-resolution model and to represent temporal statistics more accurately. The intrinsic variability of the model has been greatly enhanced for dynamical variables such as vorticity and energy, as has the qualitative appearance of both long-time averages and instantaneous snapshots. The same benefits do not seem to apply to thermodynamic quantities like temperature.

**Acknowledgments** The authors acknowledge the support of the ERC EU project 856408- STUOD.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Index**

#### **B**

Bella, A., 1

#### **C**

Chapron, B., 159, 223, 247, 281
Cifani, P., 17
Collard, F., 247
Crisan, D., 53, 159

#### **D**

Dumas, F., 143

#### **E**

Ephrati, S., 17

#### **F**

Fablet, R., 247
Flandoli, F., 29

#### **G**

Garreau, P., 143
Gaultier, L., 247
Geurts, B., 193
Goodair, D., 53

#### **H**

Holm, D.D., 111, 159
Hu, R., 111

#### **J**

Jamet, Q., 143, 293

#### **L**

Lahaye, N., 1, 207, 223
Lang, O., 159
Li, L., 143, 223, 321
Lobbe, A., 159
Luesink, E., 193

#### **M**

Maingonnat, I., 207
Mémin, E., 143, 159, 223, 293, 321
Morlacchi, S., 29

#### **O**

Ouala, S., 247

#### **P**

Papini, A., 29

#### **R**

Reich, S., 261
Resseguier, V., 281

© The Editor(s) (if applicable) and The Author(s) 2024. B. Chapron et al. (eds.), *Stochastic Transport in Upper Ocean Dynamics II*, Mathematics of Planet Earth 11, https://doi.org/10.1007/978-3-031-40094-0

#### **S**

Street, O.D., 111

#### **T**

Tissot, G., 1, 207, 223, 293
Tucciarone, F.L., 321

#### **V**

Viviani, M., 17

#### **Z**

Zhen, Y., 281