Generalised stability theory

Introduction

The stability of a physical system is a classical problem of mechanics, with contributions from authors such as Lagrange, Dirichlet and Lyapunov [AM-Lei10]. Stability investigates the response of the system to small perturbations in its initial condition: if the solutions of the perturbed systems remain within a neighbourhood of the unperturbed solution, then the system is stable; otherwise, the system is unstable at that initial condition.

The traditional approach for investigating the stability of physical systems was given by Lyapunov [AM-Lya92]. The (nonlinear) equations of motion are linearised about a base solution, and the eigenvalues of the linearised system are computed. If all eigenvalues have negative real part, then there exists a finite region of stability around the initial condition: perturbations within that region decay to zero, and the system is asymptotically stable [AM-Par92].

While this approach has had many successes, several authors have noted that it does not give a complete description of the finite-time stability of a physical system. While the eigendecomposition determines the asymptotic stability of the linearised equations as \(t \rightarrow \infty\), some systems permit transient perturbations which grow in magnitude, before being predicted to decay. However, if the perturbations grow too large, the linearised equations may cease to hold, and the system may become unstable due to nonlinear effects. More specifically, this transient growth occurs when the system is non-normal, i.e. when the eigenfunctions of the system do not form an orthogonal basis [5M-Sch07]. For example, Trefethen [1M-TTRD93] describes how the traditional approach fails to give accurate stability predictions for several classical problems in fluid mechanics, and resolves the problem by analysing the nonnormality of the system in terms of pseudospectra [5M-TE05].

Therefore, this motivates the development of a finite-time theory of stability, to investigate and predict the transient growth of perturbations. While Lorenz [5M-Lor65] discussed the core ideas (without using modern nomenclature), the development of this so-called generalised stability theory (GST) has been driven by the work of B. F. Farrell and co-workers [AM-Far82] [AM-Far85] [1M-FI96] [AM-FI96]. The main idea is to consider the linearised propagator of the system, the operator (linearised about the time-dependent trajectory) that maps perturbations in the initial conditions to perturbations in the final state. Essentially, the propagator is the inverse of the tangent linear system associated with the nonlinear forward model, along with operators to load the initial perturbation and select the final perturbation. The perturbations that grow maximally over the time window are given by the singular functions of the propagator associated with the largest singular values. Since the linearised propagator depends on the base solution, it follows that the predictability of the system depends on the conditions of the base solution itself: some states are inherently more predictable than others [5M-Lor65] [5M-Kal02].

The singular value decomposition of the propagator

This presentation of generalised stability theory will consider the stability of the system to perturbations in the initial conditions, but the same approach can be applied to analysing the stability of the system to perturbations in other parameters.

Consider the solution of the model at the final time \(u_T\) as a pure function of the initial condition \(u_0\):

\[u_T = M(u_0),\]

where \(M\) is the nonlinear propagator that advances the solution in time over a given finite time window \([0, T]\). Other parameters necessary for the solution (e.g. boundary conditions, material parameters, etc.) are considered fixed. Assuming the model is sufficiently differentiable, the response of the model \(M\) to a perturbation \(\delta u_0\) in \(u_0\) is given by

\[\delta u_T = M(u_0 + \delta u_0) - M(u_0) = \frac{\textrm{d} M}{\textrm{d} u_0} \delta u_0 + O(\left|\left|\delta u_0\right|\right|^2).\]

Neglecting higher-order terms, the linearised perturbation to the final state is given by

\[\delta u_T \approx L \delta u_0,\]

where \(L\) is the linearised propagator (or just propagator) \({\textrm{d} M}/{\textrm{d} u_0}\) that advances perturbations in the initial conditions to perturbations to the final solution.

To quantify the stability of the system, we wish to identify perturbations \(\delta u_0\) that grow the most over the time window \([0, T]\). For simplicity, equip both the initial condition and final solutions with the conventional inner product \(\left\langle \cdot, \cdot \right\rangle\). We seek the initial perturbation \(\delta u_0\) of unit norm \(\left|\left|\delta u_0\right|\right| = \sqrt{\left\langle \delta u_0, \delta u_0 \right\rangle} = 1\) such that

\[\delta u_0 = \operatorname*{arg\,max}_{\left|\left|\delta u_0\right|\right|} \left\langle \delta u_T, \delta u_T \right\rangle.\]

Expanding \(\delta u_T\) in terms of the propagator,

\[\left\langle \delta u_T, \delta u_T \right\rangle = \left\langle L \delta u_0, L \delta u_0 \right\rangle = \left\langle \delta u_0, L^*L \delta u_0 \right\rangle,\]

we see that the leading perturbation is the eigenfunction of \(L^*L\) associated with the largest eigenvalue \(\mu\), and the growth of the norm of the perturbation is given by \(\sqrt{\mu}\). In other words, the leading initial perturbation \(\delta u_0\) is the leading right singular function of \(L\), the resulting final perturbation \(\delta u_T\) is the associated left singular function, and the growth rate of the perturbation is given by the associated singular value \(\sigma\). The remaining singular functions offer a similar physical interpretation: if a singular function \(v\) has an associated singular value \(\sigma > 1\), the perturbation will grow over the finite time window \([0, T]\); if \(\sigma < 1\), the perturbation will decay over that time window.

If the initial condition and final solution spaces are equipped with inner products \(\left\langle \cdot, \cdot \right\rangle_I \equiv \left\langle \cdot, X_I \cdot \right\rangle\) and \(\left\langle \cdot, \cdot \right\rangle_F \equiv \left\langle \cdot, X_F \cdot \right\rangle\) respectively, then the leading perturbations are given by the eigenfunctions

\[X_I^{-1} L^* X_F L \delta u_0 = \mu \delta u_0.\]

The operators \(X_I\) and \(X_F\) must be symmetric positive-definite. In the finite element context, \(X_I\) and \(X_F\) are often the mass matrices associated with the input and output spaces, as these matrices induce the functional \(L_2\) norm.

Computing the propagator

In general, the nonlinear propagator \(M\) that maps initial conditions to final solutions is not available as an explicit function; instead, a PDE is solved. For clarity, let \(m\) denote the data supplied for the initial condition. The PDE may be written in the abstract implicit form

\[F(u, m) = 0,\]

with the understanding that \(u_0 = m\). We assume that for any initial condition \(m\), the PDE can be solved for the solution trajectory \(u\), and the nonlinear propagator \(M\) can then be computed by returning the solution at the final time. Differentiating the PDE with respect to the initial condition data \(m\) yields

\[\frac{\partial F}{\partial u} \frac{\textrm{d}u}{\textrm{d}m} = - \frac{\partial F}{\partial m},\]

the tangent linear system associated with the PDE. The term \({\partial F}/{\partial u}\) is the PDE operator linearised about the solution trajectory \(u\): therefore, it is linear, even when the original PDE is nonlinear. \({\partial F}/{\partial m}\) describes how the equations change as the initial condition data \(m\) changes, and acts as the source term for the tangent linear system. \({\textrm{d}u}/{\textrm{d}m}\) is the prognostic variable of the tangent linear system, and describes how the solution changes with changes to \(m\). To evaluate the action of the propagator \(L\) on a given perturbation \(\delta m\), the tangent linear system is solved with that particular perturbation, and evaluated at the final time:

\[L \delta m \equiv - \left.\left(\frac{\partial F}{\partial u}\right)^{-1}\frac{\partial F}{\partial m} \delta m\right|_T.\]

Therefore, to automate the generalised stability analysis of a PDE, it is necessary to automatically derive and solve the associated tangent linear system. Furthermore, as the GST analysis also requires the adjoint of the propagator, it is also necessary to automatically derive and solve the adjoint of the tangent linear system. This is why GST is considered as an application of adjoints.

Singular value computation

Once the propagator \(L\) is available, its singular value decomposition may be computed. There are two main computational approaches. The first approach is to compute the eigendecomposition of the cross product matrix \(L^*L\) (or \(LL^*\), whichever is smaller). The second is to compute the eigendecomposition of the cyclic matrix

\[\begin{split}H(L) = \begin{pmatrix} 0 & L \\ L^* & 0 \end{pmatrix}\end{split}\]

As explained in [5M-TB97], the latter option is more accurate for computing the small singular values, but is more expensive. As we are only interested in a small number of the largest singular triplets, the cross product approach is used throughout this work. Note that regardless of which approach is taken, the adjoint propagator \(L^*\) is necessary to compute the SVD of \(L\).

The algorithm used to compute the eigendecomposition of the cross product matrix is the Krylov-Schur algorithm [AM-Ste01], as implemented in SLEPc [AM-HRV05] [AM-HRTV07]. As the matrix is Hermitian (whether norms are used or not), this algorithm reduces to the thick-restart variant [AM-WS00] of the Lanczos method [AM-Lan50]. This algorithm was found experimentally to be faster than all other algorithms implemented in SLEPc for the computation of a small number of singular triplets, which is the case of interest in stability analysis.

Rather than representing the propagator as a matrix, the action of the propagator is computed in a matrix-free fashion, using the tangent linear model. In turn, the entire time-dependent tangent linear model is not stored, but its action is computed in a global-matrix-free fashion, using the matrices associated with each individual equation solve. In turn, the solution of each equation solve may optionally be achieved in a matrix-free fashion; the automatic derivation of the tangent linear and adjoint systems supports such an approach. Similarly, the adjoint propagator is computed in a matrix-free fashion using the adjoint model. SLEPc elegantly supports such matrix-free computations through the use of PETSc shell matrices [AM-BBB+11] [AM-BGMS97].

For more information on how to perform a generalised stability analysis with dolfin-adjoint, see the chapter in the documentation on generalised stability analysis.

References

[AM-BBB+11]S. Balay, J. Brown, K. Buschelman, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, B. F. Smith, and H. Zhang. PETSc users manual. Technical Report ANL-95/11, Argonne National Laboratory, 2011. Revision 3.2.
[AM-BGMS97]S. Balay, W. D. Gropp, L. C. McInnes, and B. F. Smith. Efficient management of parallelism in object oriented numerical software libraries. In E. Arge, A. M. Bruaset, and H. P. Langtangen, editors, Modern Software Tools in Scientific Computing, 163–202. Birkhäuser Press, 1997.
[AM-Far82]B. F. Farrell. The initial growth of disturbances in a baroclinic flow. Journal of Atmospheric Sciences, 39(8):1663–1686, 1982. doi:10.1175/1520-0469(1982)039<1663:TIGODI>2.0.CO;2.
[AM-Far85]B. F. Farrell. Transient growth of damped baroclinic waves. Journal of Atmospheric Sciences, 42(24):2718–2727, 1985. doi:10.1175/1520-0469(1985)042<2718:TGODBW>2.0.CO;2.
[AM-FI96]B. F. Farrell and P. J. Ioannou. Generalized stability theory. Part II: Nonautonomous operators. Journal of Atmospheric Sciences, 53(14):2041–2053, 1996. doi:10.1175/1520-0469(1996)053<2041:GSTPIN>2.0.CO;2.
[AM-HRTV07]V. Hernandez, J. E. Roman, A. Tomas, and V. Vidal. Krylov-Schur methods in SLEPc. Technical Report STR-7, Universitat Politècnica de València, 2007.
[AM-HRV05]V. Hernandez, J. E. Roman, and V. Vidal. SLEPc: a scalable and flexible toolkit for the solution of eigenvalue problems. ACM Transactions on Mathematical Software, 31(3):351–362, 2005. doi:hernandez2005.
[AM-Lan50]C. Lanczos. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. Journal of Research of the National Bureau of Standards, 45(4):255–282, 1950.
[AM-Lei10]R. Leine. The historical development of classical stability concepts: Lagrange, Poisson and Lyapunov stability. Nonlinear Dynamics, 59(1):173–182, 2010. doi:10.1007/s11071-009-9530-z.
[AM-Lya92]A. M. Lyapunov. The General Problem of the Stability of Motion. Control Theory and Applications Series. Taylor & Francis, 1892. ISBN 978-0748400621. Translated by A. T. Fuller.
[AM-Par92]P. C. Parks. A. M. Lyapunov’s stability theory—100 years on. IMA Journal of Mathematical Control and Information, 9(4):275–303, 1992. doi:10.1093/imamci/9.4.275.
[AM-Ste01]G. Stewart. A Krylov–Schur algorithm for large eigenproblems. SIAM Journal on Matrix Analysis and Applications, 23(3):601–614, 2001. doi:10.1137/S0895479800371529.
[AM-WS00]K. Wu and H. Simon. Thick-restart Lanczos method for large symmetric eigenvalue problems. SIAM Journal on Matrix Analysis and Applications, 22(2):602–616, 2000. doi:10.1137/S0895479898334605.