# Learning epidemic trajectories through Kernel Operator Learning: from modelling to optimal control
**Authors**: Giovanni Ziarelli, Nicola Parolini, Marco Verani
MOX
(April 2024)
Abstract
From the moment infectious pathogens start spreading in a susceptible population, mathematical models can provide policy makers with reliable forecasts and scenario analyses, which can be concretely implemented or solely consulted. In these complex epidemiological scenarios, machine learning architectures can play an important role, since they directly reconstruct data-driven models, circumventing the specific modelling choices and parameter calibration typical of classical compartmental models. In this work, we discuss the efficacy of Kernel Operator Learning (KOL) to reconstruct population dynamics during epidemic outbreaks, where the transmission rate is ruled by an input strategy. In particular, we introduce two surrogate models, named KOL-m and KOL- $∂$ , which reconstruct in two different ways the evolution of the epidemics. Moreover, we evaluate the generalization performances of the two approaches with different kernels, including Neural Tangent Kernels, and compare them with a classical neural-network-based model learning method. Employing synthetic but semi-realistic data, we show how the two introduced approaches are suitable for realizing fast and robust forecasts and scenario analyses, and how they are competitive for determining optimal intervention strategies with respect to specific performance measures.

Keywords: epidemiology; operator learning; scenario analysis; optimal epidemic control; kernel regression

Highlights:
- We formalize the novel Kernel Operator Learning (KOL) framework in the context of epidemic problems;
- We build two KOL strategies for estimating susceptible, infectious and recovered individuals given an input level of reduction of transmission rate;
- We numerically test the two approaches on synthetic data for assessing their reliability in making scenario-analyses;
- We compare the solutions of two optimal control problems on standard compartmental models with respect to the solutions retrieved with KOL approaches.
1 Introduction
The recent global SARS-CoV-2 pandemic has underlined the paramount importance of developing mathematical models and numerical schemes for predicting and forecasting epidemic illnesses: from the perspective of policy makers, it is often useful to dispose of qualitative and quantitative results for making scenario analyses and forecasts; from the social point of view, sharing information about possible outcomes can be beneficial in order to increase social awareness and public knowledge of the current spreading of the illness and its future evolution. A typical approach relies on traditional compartmental mathematical frameworks, where specific modelling choices and parameters embody the different features that characterize the epidemic spreading, the virological effects of the illness and the impact of different pharmaceutical interventions. However, in the presence of new epidemic outbreaks many key features could still be unknown or difficult to isolate in clinical trials, and consequently the illness itself could be difficult to describe completely with classical compartmental models. Indeed, clinical symptoms of different illnesses are multifaceted and strictly depend on the origin of the pathogenic microbial agent responsible for the disease, which can be bacterial, parasitic, fungal, viral or originated by prions (i.e. other kinds of toxic proteins), and on the pathway through which the illness naturally diffuses [1, 2]. Moreover, in order to accurately describe the disease through compartmental models, it is fundamental to account for possible DNA or RNA mutations from the wild-type strain in long-term outbreaks, as well as for possible preventive measures and controls, including vaccination, treatments, prophylaxis, quarantine, isolation or other measures minimizing social activities (such as the use of face masks, compulsory home schooling or different levels of lockdown).
In these highly complex and rapidly changing scenarios, the efficacy of compartmental models for making fast scenario-analyses may be severely limited by the delicate and sometimes ad-hoc parameter calibration process that becomes even harder if one aims at embodying age-dependency or other geographical features, see, e.g., [3, 4, 5].
Alongside scenario analysis, recent epidemic events have shown the importance of disposing of computational tools measuring the impact of pharmaceutical resources [6] and other Non-Pharmaceutical Interventions (NPIs) [7], so as to guide policy makers in choosing how to intervene while limiting the social and economic burden. From the mathematical perspective, we can leverage the versatility of optimal control theory to derive useful quantitative and qualitative guidelines for minimizing the number of infectious or deceased individuals [8, 9], the total incidence of the spreading disease [10], or the amount of contacts and, consequently, of cases [11]. Other problems that have been further investigated, analytically and numerically, are more delicate from the mathematical viewpoint, such as the minimization of epidemic peaks [12] or of the eradication time [13].
In view of the above discussion, it is of paramount importance to provide society with mathematical tools able to output computationally cheap and reliable scenario analyses, so as to compare different prevention measures and solve optimal control problems. Among recently developed mathematical frameworks, a prominent position is occupied by Operator Learning, which, roughly speaking, deals with the development and application of algorithms designed to build approximations of operators starting from a given set of input/output pairs. In the family of Operator Learning tools, increasing attention has been devoted to the so-called Deep Operator Networks (DeepONets), introduced in [14]. Since then, other machine learning algorithms exploiting deep neural networks have been developed, such as PINN-DeepONets [15] and Fourier Neural Operators [16]. More recently, in [17] Kernel Operator Learning (KOL) has been proposed as a competitive alternative to the previous approaches in terms of cost-accuracy trade-off, matching (and sometimes outperforming) on several benchmark problems the generalization properties of learning methods based on neural networks. Moreover, the simple closed formula of the learnt kernel operator makes KOL very attractive.
In this work we propose two different numerical approaches based on Kernel Operator Learning (KOL) [17] that, starting from epidemic data, provide surrogate models describing the dependency of the different stages of the illness on a given control function representing NPIs. To the best of our knowledge, these approaches are new in the epidemic context. Moreover, the numerical experiments contained in the sequel of the paper show that the presented approaches can be efficiently employed for making fast scenario analyses and solving optimization tasks, since they can be rapidly trained directly from data, circumventing delicate calibration phases. For the sake of simplicity and to present the main features of our approaches, we work with synthetic data generated by standard epidemic differential models governed by control functions modelling NPIs (or any other effect reducing the transmissibility of the illness), which we assume to be given.
As pointed out in [17], the performance of KOL depends on the choice of the kernel employed to perform the regression tasks, see also [18]. Very recently, Neural Tangent Kernels (NTKs) have been introduced in a series of pioneering works (cf. [19, 20, 21]) that opened the door to a very intensive research activity. Loosely speaking, NTKs arise from the connection with infinite-width neural networks: in the infinite-width limit, the generalization properties of the neural network can be described by those of the associated NTK. In view of this connection, and encouraged by the results presented in [19, 22], in this paper we employ NTKs in the construction of our kernel operator approaches and validate their efficacy through a wide campaign of numerical tests.
The paper is organized as follows. In Section 2, we synthetically illustrate and derive the mathematical formulation of our KOL approaches and we briefly recall the compartmental models that we adopt for generating the numerical data to test the methods. In Section 3 we present and discuss different numerical tests for assessing the generalization properties of our approaches together with their efficacy in providing solutions to optimization tasks.
2 Materials and Methods
This section is organized in three parts: Section 2.1 is devoted to introducing Kernel Operator Learning in the epidemic context, whilst Section 2.2 briefly recalls the standard compartmental epidemic models that will be employed to produce the synthetic data feeding the KOL. In Section 2.3 we gather some preliminary considerations about the numerical technicalities for retrieving both KOL regressors.
2.1 KOL and epidemic modelling: Basic principles
Figure 1: Kernel Operator Learning (KOL) diagram. Operating among RKHSs justifies the reduction of the problem to learning the behaviour of the vector-valued function $f^{\dagger}$ , operating between input/output observations (cf. Appendix B).
We briefly summarize some of the principal results on Kernel Operator Learning (KOL) contained in [17], that will be instrumental to build surrogate epidemic models and to solve optimal control problems.
To this aim, let us consider two possibly infinite-dimensional Hilbert spaces $\mathcal{U},\mathcal{V}$ and assume there exists an unknown operator mapping between the two spaces, i.e.
$$
\mathcal{G}:\mathcal{U}\rightarrow\mathcal{V}. \tag{1}
$$
Roughly speaking, the goal of operator learning is to approximate $\mathcal{G}$ based on pairs of input/output that are accessible through finite dimensional linear measurements, as formalized by the following:
**Problem**
*Let $\{u_{i},v_{i}\}_{i=1}^{N}$ be $N$ samples in $\mathcal{U}×\mathcal{V}$ , i.e.
$$
\mathcal{G}(u_{i})=v_{i},\;\mathrm{with}\,i=1,2\ldots N. \tag{2}
$$
Moreover, define the (bounded and linear) observation operators $\phi:\mathcal{U}→\mathbb{R}^{n}$ and $\varphi:\mathcal{V}→\mathbb{R}^{m}$ acting on the input and the target functions, respectively. The goal of operator learning is the approximation of the operator $\mathcal{G}$ based on the observation input/output pairs $\{\phi(u_{i}),\varphi(v_{i})\}_{i=1}^{N}$ .*
In the rest of the section, to ease the reading, we restrict the presentation to the scalar case, while the extension to the vector-valued case, relevant for the numerical examples in the following sections, can be straightforwardly obtained from the scalar one by resorting to the theory of vector-valued Reproducing Kernel Hilbert Spaces (RKHS) in [23]. Therefore, let us consider $\mathcal{U}$ and $\mathcal{V}$ as functional spaces made of scalar functions, where the dependent variable is named $t$ and is assumed to vary in an interval $D⊂\mathbb{R}$ . In this scenario, a standard choice for the observation operators $\phi$ and $\varphi$ consists in the pointwise evaluation at specific collocation points $\{t_{k}\}_{k=1}^{n}$ and $\{\tilde{t}_{k}\}_{k=1}^{m}$ respectively, which in general can be different (see Figure 1). However, for simplicity, in this work we consider the same collocation points, i.e.
$$
\begin{split}
\phi &: u \rightarrow U := (u(t_{1}), u(t_{2}),\ldots, u(t_{n}))^{T} \in \mathbb{R}^{n},\\
\varphi &: v \rightarrow V := (v(t_{1}), v(t_{2}),\ldots, v(t_{n}))^{T} \in \mathbb{R}^{n}.
\end{split} \tag{3}
$$
Then, assuming we are given the training dataset $\{U_{j},V_{j}\}_{j=1}^{N}$ where, consistently with our notation, $U_{j}:=(u_{j}({t}_{1}),u_{j}({t}_{2})... u_{j}({t}_{n}))^{T}∈\mathbb{R}^{n}$ and $V_{j}:=(v_{j}(t_{1}),v_{j}(t_{2})... v_{j}(t_{n}))^{T}∈\mathbb{R}^{n}$ , we aim at constructing an approximation $\bar{\mathcal{G}}$ of $\mathcal{G}$ . More specifically, following [17], endowing $\mathcal{U}$ and $\mathcal{V}$ with a RKHS structure and using kernel regression to identify the maps $\psi$ and $\chi$ (cf. Figure 1 and see Appendix B for more details), the approximated operator can be written explicitly in closed form as
$$
\bar{\mathcal{G}}(u)(t)=K(t,T)\,K(T,T)^{-1}\left(\sum_{j=1}^{N}S(\phi(u),U_{j})\,\alpha_{j}\right), \tag{4}
$$
where $K$ is the kernel function induced by the RKHS structure of $\mathcal{V}$ , the vector $T=[t_{1},t_{2},\ldots,t_{n}]^{T}∈\mathbb{R}^{n}$ contains the collocation points, and $S:\mathbb{R}^{n}×\mathbb{R}^{n}→\mathbb{R}$ is a properly chosen kernel acting on the observation vectors. Moreover, $K(·,T):D→\mathbb{R}^{n}$ is a row vector such that $K(t,T)_{i}=K(t,t_{i})$ , and $K(T,T)$ is an $n× n$ matrix such that $K(T,T)_{ij}=K(t_{i},t_{j})$ . The parameters $\{\alpha_{j}\}_{j=1}^{N}$ , with $\alpha_{j}∈\mathbb{R}^{n}$ , are the kernel regression coefficients over the input/output training pairs. Since we consider pointwise observation operators, $K(t,T)K(T,T)^{-1}$ acts as the linear interpolant of the nodes in $T$ evaluated at the desired time $t$ . We refer to [17] for a complete discussion of the convergence properties of this approach and for the formal derivation of the learnt operator in a more general functional framework.
Although the above framework is quite general and can be applied to a variety of problems, here, given the goal of the paper, we embody it in the context of reconstructing processes ruled by dynamical systems steered by control variables. More precisely, we consider the following general scalar dynamical system ruled by the control function $u$
$$
\begin{cases}
\dot{v}(t)=F(v(t),u(t)), \;\forall t\in(0,t^{*}]\\
v(0)=v_{0},
\end{cases} \tag{5}
$$
where $F:\mathbb{R}×\mathbb{R}→\mathbb{R}$ is assumed to be sufficiently smooth to guarantee the well-posedness of the problem [24]. Clearly, we can associate to (5) the mapping $\mathcal{G}$ that given the control $u$ returns the solution $v$ . This mathematical framework fits a typical epidemic context, where it is often necessary to take into account the impact of different NPIs acting in reducing the contact rate among vulnerable people, such as, for instance, partial or total lockdowns, the mandatory use of face-masks in public spaces or the isolation of people showing mild symptoms which could be linked to the illness of concern.
Now, we are ready to introduce our strategies. The first strategy, named KOL-m, consists in directly determining an approximation of the solution map $\cal{G}$ from $u(t)$ to $v(t)$ , as described in the following.
**KOL-m**
*Let $\mathcal{U}=\{u∈ L^{2}(0,t^{*})\}$ and $\mathcal{V}=C^{0}([0,t^{*}])$ . Given the input-output data trajectories $\{(\hat{u}_{k},\hat{v}_{k})\}_{k=1}^{N}$ , where each $\hat{u}_{k}∈\mathcal{U}$ is a control function and each $\hat{v}_{k}∈\mathcal{V}$ is the associated (known) state vector function, the Kernel Operator Learning Map method (KOL-m) builds $\mathcal{\bar{G}}_{m}:\mathcal{U}→\mathcal{V}$ according to (4).*
Hence, the quantity $\mathcal{\bar{G}}_{m}\left(u^{*}\right)(t)$ represents an approximation to the solution $v(t)$ of (5) with control variable $u^{*}(t)$ .
Let us now describe the second approach, namely KOL- $∂$ , which approximates the operator that given the prescribed control returns the derivative of the solution to (5).
**KOL-∂**
*Let $\mathcal{U}=\{u∈ L^{2}(0,t^{*})\}$ and $\hat{\mathcal{V}}=L^{1}(0,t^{*})$ . Given the input-output data trajectories $\{(\hat{u}_{k},\dot{\hat{v}}_{k})\}_{k=1}^{N}$ , where each $\hat{u}_{k}∈\mathcal{U}$ is a control function and each $\dot{\hat{v}}_{k}∈\hat{\mathcal{V}}$ is the associated (known) derivative of the state vector function, the Kernel Operator Learning Derivative method (KOL- $∂$ ) builds $\mathcal{\bar{G}_{∂}}:\mathcal{U}→\hat{\mathcal{V}}$ according to (4).*
Therefore, the quantity $\mathcal{\bar{G}}_{∂}\left(u^{*}\right)(t)$ represents an approximation to $\dot{v}(t)$ , i.e. the derivative of the solution to (5) with control variable $u^{*}(t)$ . Specifically, we have
$$
v(t)=v_{0}+\int_{0}^{t}\bar{\mathcal{G}}_{\partial}(u)(\tau)\,d\tau\;\in\mathcal{V},\;\mathrm{where}\;\bar{\mathcal{G}}_{\partial}:\mathcal{U}\rightarrow\hat{\mathcal{V}}\;\mathrm{satisfies}\;(4), \tag{6}
$$
or alternatively
$$
\begin{cases}
\begin{split}
\dot{v}(t)&=K(t,T)K(T,T)^{-1}\left(\sum_{j=1}^{N}S(\phi(u),U_{j})\,\alpha_{j}\right)\\
&=K(t,T)K(T,T)^{-1}\left(\sum_{j=1}^{N}S(\phi(u),U_{j})
\begin{pmatrix}
[\textbf{S}(\textbf{U},\textbf{U})^{-1}\hat{\textbf{V}}_{\cdot,1}]_{j}\\
[\textbf{S}(\textbf{U},\textbf{U})^{-1}\hat{\textbf{V}}_{\cdot,2}]_{j}\\
\vdots\\
[\textbf{S}(\textbf{U},\textbf{U})^{-1}\hat{\textbf{V}}_{\cdot,n}]_{j}
\end{pmatrix}\right)\;\;\forall t\in(0,t^{*}]
\end{split}\\
v(0)=v_{0},
\end{cases} \tag{7}
$$
where $\hat{\textbf{V}}_{\cdot,k}=[[\varphi(\dot{\hat{v}}_{1})]_{k},[\varphi(\dot{\hat{v}}_{2})]_{k},\ldots,[\varphi(\dot{\hat{v}}_{N})]_{k}]^{T}$ , for all $k=1,2,\ldots,n$ , is the vector of the evaluations at the $k$ -th collocation point of each output, $\textbf{U}$ is a vector collecting all the $\{U_{j}\}_{j=1}^{N}$ and $\textbf{S}(\textbf{U},\textbf{U})∈\mathbb{R}^{N× N}$ is defined as $[\textbf{S}(\textbf{U},\textbf{U})]_{ij}=S(U_{i},U_{j})$ (cf. Appendix B). In this case we assume that an accurate approximation of the derivative of each state $\{\dot{\hat{v}}_{i}\}_{i=1}^{N}$ is observable, hence the sequence $\{\varphi(\dot{\hat{v}}_{i})\}_{i=1}^{N}$ is available (cf. Figure 1).
We conclude this section with a methodological remark on the application of the KOL-m approach to the reconstruction of compartments when dealing with epidemic problems or other problems where the reconstructed operator has to preserve positivity. This modified version of KOL-m will be employed to derive the numerical results presented in the subsequent sections.
**Remark (On the positivity preserving property of KOL-m)**
*The solutions to the epidemic differential problems are intrinsically positive, since they represent positive fractions of the population corresponding to different states with respect to the illness spreading. However, there is no reason why KOL-m should preserve the positivity of the prediction, even in the presence of positive data $\{\hat{\textbf{x}}_{k}\}$ . For this reason, in order to enforce the positivity of the prediction of the learnt operator $\mathcal{\bar{G}}_{m}$ , we proceed as follows: given $\{(\hat{u}_{k},\hat{\textbf{x}}_{k})\}_{k≤ N}$ , with positive $\hat{\textbf{x}}_{k}$ , we apply KOL-m to the modified input-output dataset $\{(\hat{u}_{k},\sqrt{\hat{\textbf{x}}_{k}})\}_{k≤ N}$ so as to obtain the intermediate operator $\mathcal{\widetilde{G}}_{m}$ . Then, we set $\mathcal{\bar{G}}_{m}=\mathcal{\widetilde{G}}^{2}_{m}$ . The efficiency of this approach is shown in Section 3.*
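The remark above amounts to a generic wrapper around any KOL-m fit/predict pair: train the intermediate operator on the componentwise square roots, then square its prediction. A minimal sketch, where the `fit` and `predict` callables are placeholders for an actual KOL-m implementation:

```python
import numpy as np

def positive_kol(fit, predict, U, X):
    # Enforce nonnegative predictions: fit the intermediate operator ~G_m on
    # sqrt(X) (componentwise), then return G_bar_m = (~G_m)^2.
    alpha = fit(U, np.sqrt(X))
    return lambda u_star: predict(u_star, U, alpha) ** 2
```

Squaring guarantees positivity regardless of the sign of the intermediate regression output.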
2.2 Overview of compartmental models
In order to test KOL-m and KOL- $∂$ we generate synthetic data employing classical compartmental models and use the latter to validate the accuracy of the surrogate models. More precisely, for simplicity and without loss of generality, we restrict our attention to the four classical compartmental models reported in Figure 2, where the control variable $u$ represents the instantaneous reduction, obtained by reducing the contact rate, of the basic transmission rate $\beta$ dictated by the virological and transmissibility properties of the illness. We assume that $u_{lb}≤ u(t)≤ u_{ub}$ for all $t∈[0,t^{*}]$ , where the upper bound $u_{ub}=1$ models total lockdown, while the lower bound $u_{lb}=0$ stands for null NPIs. For suitable choices of the function $\textbf{F}:\mathbb{R}^{d}×\mathbb{R}→\mathbb{R}^{d}$ , all these models (and other high-dimensional disease-specific compartmental models, e.g. [25, 26, 5]) can be written in the following general form:
$$
\begin{cases}
\dot{\boldsymbol{x}}(t)=\textbf{F}(\boldsymbol{x}(t),u(t)),\;\forall t\in(0,t^{*}]\\
\boldsymbol{x}(0)=\boldsymbol{x}_{0}.
\end{cases} \tag{8}
$$
Clearly, the solution of (8) can be written in terms of the operator $\mathcal{G}$ that, given the input control $u(t):[0,t^{*}]→\mathbb{R}$ , returns the state $\boldsymbol{x}(t):[0,t^{*}]→\mathbb{R}^{d}$ . For each model, the evolution function **F** is sufficiently smooth for all $t∈[0,t^{*}]$ to guarantee existence and uniqueness of the solution of the Cauchy problem (8) when $u$ admits at most a countable number of jump discontinuities (see, e.g., [27]). For a more comprehensive review on epidemiological models, we refer, e.g., to [1, 2].
$SIR$ model ( $\mathcal{R}_{0}=\frac{\beta}{\gamma}$ , $\mathcal{R}_{u}=\frac{\beta(1-u)}{\gamma}$ ):
$$
\begin{cases}
\dot{S} = -\beta(1 - u)SI \\
\dot{I} = \beta(1 - u)SI - \gamma I \\
\dot{R} = \gamma I
\end{cases}
\;\forall t \in (0, t^*], \quad [S(0), I(0), R(0)]^T = [S_0, I_0, R_0]^T;
$$
$SIS$ model ( $\mathcal{R}_{0}=\frac{\beta}{\gamma}$ , $\mathcal{R}_{u}=\frac{\beta(1-u)}{\gamma}$ ):
$$
\begin{cases}
\dot{S} = -\beta(1 - u)SI + \gamma I \\
\dot{I} = \beta(1 - u)SI - \gamma I
\end{cases}
\;\forall t \in (0, t^*], \quad [S(0), I(0)]^T = [S_0, I_0]^T;
$$
$SIRD$ model ( $\mathcal{R}_{0}=\frac{\beta}{\gamma+\varepsilon}$ , $\mathcal{R}_{u}=\frac{\beta(1-u)}{\gamma+\varepsilon}$ ):
$$
\begin{cases}
\dot{S} = -\beta(1 - u)SI + \delta R \\
\dot{I} = \beta(1 - u)SI - (\gamma + \varepsilon)I \\
\dot{R} = \gamma I - \delta R \\
\dot{D} = \varepsilon I
\end{cases}
\;\forall t \in (0, t^*], \quad [S(0), I(0), R(0), D(0)]^T = [S_0, I_0, R_0, D_0]^T;
$$
$SEIRD$ model ( $\mathcal{R}_{0}=\frac{\beta}{\gamma+\varepsilon+\phi}$ , $\mathcal{R}_{u}=\frac{\beta(1-u)}{\gamma+\varepsilon+\phi}$ ):
$$
\begin{cases}
\dot{S} = -\beta(1 - u)SI + \delta R \\
\dot{E} = \beta(1 - u)SI - \phi E \\
\dot{I} = \phi E - (\gamma + \varepsilon)I \\
\dot{R} = \gamma I - \delta R \\
\dot{D} = \varepsilon I
\end{cases}
\;\forall t \in (0, t^*], \quad [S(0), E(0), I(0), R(0), D(0)]^T = [S_0, E_0, I_0, R_0, D_0]^T.
$$
Figure 2: $SIR$ , $SIS$ , $SIRD$ and $SEIRD$ compartmental models. Each model is provided with its respective basic reproduction number ( $\mathcal{R}_{0}$ ) and the reproduction number depending on the control ( $\mathcal{R}_{u}$ ).
2.3 Numerical aspects of KOL-m and KOL- $∂$
In this section we collect some useful information on the computational aspects involved in the training process of KOLs. For the purpose of this work we assume that the space of the discrete counterparts of the control functions and the space of the discrete vectors coming from the evaluation of the solution have the same dimension, i.e. $m=n$ in Figure 1. It is worth remarking that, in the epidemic context, it is natural to consider pointwise evaluation as the observation operator for both the state and the control, which is consistent with the way open-access datasets for epidemic events are often organized and real-time forecasts are delivered (see, e.g., [28]).
For what concerns the generation of synthetic data, we employ $N$ different control functions to be observed via the observation operators, together with the $N$ associated compartmental trajectories (vectors in $\mathbb{R}^{d}$ ) obtained by numerically solving one of the differential systems in Figure 2 with the explicit Euler method with time step $dt$ .
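For concreteness, the data-generating step can be sketched for the controlled SIR model of Figure 2, with explicit Euler and the parameter values used in Section 3 ( $\mathcal{R}_{0}=4$ , $\gamma=0.05$ , hence $\beta=\mathcal{R}_{0}\gamma=0.2$ , 1% initially infected); the function and argument names are illustrative:

```python
import numpy as np

def simulate_sir(u, t_star=100.0, dt=1.0, beta=0.2, gamma=0.05, i0=0.01):
    # Explicit Euler for the controlled SIR model of Figure 2:
    #   S' = -beta (1 - u) S I,  I' = beta (1 - u) S I - gamma I,  R' = gamma I.
    # u is a callable control function t -> u(t) in [0, 1].
    n = int(t_star / dt)
    t = np.linspace(0.0, t_star, n + 1)
    S, I, R = np.empty(n + 1), np.empty(n + 1), np.empty(n + 1)
    S[0], I[0], R[0] = 1.0 - i0, i0, 0.0
    for k in range(n):
        new_inf = beta * (1.0 - u(t[k])) * S[k] * I[k]
        S[k + 1] = S[k] - dt * new_inf
        I[k + 1] = I[k] + dt * (new_inf - gamma * I[k])
        R[k + 1] = R[k] + dt * gamma * I[k]
    return t, np.stack([S, I, R], axis=1)
```

Applying the observation operators then amounts to reading off the columns of the returned trajectory at the collocation times.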
From the computational perspective, once the scalar kernel $S(·,·)$ between the two discrete vector spaces has been chosen, we need to solve $n$ linear systems of dimension $N× N$ whose matrix is $\textbf{S}(\textbf{U},\textbf{U})$ (cf. Appendix A). For this purpose, we compute the Cholesky factorization of the matrix and solve the systems with standard substitution methods (cf. [29] for more advanced strategies). Finally, in solving the regression problem we add a regularization term, with penalty parameter equal to $10^{-10}$ .
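The linear-algebra step just described can be sketched as follows, sharing a single Cholesky factorization across the $n$ right-hand sides (the function name is illustrative):

```python
import numpy as np

def solve_kernel_systems(S, V_hat, reg=1e-10):
    # Solve the n systems S(U,U) alpha = V_hat (one column of V_hat per
    # collocation point) via one Cholesky factorization, with the small
    # diagonal (Tikhonov) regularization of Section 2.3.
    L = np.linalg.cholesky(S + reg * np.eye(S.shape[0]))
    y = np.linalg.solve(L, V_hat)     # forward substitution:  L y = V_hat
    return np.linalg.solve(L.T, y)    # backward substitution: L^T alpha = y
```

Factoring once and reusing the triangular factors costs $O(N^3 + nN^2)$ instead of $O(nN^3)$ for $n$ independent solves.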
3 Results and Discussion
In this section we present a wide campaign of numerical tests with the aim of: (a) understanding the impact of the choice of different kernels in KOL-m and KOL- $∂$ (cf. Section 3.1); (b) comparing our KOL methods with a popular neural-network based model learning method (cf. Section 3.2); (c) assessing the robustness of the introduced approaches for solving two optimization tasks (cf. Section 3.3).
3.1 On the choice of the Kernel
One crucial ingredient driving the approximation and generalization properties of the corresponding KOL method is the choice of the scalar kernel $S$ in (35). Which kernel function is optimal for kernel regression is still an open question and depends on the specific application. For instance, there exist innovative approaches that learn the kernels by simulating data-driven dynamical systems, enabling scalability of kernel regression [30]. Here, we consider the following popular choices for $S$ .
- Linear Kernel: $S(U_{1},U_{2})=U_{1}^{T}U_{2}$ . This kernel evaluates the alignment in the $n$ -dimensional space between input vectors;
- Matérn Kernel: $S(U_{1},U_{2})=\dfrac{2^{1-\nu}}{\Gamma(\nu)}\left(\sqrt{2\nu}\dfrac{\|U_{1}-U_{2}\|_{2}}{\rho}\right)^{\nu}K_{\nu}\left(\sqrt{2\nu}\dfrac{\|U_{1}-U_{2}\|_{2}}{\rho}\right)$ , where $\nu>0$ controls the smoothness of the kernel function, $\rho$ is a characteristic length scale, $\Gamma$ is the Gamma function and $K_{\nu}$ is the modified Bessel function of the second kind of order $\nu$ . This family of kernels is stationary and isotropic, since it depends only on the Euclidean distance between input points. For $\nu→∞$ one obtains the Gaussian kernel. It is often used for image analysis [31] and other machine learning regression tasks [32];
- RBF kernel: $S(U_{1},U_{2})=e^{-\frac{\|U_{1}-U_{2}\|_{2}^{2}}{2\sigma^{2}}}$ , also known as the Gaussian kernel. It can be interpreted as a similarity measure, since it is bounded in $[0,1]$ and decreases as the distance between points increases;
- Rational Quadratic kernel: $S(U_{1},U_{2})=\left(1+\dfrac{\|U_{1}-U_{2}\|_{2}^{2}}{2\alpha l^{2}}\right)^{-\alpha}$ , with $l,\alpha>0$ . It can be regarded as an infinite sum of RBF kernels with different length scales;
- Neural Tangent Kernel (NTK): Given a neural network regressor $f(x;\theta)$ of depth $d_{nn}$ , width $l_{nn}$ and activation function $\sigma_{nn}$ , with $\theta$ denoting the vector collecting all weights and biases, we define the family of finite-width Neural Tangent Kernels $\{S_{\tau}\}_{\tau>0}:\mathbb{R}^{n}×\mathbb{R}^{n}→\mathbb{R}$ (cf. Appendix A) as
$$
S_{\tau}(x_{i},x_{j}):=\langle\partial_{\theta}f(x_{j};\theta(\tau)),\,\partial_{\theta}f(x_{i};\theta(\tau))\rangle, \tag{9}
$$
where $\tau$ represents a fictitious iteration time. It has been proven that, if the initialization of the weights follows the so-called NTK initialization [20], in the infinite-width limit each element in $\{S_{\tau}\}_{\tau}$ converges in probability to a stationary kernel independently of $\tau$ , i.e.
$$
S_{\tau}(x_{i},x_{j})\underset{\mathbb{P}}{\rightarrow}S(x_{i},x_{j}),\;\forall\,\tau>0,\;\forall\,x_{i},x_{j}. \tag{10}
$$
Hence, the family of NTKs strictly depends on two parameters: the activation function and the depth of the associated neural network.
In this paper, with NTK we refer to the infinite-width-limit kernel function. We rely on the library presented in [33] for an efficient and user-friendly computation of the considered NTKs. In practice, given the network hyperparameters, we compute the scalar NTK $S:\mathbb{R}^{d}×\mathbb{R}^{d}→\mathbb{R}$ and then evaluate it at the desired pairs of points. More details about the derivation of the NTK associated to a neural network can be found in Appendix A.
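For reference, the non-NTK kernels listed above admit direct implementations (here vector-to-scalar; the Matérn kernel is specialized to $\nu=3/2$, for which the Bessel factor has a closed form, to keep the sketch dependency-free; parameter defaults are illustrative):

```python
import numpy as np

def linear(U1, U2):
    # Alignment between the two input vectors.
    return U1 @ U2

def rbf(U1, U2, sigma=1.0):
    # Gaussian kernel: similarity in [0, 1] decaying with distance.
    return np.exp(-np.sum((U1 - U2) ** 2) / (2.0 * sigma ** 2))

def rational_quadratic(U1, U2, l=1.0, a=1.0):
    # Infinite mixture of RBF kernels with different length scales.
    return (1.0 + np.sum((U1 - U2) ** 2) / (2.0 * a * l ** 2)) ** (-a)

def matern32(U1, U2, rho=1.0):
    # Matern kernel with nu = 3/2, where the general formula reduces to
    # (1 + sqrt(3) r / rho) exp(-sqrt(3) r / rho).
    r = np.linalg.norm(U1 - U2)
    return (1.0 + np.sqrt(3.0) * r / rho) * np.exp(-np.sqrt(3.0) * r / rho)
```

The NTK is not reproduced here, since the text computes it through the library of [33].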
For each of the above kernels, we evaluate the generalization properties of KOL-m and KOL- $∂$ trained on the epidemic $SIS$ , $SIR$ and $SEIRD$ trajectories generated by control functions chosen among four distinct functional families (see Figure 3), which are representative of possible interventions that policy makers can implement:
1. Linear Pulse (Figure 3(a)):
$$
u(t)=\begin{cases}
u_{0}, & t\leq t_{0},\\
\dfrac{3(u_{1}-u_{0})}{\Delta t}(t-t_{0})+u_{0}, & t_{0}<t\leq t_{0}+\frac{\Delta t}{3},\\
u_{1}, & t_{0}+\frac{\Delta t}{3}<t\leq t_{0}+\frac{4\Delta t}{3},\\
\dfrac{3(u_{0}-u_{1})}{\Delta t}\left(t-t_{0}-\frac{4\Delta t}{3}\right)+u_{1}, & t_{0}+\frac{4\Delta t}{3}<t\leq t_{0}+\frac{5\Delta t}{3},\\
u_{0}, & t>t_{0}+\frac{5\Delta t}{3},
\end{cases} \tag{11}
$$
depending on 4 degrees of freedom (dofs): $u_{0},\,u_{1}∈[0,1]$ , $t_{0}∈[0,t^{*}]$ and $\Delta t∈[0,\frac{t^{*}}{3}]$ ;
2. Step function (Figure 3(b)):
$$
u(t)=\begin{cases}u_{0}&t\leq t_{0},\\
u_{1}&t>t_{0},\end{cases} \tag{12}
$$
with 3 dofs: $u_{0},\,u_{1}∈[0,1]$ , $t_{0}∈[0,t^{*}]$ ;
3. Continuous Seasonality (Figure 3(c)):
$$
u(t)=\dfrac{u_{0}}{2}\left(1+\dfrac{1}{2}\cos{\left(\dfrac{2\pi t}{t^{*}}+\dfrac{\Delta t}{t^{*}}\dfrac{\pi}{2}\right)}\right), \tag{13}
$$
with 2 dofs: $u_{0}∈[0,1]$ and $\Delta t∈[0,\frac{t^{*}}{3}]$ ;
4. Double step (Figure 3(d)):
$$
u(t)=\begin{cases}u_{0}&t\leq t_{0},\\
u_{1}&t_{0}<t\leq t_{0}+\frac{\Delta t}{2},\\
u_{0}&t_{0}+\frac{\Delta t}{2}<t\leq t_{0}+\Delta t,\\
u_{1}&t>t_{0}+\Delta t,\\
\end{cases} \tag{14}
$$
with 3 dofs: $u_{0},\,u_{1}∈[0,1]$ , $t_{0}∈[0,t^{*}]$ .
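As an illustration, the four families (11)-(14) can be sampled on a discrete time grid with a few NumPy functions. The function names and the vectorized evaluation over the grid are our own conventions; parameter meanings follow the equations above (with $\Delta t>0$ assumed for the linear pulse).

```python
import numpy as np

def linear_pulse(t, u0, u1, t0, dT):
    """Linear Pulse (eq. (11)): ramp up over dT/3, plateau at u1,
    then ramp back down to u0. Assumes dT > 0."""
    u = np.full_like(t, u0, dtype=float)
    rise = (t > t0) & (t <= t0 + dT/3)
    plateau = (t > t0 + dT/3) & (t <= t0 + 4*dT/3)
    fall = (t > t0 + 4*dT/3) & (t <= t0 + 5*dT/3)
    u[rise] = 3*(u1 - u0)/dT * (t[rise] - t0) + u0
    u[plateau] = u1
    u[fall] = 3*(u0 - u1)/dT * (t[fall] - t0 - 4*dT/3) + u1
    return u

def step(t, u0, u1, t0):
    """Step function (eq. (12))."""
    return np.where(t <= t0, u0, u1)

def seasonality(t, u0, dT, t_star=100.0):
    """Continuous Seasonality (eq. (13))."""
    return u0/2 * (1 + 0.5*np.cos(2*np.pi*t/t_star + dT/t_star * np.pi/2))

def double_step(t, u0, u1, t0, dT):
    """Double step (eq. (14)): u0 / u1 / u0 / u1 over four intervals."""
    return np.where(t <= t0, u0,
           np.where(t <= t0 + dT/2, u1,
           np.where(t <= t0 + dT, u0, u1)))
```

Sampling the dofs uniformly in the intervals above and evaluating any of these functions on the grid $t=0,1,\ldots,t^{*}$ produces one input sample for the mixed dataset.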
To test the generalization properties we proceed as follows. For each input control function $u^{n}$ , $n=1,2,\ldots,N_{p}$ , with $N_{p}$ the number of test points, we generate the output data $\{\boldsymbol{x}^{i,n}\}_{i,n}$ , $i=1,2,\ldots,N_{c}$ , with $N_{c}$ the number of compartments, by numerically solving the specific system of ODEs ( $SIS$ , $SIR$ , $SIRD$ or $SEIRD$ ) up to a final time $t^{*}=100$ , with a discretization step $dt=1$ . Such control/output pairs form the training and testing datasets. We define the prediction relative error as
$$
p_{\mathrm{err}}=\dfrac{1}{N_{p}}\sum_{n=1}^{N_{p}}\sum_{i=1}^{N_{c}}\dfrac{\|\boldsymbol{x}_{p}^{i,n}-\boldsymbol{x}^{i,n}\|_{2}}{\|\boldsymbol{x}^{i,n}\|_{2}}, \tag{15}
$$
where $\{\boldsymbol{x}_{p}^{i,n}\}$ are the predictions generated by the KOL methods. Clearly, the prediction samples belong to a different batch with respect to the training input data.
In all cases, the basic reproduction number is set to $\mathcal{R}_{0}=4$ and the recovery rate to $\gamma=0.05$ . The epidemic starts with 1% of infected (or, respectively, exposed) individuals; the rest of the population is assumed to be susceptible at the initial time. For the $SEIRD$ model, the remaining parameters have been chosen as $\delta=0.4,\,\varepsilon=\phi=0.05$ . The uncontrolled transmission rate $\beta$ is, therefore, computed from the definition of $\mathcal{R}_{0}$ of each model in Figure 2.
For the comparison of the generalization properties of KOL with different kernels, the training dataset is built from 500 control functions equally distributed among the above 4 classes, where the dofs are sampled from a uniform distribution over the respective interval of definition of each parameter. As for the prediction samples, we test the KOL approaches on a dataset of 100 samples equally distributed among the same families of control functions, but with different values of the dofs.
(a)
(b)
(c)
(d)
Figure 3: Examples of control functions employed to generate the training-testing dataset for the KOL methods.
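The dataset construction and kernel regression described above can be condensed into a schematic end-to-end sketch of the KOL-m idea on the $SIR$ model. Forward Euler integration, step-function controls only, a plain RBF kernel, and all hyperparameters below are our own illustrative simplifications, not the paper's actual setup:

```python
import numpy as np

rng = np.random.default_rng(1)
t_star, dt = 100, 1.0
t = np.arange(0, t_star + dt, dt)
beta, gamma = 4 * 0.05, 0.05            # R0 = 4, gamma = 0.05

def sir(u):
    """Forward-Euler SIR with controlled transmission beta*(1 - u(t))."""
    S, I = np.empty_like(t), np.empty_like(t)
    S[0], I[0] = 0.99, 0.01             # 1% initially infected
    for k in range(len(t) - 1):
        new_inf = beta * (1 - u[k]) * S[k] * I[k] * dt
        S[k+1] = S[k] - new_inf
        I[k+1] = I[k] + new_inf - gamma * I[k] * dt
    return I                             # learn the infectious curve only

def sample_step():
    """Random step-function control (eq. (12)) on the grid."""
    u0, u1, t0 = rng.uniform(), rng.uniform(), rng.uniform(0, t_star)
    return np.where(t <= t0, u0, u1)

def rbf(A, B, sigma=3.0):
    """RBF kernel between discretized control functions."""
    d2 = ((A[:, None, :] - B[None, :, :])**2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

# Training: kernel regression from control to infectious trajectory
U_tr = np.stack([sample_step() for _ in range(200)])
Y_tr = np.stack([sir(u) for u in U_tr])
K = rbf(U_tr, U_tr) + 1e-6 * np.eye(len(U_tr))   # small ridge for stability
alpha = np.linalg.solve(K, Y_tr)

# Testing on unseen controls from the same family
U_te = np.stack([sample_step() for _ in range(20)])
Y_te = np.stack([sir(u) for u in U_te])
Y_hat = rbf(U_te, U_tr) @ alpha
err = np.mean(np.linalg.norm(Y_hat - Y_te, axis=1)
              / np.linalg.norm(Y_te, axis=1))
```

Once `alpha` is computed, predicting a new scenario costs only one kernel evaluation against the training controls, which is what makes the approach attractive for fast scenario analyses.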
Results in Figures 4 - 6 show the prediction relative error of the different approaches. In particular, for each kernel we trained KOL-m and KOL- $∂$ with 20 different training datasets of size 500, as detailed above, and represented the results as boxplots. We perform the same error analysis for different compartmental models: $SIS$ (Figure 4), $SIR$ (Figure 5) and $SEIRD$ (Figure 6). The boxplots with blue median lines refer to the KOL- $∂$ approach, whilst the red ones correspond to the KOL-m approach. For those kernels depending on parameters to be tuned, such as RBF, Matérn and Rational Quadratic, we performed a sensitivity analysis over the parameter space (Matérn: $\nu∈[0,0.1],\,\rho∈[0.1,1]$ ; RBF: $\sigma∈[0,0.1]$ ; Rational Quadratic: $\alpha∈[0.01,0.1],\,l∈[0.01,1]$ ), which showed similar results. For what concerns the NTKs, which depend on the choice of the depth $d_{nn}$ and of the activation function $\sigma_{nn}$ of the associated neural network, we perform the same sensitivity analysis considering Neural Tangent Kernels with Rectified Linear Unit ( $ReLu$ ) and sigmoidal activation functions for different depths ( $d_{nn}∈\{2,3,4\}$ ). In the boxplots in Figures 4 - 6 we report the results of the NTKs corresponding to two-layer networks, which proved to be a suitable trade-off between computational complexity and generalization properties.
We notice that for the $SIS$ and $SIR$ models the prediction errors are higher than those obtained for the $SEIRD$ model. This seems to indicate that KOL methods generalize with lower errors as the model complexity grows. For the $SIS$ model, the operator approximated via the Linear Kernel does not succeed in surrogating the epidemic dynamics. On the other hand, KOL methods with the $ReLu$ -based NTK achieve the lowest median errors, always below 2% across the test samples. For the $SIR$ and $SEIRD$ models, instead, the KOLs with all the different kernels appear to be good surrogates in terms of generalization properties. Specifically, the KOL- $∂$ methods have better approximation properties, particularly the NTK- $ReLu$ -based one. For the 5-dimensional $SEIRD$ model, among the KOL-m approaches the lowest median error with the smallest IQR is attained by the NTK-sigmoidal one.
In view of these results, from now on we select the NTK-sigmoidal kernel for the KOL-m method and the NTK- $ReLu$ kernel for the KOL- $∂$ method.
Figure 4: ( $SIS$ model) Comparison of the prediction errors over 100 control functions with KOL methods trained on 20 batches of size 500, where the control functions are chosen in the mixed training dataset. Bullet points represent outliers whose prediction error is outside the 1.5 IQR (length of the whiskers) of the set of simulations.
Figure 5: ( $SIR$ model) Comparison of the prediction errors over 100 control functions with KOL methods trained on 20 batches of size 500, where the control functions are chosen in the mixed training dataset. Bullet points represent outliers whose prediction error is outside the 1.5 IQR (length of the whiskers) of the set of simulations.
Figure 6: ( $SEIRD$ model) Comparison of the prediction errors over 100 control functions with KOL methods trained on 20 batches of size 500, where the control functions are chosen in the mixed training dataset. Bullet points represent outliers whose prediction error is outside the 1.5 IQR (length of the whiskers) of the set of simulations.
3.2 Comparison between KOL and a Model Learning based on neural networks
In this section, we highlight some important features of our KOL approaches that make them reliable and competitive tools for model learning. To this aim, we compare KOL-m and KOL- $∂$ with a popular and paradigmatic machine learning approach based on neural networks, namely the approach described in [34], hereafter called NN-ModL and shortly described in what follows.
Let $\mathcal{F}=\{\boldsymbol{f}_{nn}:\mathbb{R}^{d}\times\mathbb{R}\rightarrow\mathbb{R}^{d}\}$ be the space of feed-forward neural network functions. The NN-ModL problem reads as follows:
**Problem 1 (NN-ModL )**
*Solve the constrained optimization problem
$$
\min_{\boldsymbol{f}_{nn}\,\in\mathcal{F}}\frac{1}{2}\sum_{j=1}^{N}\int_{0}^{t^{*}}|\textbf{x}_{j}(t)-\textbf{x}_{nn,j}(t)|^{2}dt, \tag{16}
$$
s.t.
$$
\begin{cases}\dot{\textbf{x}}_{nn,j}(t)=\textbf{f}_{nn}(\textbf{x}_{nn,j}(t),u_{j}(t))&t\in(0,t^{*}],\,j=1,2,\ldots,N,\\
\textbf{x}_{nn,j}(0)=\textbf{x}_{0},\end{cases} \tag{17}
$$
where $\boldsymbol{f}_{nn}\,∈\,\mathcal{F}$ and $N$ is the size of the training set.*
Hence, NN-ModL reconstructs the map from the control $u(t)$ to the state $\boldsymbol{x}_{nn}(t)$ . The problem is recast as a discrete optimization problem whose optimization variables are the weights and biases associated to $\boldsymbol{f}_{nn}$ . NN-ModL has shown valuable performances for building reduced models of cardiac electromechanics [35], and for deriving a data-driven model of active force generation in cardiomyocytes [36]. As illustrated in [34], the discretized finite-dimensional version of Problem 1 is equivalent to a nonlinear least-squares problem, which can be solved employing the Levenberg-Marquardt iterative method [37].
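The following sketch illustrates the NN-ModL idea on a toy one-dimensional controlled ODE: a shallow 6-neuron network is rolled out with forward Euler and fitted by a minimal Levenberg-Marquardt loop with finite-difference Jacobians. The toy dynamics, network shape and optimizer details here are our own illustrative choices, not the implementation of [34]:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, T = 0.05, 5.0
t = np.arange(0, T + dt, dt)

# Synthetic "training trajectory": controlled logistic growth
# x' = (1 - u(t)) * x * (1 - x), a stand-in for one epidemic compartment.
u = np.where(t < 2.5, 0.0, 0.6)
x_data = np.empty_like(t); x_data[0] = 0.05
for k in range(len(t) - 1):
    x_data[k+1] = x_data[k] + dt * (1 - u[k]) * x_data[k] * (1 - x_data[k])

def f_nn(x, uu, p):
    """Shallow network f_nn(x, u): 2 inputs -> 6 tanh neurons -> 1 output."""
    W1 = p[:12].reshape(6, 2); b1 = p[12:18]
    W2 = p[18:24];             b2 = p[24]
    h = np.tanh(W1 @ np.array([x, uu]) + b1)
    return W2 @ h + b2

def residuals(p):
    """Roll out the learned ODE; residuals vs. the data trajectory."""
    x = np.empty_like(t); x[0] = x_data[0]
    for k in range(len(t) - 1):
        x[k+1] = x[k] + dt * f_nn(x[k], u[k], p)
    return x - x_data

def levenberg_marquardt(p, n_iter=20, lam=1e-2):
    for _ in range(n_iter):
        r = residuals(p)
        # Finite-difference Jacobian of the residual vector
        J = np.empty((len(r), len(p)))
        for j in range(len(p)):
            dp = np.zeros_like(p); dp[j] = 1e-6
            J[:, j] = (residuals(p + dp) - r) / 1e-6
        step = np.linalg.solve(J.T @ J + lam * np.eye(len(p)), -J.T @ r)
        if np.sum(residuals(p + step)**2) < np.sum(r**2):
            p = p + step; lam *= 0.5     # accept step, relax damping
        else:
            lam *= 2.0                   # reject step, increase damping
    return p

p0 = 0.1 * rng.standard_normal(25)       # 25 weights and biases
p_opt = levenberg_marquardt(p0.copy())
```

Since steps are accepted only when they decrease the cost, the fitted parameters never end up worse than the initialization; the expensive part, as the wall-clock figures in Table 1 suggest, is the repeated ODE rollout inside each Jacobian evaluation.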
In the sequel, we compare our KOL approaches with NN-ModL in the epidemic context in terms of: (a) the wall-clock time employed for training; (b) the generalization error in the testing phase. The comparison has been drawn by accounting for progressively larger training set sizes for both KOL approaches and NN-ModL, in order to approximate the population dynamics generated synthetically by the $SIR$ , $SIRD$ and $SEIRD$ models. In particular, we trained NN-ModL with different training sizes and different numbers of maximum iterations. We performed a sensitivity analysis on the number of neurons per layer in the NN-ModL case and, following the Occam's razor principle of parsimony, settled on a shallow neural network with 6 neurons.
All the results have been obtained by executing the code in parallel on an 8-core Intel i7 machine. In Table 1 we summarize the outputs of the comparison between NN-ModL and the two KOL approaches, where in each case we considered $dt=0.05,\;t^{*}=5,\;\mathcal{R}_{0}=2$ , $\gamma=0.05$ , $\delta=0.4$ and $\varepsilon=\phi=0.05$ . The initial conditions are those of Section 3.1. The comparison considers 50, 100 or 200 iterations of the optimization scheme for NN-ModL, trained with 25, 50 and 100 elements uniformly sampled in the mixed training dataset. Both KOL approaches have also been trained on larger datasets of 200, 500, 800, 1000, 2000, 5000, and 10000 elements. The prediction error has been evaluated according to (15) over the same test set of 100 samples, constituted by four batches of dimension 25, one for each of the four functional families in the mixed dataset. The wall-clock time of the running processes has been measured in seconds.
For each epidemic model, the wall-clock time taken by the training stage of NN-ModL is orders of magnitude higher than the one needed by the KOL approaches, even when considering solely 50 iterations. The NN-ModL approach ends with a prediction error close to $10^{-2}$ in the three cases, after a total wall-clock time ranging from $10^{2}$ (best scenario) to $10^{3}$ seconds (worst scenario). Instead, in less than 8 seconds the KOL-m approach trained with 500 samples reaches, for all models, a generalization error which is always lower than the corresponding ones reached by NN-ModL. In addition, for all three models, up to training sets of 2000 samples, both KOL approaches still employ less wall-clock time than NN-ModL. In particular, in the case of the $SEIRD$ model, the prediction error is 3 orders of magnitude lower than the one of NN-ModL. The gain in terms of wall-clock time is most evident for the $SEIRD$ model with size 5000 against NN-ModL with size 100.
As expected, further increasing the number of training samples makes the prediction errors decrease steadily, though at a slower rate with respect to the training size. The KOL- $∂$ schemes achieve lower prediction errors than the other approaches even with the baseline size of 100 input functions, except for $SIRD$ . Moreover, in both the KOL-m and KOL- $∂$ approaches the prediction errors are not affected by the increase in dimension of the differential problem, going from the 3-dimensional $SIR$ to the 5-dimensional $SEIRD$ . Remarkably, even the computational times do not undergo significant changes across the different models.
In conclusion, both KOL approaches return more accurate predictions in a lower computational time with respect to NN-ModL, whatever the number of iterations of the latter. Between the two KOL approaches, KOL- $∂$ is preferable in terms of accuracy across the different epidemic models. In general, the KOL approaches possess a desirable property, i.e. the ability to return reliable predictions quickly, which is extremely useful in the epidemic context, for instance when scenario analyses or optimization tasks are required. For the sake of fairness it should be mentioned that, regarding NN-ModL, it is theoretically possible to find a neural-network function able to approximate the target state with any desired accuracy. However, the proof of the existence of such an architecture is not constructive.
| Method | Iterations | Size | $SIR$ Error | $SIR$ Time [s] | $SIRD$ Error | $SIRD$ Time [s] | $SEIRD$ Error | $SEIRD$ Time [s] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NN-ModL | 50 | 25 | $1.5\times 10^{-2}$ | 279 | $1.0\times 10^{-2}$ | 380 | $1.1\times 10^{-2}$ | 383 |
| | 100 | 25 | $1.5\times 10^{-2}$ | 478 | $1.0\times 10^{-2}$ | 615 | $1.1\times 10^{-2}$ | 755 |
| | 200 | 25 | $1.5\times 10^{-2}$ | 1186 | $1.0\times 10^{-2}$ | 1139 | $1.1\times 10^{-2}$ | 1673 |
| | 50 | 50 | $1.2\times 10^{-2}$ | 576 | $1.1\times 10^{-2}$ | 559 | $1.1\times 10^{-2}$ | 875 |
| | 100 | 50 | $1.2\times 10^{-2}$ | 1363 | $1.1\times 10^{-2}$ | 1456 | $1.1\times 10^{-2}$ | 1476 |
| | 200 | 50 | $1.2\times 10^{-2}$ | 3448 | $1.1\times 10^{-2}$ | 2116 | $1.1\times 10^{-2}$ | 3046 |
| | 50 | 100 | $8.8\times 10^{-3}$ | 893 | $9.2\times 10^{-3}$ | 1394 | $9.5\times 10^{-3}$ | 1364 |
| | 100 | 100 | $8.8\times 10^{-3}$ | 1672 | $9.2\times 10^{-3}$ | 2801 | $9.5\times 10^{-3}$ | 3262 |
| | 200 | 100 | $8.8\times 10^{-3}$ | 1725 | $9.2\times 10^{-3}$ | 4511 | $9.4\times 10^{-3}$ | 6668 |
| KOL-m | — | 25 | $8.3\times 10^{-2}$ | 0.425 | $8.2\times 10^{-2}$ | 0.385 | $7.5\times 10^{-2}$ | 0.209 |
| | — | 50 | $5.4\times 10^{-2}$ | 0.521 | $5.5\times 10^{-2}$ | 0.486 | $4.8\times 10^{-2}$ | 0.479 |
| | — | 100 | $1.6\times 10^{-2}$ | 0.725 | $1.5\times 10^{-2}$ | 0.731 | $4.9\times 10^{-3}$ | 0.755 |
| | — | 200 | $1.2\times 10^{-2}$ | 1.47 | $1.1\times 10^{-2}$ | 1.492 | $2.6\times 10^{-3}$ | 1.46 |
| | — | 500 | $5.6\times 10^{-3}$ | 7.60 | $5.4\times 10^{-3}$ | 7.09 | $9.1\times 10^{-5}$ | 6.85 |
| | — | 800 | $5.5\times 10^{-3}$ | 17.8 | $5.3\times 10^{-3}$ | 18.8 | $7.0\times 10^{-5}$ | 20.7 |
| | — | 1000 | $4.7\times 10^{-3}$ | 32.5 | $4.5\times 10^{-3}$ | 25.7 | $6.4\times 10^{-5}$ | 29.4 |
| | — | 2000 | $3.0\times 10^{-3}$ | 116 | $2.9\times 10^{-3}$ | 94.5 | $3.4\times 10^{-5}$ | 95.8 |
| | — | 5000 | $2.3\times 10^{-3}$ | 572 | $2.2\times 10^{-3}$ | 568 | $1.3\times 10^{-5}$ | 559 |
| | — | 10000 | $1.5\times 10^{-3}$ | 2269 | $1.4\times 10^{-3}$ | 2265 | $8.4\times 10^{-6}$ | 2288 |
| KOL- $∂$ | — | 25 | $7.0\times 10^{-3}$ | 0.420 | $1.2\times 10^{-2}$ | 1.27 | $2.6\times 10^{-3}$ | 0.501 |
| | — | 50 | $4.2\times 10^{-3}$ | 0.524 | $8.6\times 10^{-3}$ | 0.872 | $1.2\times 10^{-3}$ | 0.856 |
| | — | 100 | $2.5\times 10^{-3}$ | 0.744 | $4.9\times 10^{-3}$ | 1.17 | $3.6\times 10^{-4}$ | 1.11 |
| | — | 200 | $1.5\times 10^{-3}$ | 1.54 | $3.5\times 10^{-3}$ | 2.14 | $2.3\times 10^{-4}$ | 2.05 |
| | — | 500 | $8.0\times 10^{-4}$ | 6.32 | $1.9\times 10^{-3}$ | 7.85 | $9.7\times 10^{-5}$ | 8.26 |
| | — | 800 | $5.9\times 10^{-4}$ | 19.3 | $2.0\times 10^{-3}$ | 18.9 | $6.2\times 10^{-5}$ | 18.1 |
| | — | 1000 | $5.8\times 10^{-4}$ | 27.3 | $1.8\times 10^{-3}$ | 25.3 | $5.2\times 10^{-5}$ | 26.4 |
| | — | 2000 | $1.2\times 10^{-4}$ | 93.8 | $1.5\times 10^{-3}$ | 90.5 | $3.9\times 10^{-5}$ | 94.7 |
| | — | 5000 | $1.2\times 10^{-4}$ | 877 | $1.1\times 10^{-3}$ | 546 | $1.9\times 10^{-5}$ | 888 |
| | — | 10000 | $1.2\times 10^{-4}$ | 2206 | $8.5\times 10^{-4}$ | 2156 | $1.3\times 10^{-5}$ | 3315 |
Table 1: Prediction error and wall-clock time comparison for the $SIR$ , $SIRD$ and $SEIRD$ models.
3.3 KOL and epidemic control
In this section we exploit KOL-m and KOL- $∂$ for the solution of paradigmatic optimal control problems, which are meaningful for making fast and reliable scenario analyses in the context of epidemic control. More precisely, in Section 3.3.1 we are interested in the minimization of the eradication time given a prescribed epidemic threshold, while in Section 3.3.2 we tackle the minimization of the total amount of infected individuals.
3.3.1 Optimal control for estimating the minimum eradication time
In the sequel, we focus on the minimization of the eradication time given a prescribed epidemic threshold, building upon the theoretical results given in [13]. The availability of rigorous mathematical results on the existence of an optimal solution makes the eradication problem a desirable and mathematically solid benchmark for studying the reliability of our KOL-based approach. We formulate the problem by considering the standard deterministic $SIR$ model.
**Problem 2 (Minimum eradication time)**
*Let $\mathcal{U}=\{0≤ u(t)≤ u_{max}\,∀\,t∈[0,t^{*}]\}$ be the space of admissible controls. Solve
$$
\min_{u\in\mathcal{U}}\mathcal{J}_{t_{e}}(u)=\int_{0}^{t_{e}(u)}1\,dt,
$$
subject to the state problem
$$
\begin{cases}\dot{S}=-\beta(1-u)SI,\\
\dot{I}=\beta(1-u)SI-\gamma I,\\
\dot{R}=\gamma I,\end{cases}
$$
$∀ t\,∈(0,t^{*}]$ given $[S(0),I(0),R(0)]^{T}=[S_{0},I_{0},R_{0}]^{T}$, where $t_{e}(u)$ is the eradication time associated with the control $u$ .*
Given a threshold $\eta>0$ of infected individuals, the eradication time is defined as the first time at which the number of infectious individuals reaches the given threshold value from above, i.e.
$$
t_{e}\in(0,t^{*}]\;\mathrm{s.t.}\;I(t)>\eta\,\forall\,0<t<t_{e},\,I(t_{e})=\eta. \tag{18}
$$
It has been proven in [13] that the space of admissible optimal controls can be restricted to
$$
\mathcal{A}=\{u:[0,t^{*}]\rightarrow\{0,u_{max}\},\;u(t)=u_{max}\,H(t-\tau)\}, \tag{19}
$$
where $\tau$ is the starting intervention time depending on the maximum level of intervention $u_{max}$ , and $H(·)$ is the Heaviside step function. Defining the maximum control reproduction number as
$$
\mathcal{R}_{u_{max}}=\frac{\beta(1-u_{max})}{\gamma}, \tag{20}
$$
it is possible to find an optimal solution with non-trivial switching time, i.e. $\tau>0$ , when considering $\mathcal{R}_{u_{max}}<1$ .
First, we numerically reproduced the results of [13] by considering $\mathcal{R}_{0}=2$, $\gamma=5$, $dt=1$, $t^{*}=100$, and $u_{max}\,∈\,[0,0.7]$, and solving the above minimization problem constrained by the $SIR$ differential model. In order to match the initial conditions of [13], we consider $S_{0}=2000/2001$ and $I_{0}=1/2001$. We retrieved the same behaviours reported in that paper for the switching time $\tau$, the eradication time $t_{e}$ and the susceptibles evaluated at the eradication time (see Figure 7). We solve the optimal control problem by direct evaluation and comparison of the cost functional values associated with the different control strategies in $\mathcal{A}$, which are explored exhaustively by sampling $\tau$ on a fine grid of step $\Delta t=0.01$. This brute-force strategy is not always feasible, but in the present context it is justified by the availability of the above-mentioned analytical optimal solutions.
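The brute-force search described above amounts to integrating the controlled $SIR$ system once per candidate switching time and keeping the best. A minimal sketch in Python, assuming illustrative parameter values (rates, threshold $\eta$, and a coarser $\tau$ grid than the one used in the paper):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Minimal sketch of the brute-force search over the bang-bang controls (19):
# u(t) = u_max * H(t - tau). Parameter values are illustrative assumptions,
# chosen so that R_{u_max} = beta * (1 - u_max) / gamma < 1.
beta, gamma = 10.0, 5.0              # R_0 = beta / gamma = 2
u_max, eta, t_star = 0.6, 1e-4, 100.0
S0, I0 = 2000.0 / 2001.0, 1.0 / 2001.0

def sir_rhs(t, y, tau):
    S, I = y
    u = u_max if t >= tau else 0.0   # Heaviside switch at the intervention time
    new_inf = beta * (1.0 - u) * S * I
    return [-new_inf, new_inf - gamma * I]

def eradication_time(tau):
    # First time at which I(t) falls back to the threshold eta, cf. (18).
    threshold = lambda t, y, tau: y[1] - eta
    threshold.terminal, threshold.direction = True, -1.0
    sol = solve_ivp(sir_rhs, (0.0, t_star), [S0, I0], args=(tau,),
                    events=threshold, max_step=0.1)
    return sol.t_events[0][0] if sol.t_events[0].size else t_star

# Exhaustive comparison of the cost t_e(u) over a grid of switching times.
taus = np.arange(0.0, 10.0, 0.25)
tau_opt = min(taus, key=eradication_time)
```

Sampling $\tau$ on a finer grid, as done in the text, only changes the `taus` array; the cost of the search grows linearly with the number of candidates.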
Then, building upon the results of Section 3, which show that our KOL approaches can be used as proxy models for a variety of epidemic differential models, we can rewrite Problem 2 by replacing the deterministic $SIR$ state problem with the surrogate KOL-m or KOL-$∂$ approach. Also in this case, the optimal solution is obtained by subsequent evaluations of control functions over the same fine grid of $\tau$. In this way the optimal solutions obtained with our KOL approaches can be easily compared with the benchmark scenarios, thus providing a direct measure of the reliability of the KOL methods.
The KOL approaches have been trained on a specific dataset consisting of 10000 step functions of different heights, sampled uniformly in $[0,0.8]$ (see Figure 3(b)).
The results are collected in Figure 7. More precisely, in Figure 7(a) we report the switching times, the eradication times and the susceptibles at the eradication time for the optimal interventions obtained with KOL-m as surrogate state problem (point markers), together with the benchmark results (solid lines), as functions of the maximum value of the intervention ($u_{max}$). For the sake of comparison, in the same figure we report the eradication times and the susceptibles at the eradication time for the trivial strategy implementing a constant intervention fixed at $u_{max}$ (dashed lines). Figure 7(b) reports the same benchmark quantities of [13], retrieved using KOL-$∂$. The matching of the results shows the ability of our KOL approaches to reconstruct the optimal solution without directly relying on the differential model.
(a) KOL-m.
(b) KOL- $∂$ .
Figure 7: Solutions of the optimal control problem searching for minimum eradication time.
3.3.2 Optimal control for minimizing the total amount of infected
Here, we consider a second control problem in which the total amount of infected individuals is minimized, as detailed in the following.
**Optimal Control Problem (constrained by ODE)**
*$$
\min_{u\,\in\,\mathcal{U}_{ad}}\mathcal{J}_{I,u}:=C_{I}\int_{0}^{t^{*}}I(t)^{2}\,dt+C_{u}\int_{0}^{t^{*}}u(t)^{2}\,dt,
$$
subject to the state problem
$$
\begin{cases}\dot{S}=-\beta(1-u)SI,\\
\dot{I}=\beta(1-u)SI-\gamma I,\\
\dot{R}=\gamma I,\end{cases}
$$
$∀ t\,∈(0,{t^{*}}]$ given $[S(0),I(0),R(0)]^{T}=[S_{0},I_{0},R_{0}]^{T}$. The constants $C_{I}$ and $C_{u}$ are positive and balance the regularization term against the $L^{2}$ term on the infectious compartment.*
This is an optimal control problem governed by the $SIR$ model, where the control $u$ acts on the transmission rate and incorporates any reducing effects. The optimal configuration clearly depends on the choice of the weights $C_{I}$ and $C_{u}$. The cost functional features the classical Tikhonov regularization term, which serves the double purpose of increasing the convexity of the cost functional and of accounting for the (economic and social) burden that higher-level NPIs entail. Concerning the set of admissible controls, we choose the space
$$
\mathcal{U}_{ad}=\left\{u(t)=\sum_{i=1}^{N}u_{i}\mathbbm{1}_{[t_{i-1},t_{i})}(t),\ \mathrm{with}\ \{t_{i}\}_{i=0}^{N}\in(0,t^{*}]\right\}, \tag{21}
$$
containing piecewise constant functions over subintervals of equal length, where the number of subintervals is chosen a priori. Moreover, choosing the admissible set of controls as in (21) allows us to immediately recast the optimal control problem as a discrete optimization problem whose unknowns are the values of the piecewise constant control function in each time slab.
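The identification of a control in $\mathcal{U}_{ad}$ with its vector of slab values can be sketched as follows; the grid, the weights and the helper names are illustrative assumptions, not the paper's code:

```python
import numpy as np

# Sketch of the discretization induced by (21): a control in U_ad over equal
# time slabs is identified with the vector (u_1, ..., u_N) of its slab values.
# Grid, weights and names are illustrative assumptions.
t_star, N, dt = 5.0, 5, 0.05
t_grid = np.arange(0.0, t_star + dt / 2.0, dt)

def piecewise_control(u_vals, t):
    # u(t) = sum_i u_i * 1_{[t_{i-1}, t_i)}(t), with slabs of length t_star / N
    i = min(int(t / (t_star / len(u_vals))), len(u_vals) - 1)
    return u_vals[i]

def regularization(u_vals, C_u=0.1):
    # C_u * int_0^{t*} u(t)^2 dt via a left Riemann sum, which is exact for
    # piecewise constants whose jump points lie on the grid
    u_t = np.array([piecewise_control(u_vals, t) for t in t_grid])
    return C_u * dt * np.sum(u_t[:-1] ** 2)
```

For a constant control $u\equiv c$ the regularization term evaluates to $C_{u}c^{2}t^{*}$, which provides a quick sanity check for the quadrature; the same identification turns the full problem into an optimization over $N$ scalars.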
In our numerical exploration, we consider the scenario where $\mathcal{R}_{0}=4$ , $t^{*}=5$ , $dt=0.05$ , initial conditions $[S_{0},\,I_{0},\,R_{0}]^{T}=[0.99,\,0.01,\,0]^{T}$ , $u:[0,t^{*}]→[0,0.7]$ and the number of subintervals $N=\{5,10,20\}$ . Clearly, larger values of $N$ increase the complexity of the optimization problem, but at the same time they carry more flexibility in designing the optimal control strategy. With the above choice of the parameters, we compare the solutions of the optimal control problem constrained by $SIR$ with the solutions of the following two problems, where the ODE constraint has been replaced by the surrogate operator obtained by KOL-m and KOL- $∂$ , respectively.
**Optimal Control Problem (governed by KOL-m)**
*$$
\min_{u\,\in\,\mathcal{U}_{ad}}\mathcal{J}_{I,u}=C_{I}\int_{0}^{t^{*}}I(t)^{2}\,dt+C_{u}\int_{0}^{t^{*}}u(t)^{2}\,dt,
$$
where the state problem is
$$
\begin{pmatrix}S(t)\\
I(t)\\
R(t)\end{pmatrix}=\begin{pmatrix}\mathcal{\bar{G}}_{m,S}(u)(t)\\
\mathcal{\bar{G}}_{m,I}(u)(t)\\
\mathcal{\bar{G}}_{m,R}(u)(t)\end{pmatrix}
$$
$∀ t\,∈(0,t^{*}]$ given $[S(0),I(0),R(0)]^{T}=[S_{0},I_{0},R_{0}]^{T}$. This problem can be equivalently rewritten as an unconstrained minimization problem:
$$
\min_{u\,\in\,\mathcal{U}_{ad}}\mathcal{J}_{u}:=C_{I}\int_{0}^{t^{*}}\mathcal{\bar{G}}_{m,I}(u)(t)^{2}\,dt+C_{u}\int_{0}^{t^{*}}u(t)^{2}\,dt.
$$*
**Optimal Control Problem (governed by KOL-$∂$)**
*$$
\min_{u\,\in\,\mathcal{U}_{ad}}\mathcal{J}_{I,u}=C_{I}\int_{0}^{t^{*}}I(t)^{2}\,dt+C_{u}\int_{0}^{t^{*}}u(t)^{2}\,dt,
$$
where the state problem is
$$
\begin{pmatrix}\dot{S}(t)\\
\dot{I}(t)\\
\dot{R}(t)\end{pmatrix}=\begin{pmatrix}\mathcal{\bar{G}}_{\partial,S}(u)(t)\\
\mathcal{\bar{G}}_{\partial,I}(u)(t)\\
\mathcal{\bar{G}}_{\partial,R}(u)(t)\end{pmatrix}
$$
$∀ t\,∈(0,t^{*}]$ given $[S(0),I(0),R(0)]^{T}=[S_{0},I_{0},R_{0}]^{T}$. This problem can be equivalently rewritten as an unconstrained minimization problem:
$$
\min_{u\,\in\,\mathcal{U}_{ad}}\mathcal{J}_{u}:=C_{I}\int_{0}^{t^{*}}\left(I_{0}+\int_{0}^{t}\mathcal{\bar{G}}_{\partial,I}(u)(\tau)\,d\tau\right)^{2}dt+C_{u}\int_{0}^{t^{*}}u(t)^{2}\,dt.
$$*
Both KOL approaches have been trained with a dataset of 800 input control functions belonging to $\mathcal{U}_{ad}$, where the values of the piecewise constant control functions have been sampled from a uniform distribution in $[0,0.8]$. We considered different values of $C_{I}>0$ and $C_{u}>0$ and solved the optimization problems using the Sequential Least Squares Quadratic Programming (SLSQP) method [37] as implemented in the Python library SciPy [38]. This optimization scheme approximates the original problem with a sequence of quadratic subproblems, whose objective is a second-order approximation of the Lagrangian of the original problem and whose constraints are linearized.
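As an end-to-end illustration of this pipeline, the following sketch solves the $N=5$ discretized problem with SciPy's SLSQP, using a direct $SIR$ integration in place of the KOL surrogates; rates, weights and the initial guess are illustrative assumptions consistent with $\mathcal{R}_{0}=4$:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

# Sketch of the discrete optimization: the N slab values of the control are
# the unknowns and SLSQP enforces the box constraints. The state is obtained
# here by integrating the SIR model directly; the paper's approach replaces
# this step with the KOL-m or KOL-∂ surrogate. Rates and weights are
# illustrative assumptions.
beta, gamma = 4.0, 1.0                     # R_0 = beta / gamma = 4
t_star, N, C_I, C_u = 5.0, 5, 1.0, 0.1
y0 = [0.99, 0.01, 0.0]

def cost(u_vals):
    def rhs(t, y):
        S, I, R = y
        u = u_vals[min(int(t / (t_star / N)), N - 1)]   # piecewise constant u
        new_inf = beta * (1.0 - u) * S * I
        return [-new_inf, new_inf - gamma * I, gamma * I]
    t_eval = np.linspace(0.0, t_star, 101)
    I = solve_ivp(rhs, (0.0, t_star), y0, t_eval=t_eval, max_step=0.05).y[1]
    dt = t_eval[1] - t_eval[0]
    # discretized cost functional: C_I * int I^2 dt + C_u * int u^2 dt
    return (C_I * dt * np.sum(I ** 2)
            + C_u * (t_star / N) * np.sum(np.asarray(u_vals) ** 2))

u0 = np.full(N, 0.35)
res = minimize(cost, u0, method="SLSQP", bounds=[(0.0, 0.7)] * N,
               options={"maxiter": 40})
```

Replacing the `solve_ivp` call inside `cost` with a surrogate evaluation yields the KOL-constrained formulation used in the comparisons, at a much lower per-evaluation cost.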
In the case of $N=5$ phases, Figure 8 shows 13 different scenarios of the optimal trajectories reconstructed by solving the problem under the $SIR$ (solid line), KOL-m (purple dashed line) and KOL-$∂$ (red dashed line) constraints, respectively. At first glance, we see that the optimal controls obtained with the three approaches do not exactly coincide, especially when $C_{u}\ll C_{I}$, i.e. when the problem is strongly non-convex. However, comparing the cost functional values associated with the three optimal controls sheds a completely different light on the reliability of our approach. Indeed, solving the deterministic $SIR$ problem for each optimal control strategy and plotting the corresponding cost functional values in Figure 9, we can appreciate that the cost values practically coincide, thus showing the efficacy of our KOL-based approach in solving the optimal control problem. It is worth remarking that in some cases, typically the less convex ones (i.e. $C_{u}$ much smaller than $C_{I}$), the KOL approaches succeed in finding cost values lower than the ones obtained using the $SIR$ constraint.
Finally, we compare the three approaches for increasing complexity of the optimization problem, namely for $N=10,20$. The results are collected in Figures 10-11, where the optimal control functions are reported, and in Figures 12-13, where the corresponding cost values are compared. Also in these cases, we observe that the cost values achieved by the two KOL approaches are comparable with, if not lower than, those obtained employing the $SIR$ constraint. However, with a growing number of phases the three approaches produce different optimal trajectories, due to the increased non-convexity of the cost functional and, in cascade, of the optimal control problem itself, which exhibits many local minima.
This latest set of results further proves the reliability and the effectiveness of KOL approaches for the most common optimization tasks.
Acknowledgements
The authors thank Dr. F. Regazzoni for the support provided with the NN-ModL algorithm and for the insightful discussions on the topic. MV and GZ are members of INdAM-GNCS. The present research is part of the activities of "Dipartimento di Eccellenza 2023-2027", MUR, Italy.
Data Availability
The computational framework is available on GitHub together with the numerical results of this work: https://github.com/giovanniziarelli/KOL.
(a) $C_{I}=10^{-1}$, $C_{u}=10^{-1}$.
(b) $C_{I}=1$, $C_{u}=10^{-1}$.
(c) $C_{I}=10$, $C_{u}=10^{-1}$.
(d) $C_{I}=10^{2}$, $C_{u}=10^{-1}$.
(e) $C_{I}=10^{3}$, $C_{u}=10^{-1}$.
(f) $C_{I}=10^{4}$, $C_{u}=10^{-1}$.
(g) $C_{I}=1$, $C_{u}=0$.
<details>
<summary>2404.11130v2/x19.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Strategy Performance Graph
## Axis Labels and Titles
- **X-axis**:
- Label: `Time (t)`
- Ticks: `0`, `1`, `2`, `3`, `4`, `5`
- **Y-axis**:
- Label: `Control value (u_opt)`
- Ticks: `0.0`, `0.1`, `0.2`, `0.3`, `0.4`, `0.5`, `0.6`, `0.7`
- **Graph Title**:
- `C_I = 1, C_u = 1`
## Legend
- **Color-Coded Strategies**:
- `ODE`: Blue line
- `KOL-δ`: Red line
- `KOL-m`: Purple line
## Key Trends and Data Points
1. **Control Value Behavior**:
- All strategies maintain control values near `0.0` throughout the observed time range (`t = 0` to `t = 5`).
- **ODE** (blue) consistently exhibits the lowest control value, slightly below `0.01`.
- **KOL-δ** (red) and **KOL-m** (purple) overlap almost perfectly, with values marginally above `0.0` (approximately `0.01–0.02`).
2. **Stability**:
- No significant fluctuations or deviations observed for any strategy over time.
- All lines remain flat, indicating steady-state performance.
## Diagram Components
- **Lines**:
- Three distinct lines represent the three control strategies.
- Lines are plotted against a grid with dashed reference lines for alignment.
- **Grid**:
- Horizontal and vertical dashed lines at axis tick intervals.
## Notes
- The graph emphasizes comparative performance of control strategies under identical conditions (`C_I = 1`, `C_u = 1`).
- Minimal variation in control values suggests robustness across strategies, with ODE applying marginally less control effort than the others.
</details>
(h)
<details>
<summary>2404.11130v2/x20.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Analysis Chart
## Chart Title
- **Title**: `C_I = 1, C_u = 1e-2`
## Axes
- **X-axis (Horizontal)**:
- Label: `Time (t)`
- Scale: Linear, 0 to 5
- Tick Marks: 0, 1, 2, 3, 4, 5
- **Y-axis (Vertical)**:
- Label: `Control value (u_opt)`
- Scale: Linear, 0.0 to 0.7
- Tick Marks: 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7
## Legend
- **Color-Coded Strategies**:
- **Blue**: ODE
- **Red**: KOL-δ
- **Purple**: KOL-m
## Data Trends
1. **ODE (Blue Line)**:
- Initial value: ~0.08 (t=0)
- Step 1: Jumps to ~0.35 at t=1
- Step 2: Drops to ~0.3 at t=2
- Step 3: Drops to ~0.08 at t=3
- Final value: ~0.08 (t=4–5)
2. **KOL-δ (Red Line)**:
- Initial value: ~0.1 (t=0)
- Step 1: Jumps to ~0.35 at t=1
- Step 2: Drops to ~0.3 at t=2
- Step 3: Drops to ~0.1 at t=3
- Final value: ~0.1 (t=4–5)
3. **KOL-m (Purple Line)**:
- Initial value: ~0.0 (t=0)
- Step 1: Jumps to ~0.35 at t=1
- Step 2: Drops to ~0.28 at t=2
- Step 3: Drops to ~0.08 at t=3
- Final value: ~0.08 (t=4–5)
## Key Observations
- All strategies exhibit stepwise control value changes at discrete time intervals (t=1, 2, 3).
- ODE and KOL-δ maintain higher control values than KOL-m after t=3.
- KOL-m shows the most significant drop at t=3 (~0.28 → ~0.08).
- ODE and KOL-δ share identical control values at t=1 and t=2 (~0.35 and ~0.3, respectively).
## Embedded Text
- No additional text blocks or annotations present in the chart.
</details>
(i)
<details>
<summary>2404.11130v2/x21.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Over Time
## Chart Title and Parameters
- **Title**: Control Value Over Time
- **System Parameters**:
- \( C_I = 1 \)
- \( C_u = 1 \times 10^{-3} \)
## Axes
- **X-axis (Time, \( t \))**:
- Range: \( 0 \leq t \leq 5 \)
- Markers: Integer ticks at \( t = 0, 1, 2, 3, 4, 5 \)
- **Y-axis (Control Value, \( u_{opt} \))**:
- Range: \( 0.0 \leq u_{opt} \leq 0.7 \)
- Markers: Incremental ticks at \( 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7 \)
- Grid: Dotted lines for reference
## Legend
- **Control Strategies**:
- **ODE**: Solid blue line
- **KOL-δ**: Dashed red line
- **KOL-m**: Dash-dot purple line
## Key Trends and Data Points
### ODE (Solid Blue)
- **Behavior**:
- \( t = 0 \): \( u_{opt} = 0.3 \)
- \( t = 1 \): \( u_{opt} = 0.5 \) (step increase)
- \( t = 2 \): \( u_{opt} = 0.4 \) (step decrease)
- \( t = 3 \): \( u_{opt} = 0.55 \) (step increase)
- \( t = 4 \): \( u_{opt} = 0.3 \) (step decrease)
- \( t = 5 \): \( u_{opt} = 0.3 \) (constant)
### KOL-δ (Dashed Red)
- **Behavior**:
- \( t = 0 \): \( u_{opt} = 0.6 \)
- \( t = 1 \): \( u_{opt} = 0.3 \) (step decrease)
- \( t = 2 \): \( u_{opt} = 0.4 \) (step increase)
- \( t = 3 \): \( u_{opt} = 0.4 \) (constant)
- \( t = 4 \): \( u_{opt} = 0.2 \) (step decrease)
- \( t = 5 \): \( u_{opt} = 0.2 \) (constant)
### KOL-m (Dash-Dot Purple)
- **Behavior**:
- \( t = 0 \): \( u_{opt} = 0.55 \)
- \( t = 1 \): \( u_{opt} = 0.25 \) (step decrease)
- \( t = 2 \): \( u_{opt} = 0.45 \) (step increase)
- \( t = 3 \): \( u_{opt} = 0.45 \) (constant)
- \( t = 4 \): \( u_{opt} = 0.2 \) (step decrease)
- \( t = 5 \): \( u_{opt} = 0.2 \) (constant)
## Observations
1. **ODE** exhibits the most frequent control value adjustments, alternating step increases and decreases throughout.
2. **KOL-δ** drops sharply at \( t = 1 \), recovers to \( 0.4 \), and settles at \( 0.2 \) from \( t = 4 \) onward.
3. **KOL-m** follows a similar pattern, with a comparable initial drop before stabilizing at \( 0.2 \).
4. **KOL-δ** and **KOL-m** converge to \( u_{opt} = 0.2 \) by \( t = 5 \), while **ODE** ends higher, at \( 0.3 \).
## Structural Notes
- The chart uses a Cartesian coordinate system with a white background and light gray grid lines.
- Line styles (solid, dashed, dash-dot) are used to differentiate control strategies.
- KOL-δ and KOL-m overlap at \( t = 4 \) and \( t = 5 \) (both \( 0.2 \)); elsewhere the trajectories remain distinct.
</details>
(j)
<details>
<summary>2404.11130v2/x22.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Analysis
## Chart Title
- **Title**: Control Value Analysis
- **Parameters**:
- \( C_I = 1 \)
- \( C_u = 1 \times 10^{-4} \)
## Axes
- **X-axis (Time)**:
- Label: `Time (t)`
- Range: \( 0 \leq t \leq 5 \)
- Ticks: \( 0, 1, 2, 3, 4, 5 \)
- **Y-axis (Control Value)**:
- Label: `Control value (u_opt)`
- Range: \( 0.0 \leq u_{opt} \leq 0.7 \)
- Ticks: \( 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7 \)
## Legend
- **Control Strategies**:
- **ODE**: Solid blue line
- **KOL-δ**: Dashed red line
- **KOL-m**: Dash-dot purple line
## Key Trends and Data Points
1. **ODE (Blue Line)**:
- **Initial Value**: \( u_{opt} = 0.6 \) at \( t = 0 \).
- **Drops**: \( u_{opt} = 0.5 \) at \( t = 1 \).
- **Drops Further**: \( u_{opt} = 0.4 \) at \( t = 2 \).
- **Peaks**: \( u_{opt} = 0.6 \) at \( t = 3 \).
- **Sharp Drop**: \( u_{opt} = 0.15 \) at \( t = 4 \).
- **Stabilizes**: \( u_{opt} = 0.15 \) at \( t = 5 \).
2. **KOL-δ (Red Dashed Line)**:
- **Initial Value**: \( u_{opt} = 0.6 \) at \( t = 0 \).
- **Drops**: \( u_{opt} = 0.5 \) at \( t = 1 \).
- **Drops Further**: \( u_{opt} = 0.4 \) at \( t = 2 \).
- **Rises**: \( u_{opt} = 0.5 \) at \( t = 3 \).
- **Drops**: \( u_{opt} = 0.15 \) at \( t = 4 \).
- **Stabilizes**: \( u_{opt} = 0.15 \) at \( t = 5 \).
3. **KOL-m (Purple Dash-Dot Line)**:
- **Initial Value**: \( u_{opt} = 0.5 \) at \( t = 0 \).
- **Drops**: \( u_{opt} = 0.4 \) at \( t = 1 \).
- **Drops Further**: \( u_{opt} = 0.35 \) at \( t = 2 \).
- **Rises**: \( u_{opt} = 0.5 \) at \( t = 3 \).
- **Drops**: \( u_{opt} = 0.15 \) at \( t = 4 \).
- **Stabilizes**: \( u_{opt} = 0.15 \) at \( t = 5 \).
## Observations
- All strategies exhibit a **sharp drop** in control value at \( t = 4 \), converging to \( u_{opt} = 0.15 \).
- **ODE** and **KOL-δ** start at the highest initial value (\( 0.6 \)), while **KOL-m** begins at \( 0.5 \).
- All three strategies show an intermediate recovery at \( t = 3 \), with **ODE** reaching the highest value (\( 0.6 \)).
</details>
(k)
<details>
<summary>2404.11130v2/x23.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Over Time
## Chart Title
- **Title**: Control Value Over Time
- **Parameters**:
- \( C_I = 1 \)
- \( C_u = 1e-5 \)
## Axes
- **X-axis (Time, \( t \))**:
- Range: \( 0 \leq t \leq 5 \)
- Increment: 1 unit
- **Y-axis (Control Value, \( u_{opt} \))**:
- Range: \( 0.0 \leq u_{opt} \leq 0.7 \)
- Increment: 0.1 units
## Legend
| Color | Label |
|--------|-----------|
| Blue | ODE |
| Red | KOL-δ |
| Purple | KOL-m |
## Key Trends
1. **ODE (Blue Line)**:
- Initial value: \( 0.4 \) at \( t = 0 \)
- Step changes:
- \( t = 1 \): \( 0.5 \)
- \( t = 2 \): \( 0.7 \)
- \( t = 3 \): \( 0.4 \)
- \( t = 4 \): \( 0.1 \)
- Final value: \( 0.1 \) (held constant at \( t = 5 \))
2. **KOL-δ (Red Line)**:
- Initial value: \( 0.35 \) at \( t = 0 \)
- Step changes:
- \( t = 1 \): \( 0.5 \)
- \( t = 2 \): \( 0.6 \)
- \( t = 3 \): \( 0.5 \)
- \( t = 4 \): \( 0.2 \)
- Final value: \( 0.2 \) (held constant at \( t = 5 \))
3. **KOL-m (Purple Line)**:
- Initial value: \( 0.4 \) at \( t = 0 \)
- Step changes:
- \( t = 1 \): \( 0.6 \)
- \( t = 2 \): \( 0.5 \)
- \( t = 3 \): \( 0.6 \)
- \( t = 4 \): \( 0.2 \)
- Final value: \( 0.2 \) (held constant at \( t = 5 \))
## Observations
- All strategies exhibit stepwise control value adjustments at discrete time intervals.
- **ODE** climbs from \( 0.4 \) at \( t = 0 \) to the overall peak of \( 0.7 \) at \( t = 2 \) but ends with the lowest control value (\( 0.1 \)).
- **KOL-δ** and **KOL-m** converge to the same final control value (\( 0.2 \)) but differ in intermediate steps.
- **KOL-m** demonstrates the highest control value (\( 0.6 \)) at \( t = 1 \) and \( t = 3 \).
## Data Points Summary
| Time (\( t \)) | ODE (\( u_{opt} \)) | KOL-δ (\( u_{opt} \)) | KOL-m (\( u_{opt} \)) |
|----------------|---------------------|-----------------------|-----------------------|
| 0 | 0.4 | 0.35 | 0.4 |
| 1 | 0.5 | 0.5 | 0.6 |
| 2 | 0.7 | 0.6 | 0.5 |
| 3 | 0.4 | 0.5 | 0.6 |
| 4 | 0.1 | 0.2 | 0.2 |
| 5 | 0.1 | 0.2 | 0.2 |
## Notes
- The chart uses dashed lines for all strategies, with no intermediate smoothing.
- Control values are held constant between time steps (e.g., \( t = 0 \) to \( t = 1 \)).
- Strategies do overlap at several steps (e.g., ODE and KOL-δ at \( t = 1 \); KOL-δ and KOL-m at \( t = 4 \) and \( t = 5 \)).
</details>
(l)
<details>
<summary>2404.11130v2/x24.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Analysis
## Chart Title
**C_I = 1, C_u = 1e-6**
## Axis Labels
- **X-axis**: Time (t) [0, 1, 2, 3, 4, 5]
- **Y-axis**: Control value (u_opt) [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
## Legend
- **ODE**: Solid blue line
- **KOL-δ**: Dashed red line
- **KOL-m**: Dotted purple line
## Key Trends and Data Points
### ODE (Blue)
- **t = 0–1**: Control value = 0.4
- **t = 1–2**: Step increase to 0.5
- **t = 2–3**: Step increase to 0.6
- **t = 3–4**: Step decrease to 0.4
- **t = 4–5**: Step decrease to 0.1
### KOL-δ (Red)
- **t = 0–1**: Control value = 0.35
- **t = 1–2**: Step increase to 0.5
- **t = 2–3**: Maintains 0.5
- **t = 3–5**: Step decrease to 0.35 (held constant)
### KOL-m (Purple)
- **t = 0–1**: Control value = 0.4
- **t = 1–2**: Step increase to 0.6
- **t = 2–3**: Step decrease to 0.5
- **t = 3–4**: Step increase to 0.6
- **t = 4–5**: Step decrease to 0.2 (held constant)
## Parameter Context
- **C_I = 1**: Weight of the infection term in the cost functional
- **C_u = 1e-6**: Weight of the control-effort term in the cost functional
## Observations
1. All strategies exhibit stepwise control adjustments at discrete time intervals.
2. ODE shows the most aggressive changes, ending with a drop from 0.4 to 0.1 in the final interval.
3. KOL-δ is the most stable after its initial adjustment, holding 0.5 from t = 1 to t = 3 and then 0.35 through t = 5.
4. KOL-m demonstrates oscillatory behavior (0.6 → 0.5 → 0.6) before settling at 0.2.
</details>
(m)
Figure 8: Optimal controls for the three optimal control problems fixing $N=5$ .
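The step plots above encode each strategy as a piecewise-constant control over $N$ equal sub-intervals of the horizon. As a minimal sketch (not the paper's implementation), the parameterization can be written as follows; the plateau values and $T=5$ are illustrative placeholders read off the axes:

```python
import numpy as np

def step_control(u_vals, T=5.0):
    """Piecewise-constant control on N equal sub-intervals of [0, T].

    Returns u(t) as a callable; u_vals[k] is the value held on the
    k-th interval, matching the step plots (N = len(u_vals)).
    """
    u_vals = np.asarray(u_vals, dtype=float)
    N = len(u_vals)

    def u(t):
        # Clamp the index so that t = T falls in the last interval.
        k = min(int(np.floor(t / T * N)), N - 1)
        return float(u_vals[k])

    return u

# Example: five hypothetical plateau values for an N = 5 strategy.
u = step_control([0.3, 0.5, 0.4, 0.55, 0.3], T=5.0)
```

Evaluating `u(t)` at any time returns the plateau active on that sub-interval, which is exactly the shape traced by the dashed step lines in the panels.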
<details>
<summary>2404.11130v2/x25.png Details</summary>

### Visual Description
# Technical Document Extraction: Line Chart Analysis
## Chart Overview
The image is a **logarithmic line chart** comparing three computational methods (ODE, KOL-δ, KOL-m) across varying parameters `C_I` and `C_u`. The y-axis represents the integral `C_I ∫₀ᵀ I²(t)dt + C_u ∫₀ᵀ u²(t)dt`, and the x-axis categorizes data points by combinations of `C_I` and `C_u` values.
---
## Axis Labels and Markers
- **Y-Axis**:
`C_I ∫₀ᵀ I²(t)dt + C_u ∫₀ᵀ u²(t)dt`
Scale: Logarithmic (10⁻⁶ to 10⁻²).
- **X-Axis Categories**:
- `C_I = 1e-1, C_u = 1e-1`
- `C_I = 1, C_u = 1e-1`
- `C_I = 1e1, C_u = 1e-1`
- `C_I = 1e2, C_u = 1e-1`
- `C_I = 1e3, C_u = 1e-1`
- `C_I = 1e4, C_u = 1e-1`
- `C_I = 1, C_u = 0`
---
## Legend and Data Series
| Symbol | Label | Color | Marker Type |
|--------|---------|--------|-------------|
| ● | ODE | Blue | Circle |
| △ | KOL-δ | Red | Triangle |
| ■ | KOL-m | Purple | Square |
---
## Key Trends and Data Points
1. **General Behavior**:
- All three methods (ODE, KOL-δ, KOL-m) exhibit **nearly identical values** across most x-axis categories (`C_I = 1e-1` to `C_I = 1e4`).
- Values cluster tightly between **10⁻³ and 10⁻²** for these ranges.
2. **Exception at `C_I = 1, C_u = 0`**:
- **ODE** and **KOL-δ** show a **sharp drop** to ~10⁻⁶.
- **KOL-m** drops even further, to ~10⁻⁷, below the other two methods.
- With `C_u = 0` the control-effort penalty vanishes, so strong interventions become cost-free and the functional collapses.
3. **Consistency**:
- No overlap or divergence between methods in most categories, indicating robustness across tested parameters.
---
## Critical Observations
- The chart highlights a **unique behavior** at `C_I = 1, C_u = 0`, where ODE and KOL-δ diverge sharply from other data points.
- KOL-m consistently attains lower cost values than ODE and KOL-δ in all categories except `C_I = 1e-1, C_u = 1e-1`, where it aligns closely with ODE.
---
## Notes for Interpretation
- The logarithmic y-axis emphasizes **orders of magnitude** differences, particularly at `C_I = 1, C_u = 0`.
- The x-axis labels combine `C_I` and `C_u` values, requiring careful cross-referencing with legend markers.
- No data table is present; all information is derived from the plotted points and axis labels.
</details>
(a)
<details>
<summary>2404.11130v2/x26.png Details</summary>

### Visual Description
# Technical Document Extraction: Scatter Plot Analysis
## Chart Description
This image is a **scatter plot** comparing three computational methods (ODE, KOL-δ, KOL-m) across varying parameters `C_I` and `C_u`. The y-axis reports the value of the cost functional, while the x-axis categorizes data points by `C_I` and `C_u` values.
---
### **Axis Labels & Markers**
- **Y-Axis**:
`C_I ∫₀ᵀ I²(t)dt + C_u ∫₀ᵀ u²(t)dt`
(Logarithmic scale: 10⁻⁶ to 10⁻³)
- **X-Axis**:
Discrete categories for `C_I` and `C_u` values:
- `C_I = 1` (repeated for all data points)
- `C_u = 1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6`
---
### **Legend**
| Symbol | Method |
|--------|----------|
| 🔵 Circle | ODE |
| 🔴 Triangle | KOL-δ |
| 🟣 Square | KOL-m |
---
### **Data Points & Trends**
1. **C_u = 1**
- All methods cluster near **10⁻³** on the y-axis.
- ODE (🔵) and KOL-δ (🔴) overlap closely; KOL-m (🟣) is slightly lower.
2. **C_u = 1e-1 to 1e-3**
- Cost values decrease as `C_u` decreases.
- ODE and KOL-δ remain tightly grouped; KOL-m lies ~1–2 orders of magnitude lower.
3. **C_u = 1e-4 to 1e-6**
- ODE and KOL-δ converge toward **10⁻⁶**, while KOL-m remains significantly lower (~10⁻⁷ to 10⁻⁶).
- At `C_u = 1e-6`, KOL-m is **10× smaller** than ODE/KOL-δ.
---
### **Key Observations**
- **ODE vs. KOL-δ**: Nearly identical cost values across all `C_u` values.
- **KOL-m**: Consistently attains cost values 1–2 orders of magnitude below ODE/KOL-δ.
- **Parameter Sensitivity**: Cost values decrease as `C_u` decreases, since the control-effort term is down-weighted; KOL-m exhibits the greatest sensitivity to this change.
---
### **Cross-Referenced Legend & Data**
- **C_u = 1e-6**:
- ODE (🔵): ~10⁻⁶
- KOL-δ (🔴): ~10⁻⁶
- KOL-m (🟣): ~10⁻⁷
- **C_u = 1e-1**:
- ODE/KOL-δ: ~10⁻³
- KOL-m: ~10⁻⁴
---
### **Conclusion**
The plot demonstrates that ODE and KOL-δ yield comparable cost values, while KOL-m attains markedly lower costs, particularly at small `C_u`. The logarithmic y-axis highlights the order-of-magnitude differences in the cost functional between methods.
</details>
(b)
Figure 9: Cost functionals at the optimal control for different $C_{I}$ and $C_{u}$ ( $N=5$ ). (a) We consider different orders of magnitude for $C_{I}$ , keeping $C_{u}=1e-1$ . (b) We consider different orders of magnitude for $C_{u}$ , keeping $C_{I}=1$ .
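The quantity on Figure 9's y-axis is the cost functional $J(u) = C_I \int_0^T I^2(t)\,dt + C_u \int_0^T u^2(t)\,dt$. A toy sketch of its evaluation is given below, using a simple forward-Euler SIR model in which the control scales transmission down to $\beta(1-u)$; all model parameters (`beta`, `gamma`, `I0`) are illustrative placeholders, not the paper's calibrated values:

```python
def cost_functional(u_vals, C_I=1.0, C_u=1.0, beta=0.4, gamma=0.1,
                    T=5.0, I0=0.01, n_steps=1000):
    """J(u) = C_I * int_0^T I^2 dt + C_u * int_0^T u^2 dt on a toy
    controlled SIR model (forward Euler). u_vals holds the piecewise-
    constant control on N equal sub-intervals of [0, T]."""
    N = len(u_vals)
    dt = T / n_steps
    S, I = 1.0 - I0, I0
    J_I = J_u = 0.0
    for k in range(n_steps):
        t = k * dt
        u = u_vals[min(int(t / T * N), N - 1)]   # active control piece
        J_I += I ** 2 * dt                       # infection cost term
        J_u += u ** 2 * dt                       # control-effort term
        new_inf = beta * (1.0 - u) * S * I * dt  # controlled transmission
        S, I = S - new_inf, I + new_inf - gamma * I * dt
    return C_I * J_I + C_u * J_u
```

Sweeping `C_u` over orders of magnitude with this toy model reproduces the qualitative trend in panel (b): as `C_u` shrinks, the control-effort term contributes less and the total cost decreases.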
<details>
<summary>2404.11130v2/x27.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Analysis Over Time
## Title
**Control Value (u_opt) Analysis**
*Parameters: C_I = 1e-1, C_u = 1e-1*
---
### Axes Labels
- **X-axis**: Time (t)
- Range: 0 to 5
- Increment: 1
- **Y-axis**: Control value (u_opt)
- Range: 0 to 0.7
- Increment: 0.1
---
### Legend
- **ODE**: Blue line
- **KOL-δ**: Red line
- **KOL-m**: Purple line
---
### Line Descriptions
1. **ODE (Blue)**
- **Behavior**: Flat line at y = 0 across all time points (t = 0 to t = 5).
- **Key Observation**: No deviation from baseline control value.
2. **KOL-δ (Red)**
- **Behavior**:
- Starts at y ≈ 0.01 (t = 0).
- Sharp spike to y ≈ 0.05 at t ≈ 1.5.
- Returns to y ≈ 0.01 at t ≈ 2.5.
- Remains flat at y ≈ 0.01 for t > 2.5.
- **Key Observation**: Transient control value spike at t ≈ 1.5.
3. **KOL-m (Purple)**
- **Behavior**: Flat line at y = 0 across all time points (t = 0 to t = 5).
- **Key Observation**: No deviation from baseline control value.
---
### Key Trends
- **ODE and KOL-m**: Maintain constant control values (u_opt = 0) throughout the observed time period.
- **KOL-δ**: Exhibits a transient increase in control value at t ≈ 1.5, followed by a return to baseline.
- **Grid**: Dotted grid lines at x = 0, 1, 2, 3, 4, 5 and y = 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7 for reference.
---
### Additional Notes
- **Control Parameters**:
- C_I (infection-term weight): 1e-1
- C_u (control-term weight): 1e-1
- **Visualization**: Line graph with three distinct control strategies compared over time.
---
### Conclusion
The graph shows that ODE and KOL-m return the zero control throughout, while KOL-δ exhibits only a small transient spike (≈ 0.05) near t ≈ 1.5. Under these weights (C_I = 1e-1, C_u = 1e-1), applying essentially no control is optimal for all three approaches.
</details>
(a)
<details>
<summary>2404.11130v2/x28.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Analysis
## Chart Overview
- **Title**: `C_I = 1e3, C_u = 1e-1`
- **Type**: Line chart with three data series
- **Legend**: Located in the top-right corner
- **ODE**: Blue solid line
- **KOL-δ**: Red dashed line
- **KOL-m**: Purple dash-dot line
## Axis Labels
- **X-axis**: `Time (t)` (0 to 5 in integer increments)
- **Y-axis**: `Control value (u_opt)` (0.0 to 0.7 in 0.1 increments)
## Data Series Analysis
### ODE (Blue)
- **Trend**:
- Starts at 0.7 (t=0)
- Sharp drop to 0.45 (t=1)
- Fluctuates between 0.4–0.5 (t=1–5)
- **Key Points**:
- t=0: 0.7
- t=1: 0.45
- t=2: 0.45
- t=3: 0.45
- t=4: 0.4
- t=5: 0.38
### KOL-δ (Red)
- **Trend**:
- Starts at 0.7 (t=0)
- Sharp drop to 0.35 (t=1)
- Fluctuates between 0.3–0.5 (t=1–5)
- **Key Points**:
- t=0: 0.7
- t=1: 0.35
- t=2: 0.5
- t=3: 0.5
- t=4: 0.35
- t=5: 0.4
### KOL-m (Purple)
- **Trend**:
- Starts at 0.5 (t=0)
- Declines gradually to 0.35 by t=3
- Holds 0.35 through t=5
- **Key Points**:
- t=0: 0.5
- t=1: 0.45
- t=2: 0.4
- t=3: 0.35
- t=4: 0.35
- t=5: 0.35
## Critical Observations
1. **Initial Drop**: ODE and KOL-δ drop sharply between t=0 and t=1, while KOL-m declines only mildly.
2. **ODE Stability**: ODE settles into a narrow band (≈0.4–0.45) after t=1.
3. **KOL-δ Volatility**: KOL-δ exhibits the most pronounced fluctuations.
4. **KOL-m Consistency**: KOL-m shows the least variability after t=2.
## Technical Notes
- **Control Parameters**:
- `C_I = 1e3` (infection-term weight)
- `C_u = 1e-1` (control-term weight)
- **Visual Encoding**:
- Line styles differentiate algorithms (solid/dashed/dash-dot)
- Color coding matches legend entries exactly
</details>
(b)
<details>
<summary>2404.11130v2/x29.png Details</summary>

### Visual Description
# Technical Analysis of Control Value Over Time
## Chart Overview
The image depicts a step-plot chart illustrating the behavior of three control strategies over time. The chart is titled with system parameters:
**C_I = 1, C_u = 0**
---
## Axes and Labels
- **X-axis (Time, t):**
- Range: 0 to 5 (discrete intervals)
- Units: Dimensionless time steps
- **Y-axis (Control value, u_opt):**
- Range: 0.0 to 0.7
- Units: Dimensionless control value
---
## Legend and Control Strategies
The chart compares three control strategies, differentiated by line style and color:
1. **ODE** (Solid blue line)
2. **KOL-δ** (Dashed red line)
3. **KOL-m** (Dotted purple line)
---
## Key Data Trends
### ODE (Solid Blue)
- **t = 0 → 1:** Control value remains at **0.0**.
- **t = 1 → 3:** Control value steps up to **0.7** and remains constant.
- **t = 3 → 4:** Control value drops to **0.1** and remains constant.
- **t = 4 → 5:** Control value stays at **0.1**.
### KOL-δ (Dashed Red)
- **t = 0 → 1:** Control value starts at **0.7**, drops to **0.4** at t=1.
- **t = 1 → 3:** Fluctuates between **0.4** and **0.6**.
- **t = 3 → 4:** Drops to **0.1** and remains constant.
- **t = 4 → 5:** Control value decreases to **0.0**.
### KOL-m (Dotted Purple)
- **t = 0 → 1:** Control value starts at **0.4**, rises to **0.7** at t=1.
- **t = 1 → 3:** Remains at **0.7**.
- **t = 3 → 4:** Drops to **0.2** and remains constant.
- **t = 4 → 5:** Further decreases to **0.0**.
---
## System Parameters
- **C_I = 1:** Weight of the infection term in the cost functional.
- **C_u = 0:** The control-effort term is absent from the cost, so control carries no penalty.
---
## Observations
1. **ODE** exhibits the most abrupt transitions, maintaining high control values (0.7) for extended periods.
2. **KOL-δ** shows intermediate stability, with gradual adjustments and minor fluctuations.
3. **KOL-m** demonstrates the most gradual response, with delayed transitions and eventual convergence to 0.0.
---
## Technical Notes
- The chart uses dashed grid lines for reference, enhancing readability of step changes.
- KOL-δ and KOL-m reach **u_opt = 0.0** by t=5, while ODE ends at **0.1**.
</details>
(c)
<details>
<summary>2404.11130v2/x30.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Analysis
## Chart Description
The image depicts a line graph comparing the performance of three control strategies over time. The graph is titled **"C_I = 1, C_u = 1"**, indicating fixed parameters for the control system.
---
### Axis Labels
- **X-axis**: **Time (t)**
- Scale: 0 to 5 (linear increments of 1)
- Units: Dimensionless time steps
- **Y-axis**: **Control value (u_opt)**
- Scale: 0.0 to 0.7 (linear increments of 0.1)
- Units: Dimensionless control magnitude
---
### Legend
- **ODE**: Blue line
- **KOL-δ**: Red line
- **KOL-m**: Purple line
---
### Key Trends and Data Points
1. **Initial Response (t = 0 to t = 1.5)**
- **ODE (Blue)**:
- Rises sharply to ~0.02 at t = 0.5.
- Drops to 0.0 at t = 1.0.
- **KOL-δ (Red)**:
- Peaks at ~0.02 at t = 0.5.
- Drops to 0.0 at t = 1.0.
- **KOL-m (Purple)**:
- Remains at 0.0 throughout.
2. **Secondary Response (t = 1.5 to t = 2.5)**
- **ODE (Blue)**:
- Rises to ~0.01 at t = 1.75.
- Drops to 0.0 at t = 2.0.
- **KOL-δ (Red)**:
- Peaks at ~0.01 at t = 1.75.
- Drops to 0.0 at t = 2.0.
- **KOL-m (Purple)**:
- Remains at 0.0 throughout.
3. **Steady State (t = 2.5 to t = 5)**
- All strategies stabilize at **u_opt = 0.0**.
---
### Observations
- **ODE and KOL-δ** exhibit transient control actions with identical magnitude and timing, suggesting similar dynamic responses under the tested conditions.
- **KOL-m** shows no control action, remaining at the baseline value of 0.0.
- All strategies achieve zero control value by t = 2.5, indicating system stabilization.
---
### Conclusion
The graph demonstrates that ODE and KOL-δ strategies produce identical transient control responses, while KOL-m remains inactive. All strategies converge to a steady state with no control action by t = 2.5.
</details>
(d)
<details>
<summary>2404.11130v2/x31.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Over Time
## Chart Title
- **Title**: Control Value Over Time
- **Parameters**:
- \( C_I = 1 \)
- \( C_u = 1e-2 \)
## Axes
- **X-axis (Time, \( t \))**:
- Range: \( 0 \leq t \leq 5 \)
- Increment: 1
- **Y-axis (Control Value, \( u_{opt} \))**:
- Range: \( 0 \leq u_{opt} \leq 0.7 \)
- Increment: 0.1
## Legend
- **ODE**: Blue line
- **KOL-δ**: Red line
- **KOL-m**: Purple line
## Key Trends and Data Points
1. **ODE (Blue)**:
- Starts at \( u_{opt} = 0 \) at \( t = 0 \).
- Rises to \( u_{opt} = 0.3 \) at \( t = 1 \).
- Peaks at \( u_{opt} = 0.4 \) at \( t = 2 \).
- Drops to \( u_{opt} = 0.1 \) at \( t = 3 \).
- Reaches \( u_{opt} = 0 \) at \( t = 4 \).
2. **KOL-δ (Red)**:
- Starts at \( u_{opt} = 0 \) at \( t = 0 \).
- Rises to \( u_{opt} = 0.1 \) at \( t = 1 \).
- Peaks at \( u_{opt} = 0.3 \) at \( t = 2 \).
- Drops to \( u_{opt} = 0.1 \) at \( t = 3 \).
- Reaches \( u_{opt} = 0 \) at \( t = 4 \).
3. **KOL-m (Purple)**:
- Starts at \( u_{opt} = 0 \) at \( t = 0 \).
- Rises to \( u_{opt} = 0.2 \) at \( t = 1 \).
- Peaks at \( u_{opt} = 0.5 \) at \( t = 2 \).
- Drops to \( u_{opt} = 0.1 \) at \( t = 3 \).
- Reaches \( u_{opt} = 0 \) at \( t = 4 \).
## Observations
- All strategies converge to \( u_{opt} = 0 \) by \( t = 5 \).
- **KOL-m** exhibits the highest peak (\( u_{opt} = 0.5 \)) at \( t = 2 \).
- **ODE** and **KOL-δ** show similar trajectories but with lower peak values.
- **KOL-m** demonstrates the most aggressive control response.
## Notes
- The chart uses dashed lines for all strategies, with no additional annotations or gridlines beyond the labeled axes.
- No data table or heatmap is present; all information is conveyed via line plots.
</details>
(e)
<details>
<summary>2404.11130v2/x32.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Over Time
## Chart Description
The image depicts a **stepwise line chart** illustrating the evolution of control values (`u_opt`) over time (`t`) for three distinct control strategies. The chart includes labeled axes, a legend, and embedded parameters.
---
### **Key Components**
1. **Title**:
`C_I = 1, C_u = 1e-6`
- Indicates cost-functional weights:
- `C_I` (infection-term weight) = 1
- `C_u` (control-effort weight) = 1 × 10⁻⁶
2. **Axes**:
- **X-axis (Time, `t`)**:
- Range: 0 to 5 (increments of 1).
- Labels: `0, 1, 2, 3, 4, 5`.
- **Y-axis (Control value, `u_opt`)**:
- Range: 0.0 to 0.7 (increments of 0.1).
- Labels: `0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7`.
3. **Legend**:
- **ODE**: Blue solid line.
- **KOL-δ**: Red dashed line.
- **KOL-m**: Purple dash-dot line.
---
### **Data Trends**
#### **1. ODE (Blue Line)**
- **Initial State**: `u_opt = 0.0` at `t = 0`.
- **Step Changes**:
- Jumps to `0.7` at `t = 0.5`.
- Drops to `0.4` at `t = 1.5`.
- Rises to `0.7` at `t = 2.5`.
- Falls to `0.3` at `t = 3.5`.
- Increases to `0.6` at `t = 4.5`.
- Drops to `0.1` at `t = 5`.
#### **2. KOL-δ (Red Dashed Line)**
- **Initial State**: `u_opt = 0.7` at `t = 0`.
- **Step Changes**:
- Drops to `0.3` at `t = 0.5`.
- Remains at `0.3` until `t = 1.5`.
- Jumps to `0.6` at `t = 2.5`.
- Falls to `0.4` at `t = 3.5`.
- Rises to `0.5` at `t = 4.5`.
- Drops to `0.1` at `t = 5`.
#### **3. KOL-m (Purple Dash-Dot Line)**
- **Initial State**: `u_opt = 0.3` at `t = 0`.
- **Step Changes**:
- Rises to `0.7` at `t = 0.5`.
- Drops to `0.5` at `t = 1.5`.
- Remains at `0.5` until `t = 2.5`.
- Falls to `0.3` at `t = 3.5`.
- Increases to `0.4` at `t = 4.5`.
- Drops to `0.05` at `t = 5`.
---
### **Critical Observations**
1. **Control Value Dynamics**:
- All strategies exhibit stepwise behavior, suggesting discrete control adjustments.
- **ODE** shows the most frequent oscillations, while **KOL-m** has the largest initial jump.
- All strategies converge to `u_opt ≈ 0.1` by `t = 5`.
2. **Parameter Influence**:
- With `C_u = 1e-6`, the control-effort term contributes almost nothing to the cost, so the optimization favors strong interventions that suppress infections.
---
### **Summary**
The chart compares three control strategies (`ODE`, `KOL-δ`, `KOL-m`) under fixed parameters (`C_I = 1`, `C_u = 1e-6`). Control values evolve stepwise over time, with all strategies stabilizing near `u_opt = 0.1` by `t = 5`. The **ODE** strategy exhibits the highest variability, while **KOL-m** shows the largest initial adjustment.
</details>
(f)
Figure 10: Optimal controls for the three optimal control problems fixing $N=10$ .
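Figures 8 and 10 differ only in the number of control sub-intervals ($N=5$ versus $N=10$). A coarse control embeds naturally into the finer space by repeating each plateau value; the sketch below is an assumption about how an $N=5$ solution could be reused (e.g. as an initial guess) when re-solving with $N=10$, not a procedure stated in the paper:

```python
import numpy as np

def refine_control(u_vals, factor=2):
    """Embed a piecewise-constant control on N intervals into the finer
    space of N * factor intervals by repeating each plateau value; the
    resulting control describes the same function u(t) on [0, T]."""
    return np.repeat(np.asarray(u_vals, dtype=float), factor)

coarse = [0.3, 0.5, 0.4, 0.55, 0.3]        # N = 5
fine = refine_control(coarse, factor=2)    # N = 10, identical u(t)
```

Because each plateau is simply duplicated, the refined control reproduces the original step function exactly, while exposing twice as many degrees of freedom to the optimizer.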
<details>
<summary>2404.11130v2/x33.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Analysis
## Chart Description
This image depicts a line graph comparing control values over time for three different control strategies. The graph is titled with parameter values:
**C_I = 1e-1, C_u = 1e-1**
---
### Axes Labels
- **Y-Axis**:
`Control value (u_opt)`
Scale: 0.0 to 0.7 (increments of 0.1)
- **X-Axis**:
`Time (t)`
Scale: 0 to 5 (increments of 1)
---
### Legend Entries
| Color | Label | Strategy |
|--------|-----------|-----------|
| Blue | ODE | ODE |
| Red | KOL-δ | KOL-δ |
| Purple | KOL-m | KOL-m |
---
### Data Trends
1. **ODE (Blue Line)**:
- Flat line at `u_opt = 0.0` for all time (t = 0 to 5).
- No deviation observed.
2. **KOL-δ (Red Line)**:
- Flat line at `u_opt = 0.0` for all time (t = 0 to 5).
- No deviation observed.
3. **KOL-m (Purple Line)**:
- Flat line at `u_opt = 0.0` for all time (t = 0 to 5).
- No deviation observed.
---
### Key Observations
- All three control strategies maintain a constant control value of `0.0` throughout the observed time period (0 ≤ t ≤ 5).
- No temporal variation or interaction between strategies is evident.
- Parameter values `C_I = 1e-1` and `C_u = 1e-1` are constant across the system.
---
### Technical Notes
- The graph uses dashed gridlines for reference.
- Data points are represented as continuous lines without markers.
- No outliers or anomalies detected in the dataset.
</details>
(a)
<details>
<summary>2404.11130v2/x34.png Details</summary>

### Visual Description
# Technical Document Extraction: Control Value Analysis
## Chart Title
- **Title**: `C_I = 1e3, C_u = 1e-1`
## Axes Labels
- **X-axis**: `Time (t)`
- Range: `0` to `5`
- Increment: `1`
- **Y-axis**: `Control value (u_opt)`
- Range: `0.0` to `0.7`
- Increment: `0.1`
## Legend
- **ODE**: Blue line
- **KOL-δ**: Red line
- **KOL-m**: Purple line
## Key Trends and Data Points
1. **ODE (Blue Line)**
- Initial value: `0.7` at `t = 0`
- Sharp drop to `0.45` at `t ≈ 0.5`
- Stabilizes between `0.45` and `0.5` from `t ≈ 1` to `t ≈ 4`
- Final value: `0.3` at `t = 5`
2. **KOL-δ (Red Line)**
- Initial value: `0.7` at `t = 0`
- Sharp drop to `0.3` at `t ≈ 0.5`
- Oscillatory behavior with peaks at `0.55` (e.g., `t ≈ 2.5`, `t ≈ 3.5`)
- Final value: `0.35` at `t = 5`
3. **KOL-m (Purple Line)**
- Initial value: `0.4` at `t = 0`
- Gradual decline to `0.35` at `t ≈ 4`
- Minor fluctuations around `0.35` to `0.4`
- Final value: `0.35` at `t = 5`
## Observations
- **ODE** exhibits the most significant initial drop and stabilization.
- **KOL-δ** shows erratic oscillations after the initial drop.
- **KOL-m** demonstrates the smoothest decline with minimal fluctuations.
- KOL-δ and KOL-m end at `0.35` at `t = 5`, with ODE slightly lower at `0.3`.
## Embedded Text
- **Title Parameters**:
- `C_I = 1e3` (infection-term weight)
- `C_u = 1e-1` (control-term weight)
</details>
(b)
<details>
<summary>2404.11130v2/x35.png Details</summary>

Optimal control $u_{opt}$ versus time $t\in[0,5]$ for $C_{I}=1$, $C_{u}=0$; stepwise curves for ODE (blue), KOL-∂ (red) and KOL-m (purple).
</details>
(c)
<details>
<summary>2404.11130v2/x36.png Details</summary>

Optimal control $u_{opt}$ versus time $t\in[0,5]$ for $C_{I}=1e-1$, $C_{u}=1e-1$; the ODE (blue), KOL-∂ (red) and KOL-m (purple) curves all remain at $u_{opt}=0$.
</details>
(d)
<details>
<summary>2404.11130v2/x37.png Details</summary>

Optimal control $u_{opt}$ versus time $t\in[0,5]$ for $C_{I}=1$, $C_{u}=1e-2$; ODE (blue, solid), KOL-∂ (red, dashed) and KOL-m (purple, dashed) curves.
</details>
(e)
<details>
<summary>2404.11130v2/x38.png Details</summary>

Optimal control $u_{opt}$ versus time $t\in[0,5]$ for $C_{I}=1$, $C_{u}=1e-6$; stepwise ODE (blue), KOL-∂ (red) and KOL-m (purple) curves, all reaching $u_{opt}=0$ by $t=5$.
</details>
(f)
Figure 11: Optimal controls for the three optimal control problems fixing $N=20$ .
<details>
<summary>2404.11130v2/x39.png Details</summary>

Scatter plot (logarithmic $y$-axis) of the cost functional $C_{I}\int_{0}^{T}I^{2}(t)dt + C_{u}\int_{0}^{T}u^{2}(t)dt$ at the optimal control, for varying $C_{I}$ with $C_{u}=1e-1$ (plus the case $C_{I}=1$, $C_{u}=0$); markers for ODE (blue circles), KOL-∂ (red triangles) and KOL-m (purple squares).
</details>
(a)
<details>
<summary>2404.11130v2/x40.png Details</summary>

Scatter plot (logarithmic $y$-axis) of the cost functional $C_{I}\int_{0}^{T}I^{2}(t)dt + C_{u}\int_{0}^{T}u^{2}(t)dt$ at the optimal control, for $C_{I}=1$ and $C_{u}$ ranging from $1$ down to $1e-6$; markers for ODE (blue circles), KOL-∂ (red triangles) and KOL-m (purple squares).
</details>
(b)
Figure 12: Cost functionals at the optimal control for different $C_{I}$ and $C_{u}$ ( $N=10$ ). (a) We consider different orders of magnitude for $C_{I}$ , keeping $C_{u}=1e-1$ . (b) We consider different orders of magnitude for $C_{u}$ , keeping $C_{I}=1$ .
<details>
<summary>2404.11130v2/x41.png Details</summary>

Scatter plot (logarithmic $y$-axis) of the cost functional $C_{I}\int_{0}^{T}I^{2}(t)dt + C_{u}\int_{0}^{T}u^{2}(t)dt$ at the optimal control, for varying $C_{I}$ with $C_{u}=1e-1$ (plus the case $C_{I}=1$, $C_{u}=0$); ODE (blue circles), KOL-∂ (red triangles) and KOL-m (purple squares).
</details>
(a)
<details>
<summary>2404.11130v2/x42.png Details</summary>

Scatter plot (logarithmic $y$-axis) of the cost functional $C_{I}\int_{0}^{T}I^{2}(t)dt + C_{u}\int_{0}^{T}u^{2}(t)dt$ at the optimal control, for $C_{I}=1$ and $C_{u}$ ranging from $1$ down to $1e-6$; ODE (blue circles), KOL-∂ (red triangles) and KOL-m (purple squares).
</details>
(b)
Figure 13: Cost functionals at the optimal control for different $C_{I}$ and $C_{u}$ ( $N=20$ ). (a) We consider different orders of magnitude for $C_{I}$ , keeping $C_{u}=1e-1$ . (b) We consider different orders of magnitude for $C_{u}$ , keeping $C_{I}=1$ .
References
- [1] Martcheva M. An introduction to mathematical epidemiology. vol. 61. Springer; 2015.
- [2] Brauer F, Castillo-Chavez C, Feng Z, et al. Mathematical models in epidemiology. vol. 32. Springer; 2019.
- [3] Giordano G, Blanchini F, Bruno R, Colaneri P, Di Filippo A, Di Matteo A, et al. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. Nature medicine. 2020;26(6):855-60.
- [4] Parolini N, Ardenghi G, Quarteroni A, et al. Modelling the COVID-19 epidemic and the vaccination campaign in Italy by the SUIHTER model. Infectious Disease Modelling. 2022;7(2):45-63.
- [5] Bertuzzo E, Mari L, Pasetto D, Miccoli S, Casagrandi R, Gatto M, et al. The geography of COVID-19 spread in Italy and implications for the relaxation of confinement measures. Nature communications. 2020;11(1):42-64.
- [6] Gozzi N, Chinazzi M, Davis JT, Mu K, Pastore y Piontti A, Ajelli M, et al. Anatomy of the first six months of COVID-19 vaccination campaign in Italy. PLoS Computational Biology. 2022;18(5):e1010146.
- [7] He M, Tang S, Xiao Y. Combining the dynamic model and deep neural networks to identify the intensity of interventions during COVID-19 pandemic. PLOS Computational Biology. 2023;19(10):e1011535.
- [8] Ziarelli G, Parolini N, Verani M, Quarteroni A, et al. Optimized numerical solutions of SIRDVW multiage model controlling SARS-CoV-2 vaccine roll out: An application to the Italian scenario. Infectious Disease Modelling. 2023;8(3):672-703.
- [9] Lemaitre JC, Pasetto D, Zanon M, Bertuzzo E, Mari L, Miccoli S, et al. Optimal control of the spatial allocation of COVID-19 vaccines: Italy as a case study. PLoS computational biology. 2022;18(7):e1010237.
- [10] Britton T, Leskelä L. Optimal intervention strategies for minimizing total incidence during an epidemic. SIAM Journal on Applied Mathematics. 2023;83(2):354-73.
- [11] Dimarco G, Toscani G, Zanella M. Optimal control of epidemic spreading in the presence of social heterogeneity. Philosophical Transactions of the Royal Society A. 2022;380(2224):20210160.
- [12] Morris DH, Rossine FW, Plotkin JB, Levin SA. Optimal, near-optimal, and robust epidemic control. Communications Physics. 2021;4(1):78.
- [13] Bolzoni L, Bonacini E, Soresina C, Groppi M. Time-optimal control strategies in SIR epidemic models. Mathematical biosciences. 2017;292:86-96.
- [14] Lu L, Jin P, Pang G, Zhang Z, Karniadakis GE. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature machine intelligence. 2021;3(3):218-29.
- [15] Goswami S, Yin M, Yu Y, Karniadakis GE. A physics-informed variational DeepONet for predicting crack path in quasi-brittle materials. Computer Methods in Applied Mechanics and Engineering. 2022;391:114587.
- [16] Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K, Stuart A, et al. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895. 2020.
- [17] Batlle P, Darcy M, Hosseini B, Owhadi H. Kernel methods are competitive for operator learning. Journal of Computational Physics. 2024;496:112549.
- [18] Genton MG. Classes of kernels for machine learning: a statistics perspective. Journal of machine learning research. 2001;2(Dec):299-312.
- [19] Lee J, Schoenholz S, Pennington J, Adlam B, Xiao L, Novak R, et al. Finite versus infinite neural networks: an empirical study. Advances in Neural Information Processing Systems. 2020;33:15156-72.
- [20] Jacot A, Gabriel F, Hongler C. Neural tangent kernel: Convergence and generalization in neural networks. Advances in neural information processing systems. 2018;31.
- [21] Arora S, Du SS, Hu W, Li Z, Salakhutdinov RR, Wang R. On exact computation with an infinitely wide neural net. Advances in neural information processing systems. 2019;32.
- [22] Adlam B, Lee J, Padhy S, Nado Z, Snoek J. Kernel regression with infinite-width neural networks on millions of examples. arXiv preprint arXiv:2303.05420. 2023.
- [23] Carmeli C, De Vito E, Toigo A. Vector valued reproducing kernel Hilbert spaces of integrable functions and Mercer theorem. Analysis and Applications. 2006;4(04):377-408.
- [24] Fleming WH, Rishel RW. Deterministic and stochastic optimal control. vol. 1. Springer Science & Business Media; 2012.
- [25] Parolini N, Dede’ L, Antonietti PF, Ardenghi G, Manzoni A, Miglio E, et al. SUIHTER: A new mathematical model for COVID-19. Application to the analysis of the second epidemic outbreak in Italy. Proceedings of the Royal Society A. 2021;477(2253):20210027.
- [26] Marziano V, Guzzetta G, Mammone A, Riccardo F, Poletti P, Trentini F, et al. The effect of COVID-19 vaccination in Italy and perspectives for living with the virus. Nature communications. 2021;12(1):7272.
- [27] Yong J, Zhou XY. Stochastic controls: Hamiltonian systems and HJB equations. vol. 43. Springer Science & Business Media; 2012.
- [28] Sherratt K, Gruson H, Johnson H, Niehus R, Prasse B, Sandmann F, et al. Predictive performance of multi-model ensemble forecasts of COVID-19 across European nations. Elife. 2023;12:e81916.
- [29] Meanti G, Carratino L, Rosasco L, Rudi A. Kernel methods through the roof: handling billions of points efficiently. Advances in Neural Information Processing Systems. 2020;33:14410-22.
- [30] Owhadi H, Yoo GR. Kernel flows: From learning kernels from data into the abyss. Journal of Computational Physics. 2019;389:22-47.
- [31] De Marchi S, Erb W, Marchetti F, Perracchione E, Rossini M. Shape-driven interpolation with discontinuous kernels: Error analysis, edge extraction, and applications in magnetic particle imaging. SIAM Journal on Scientific Computing. 2020;42(2):B472-91.
- [32] Gulian M, Raissi M, Perdikaris P, Karniadakis G. Machine learning of space-fractional differential equations. SIAM Journal on Scientific Computing. 2019;41(4):A2485-509.
- [33] Novak R, Xiao L, Hron J, Lee J, Alemi AA, Sohl-Dickstein J, et al. Neural Tangents: Fast and Easy Infinite Neural Networks in Python. In: International Conference on Learning Representations; 2019.
- [34] Regazzoni F, Dede L, Quarteroni A. Machine learning for fast and reliable solution of time-dependent differential equations. Journal of Computational physics. 2019;397:108852.
- [35] Regazzoni F, Salvador M, Dedè L, Quarteroni A. A machine learning method for real-time numerical simulations of cardiac electromechanics. Computer methods in applied mechanics and engineering. 2022;393:114825.
- [36] Regazzoni F, Dedè L, Quarteroni A. Machine learning of multiscale active force generation models for the efficient simulation of cardiac electromechanics. Computer Methods in Applied Mechanics and Engineering. 2020;370:113268.
- [37] Nocedal J, Wright SJ. Numerical optimization. Springer; 1999.
- [38] Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature methods. 2020;17(3):261-72.
- [39] Du SS, Zhai X, Poczos B, Singh A. Gradient descent provably optimizes over-parameterized neural networks. arXiv preprint arXiv:1810.02054. 2018.
- [40] Lee J, Xiao L, Schoenholz S, Bahri Y, Novak R, Sohl-Dickstein J, et al. Wide neural networks of any depth evolve as linear models under gradient descent. Advances in neural information processing systems. 2019;32.
- [41] Schölkopf B, Herbrich R, Smola AJ. A generalized representer theorem. In: International conference on computational learning theory. Springer; 2001. p. 416-26.
- [42] Alvarez MA, Rosasco L, Lawrence ND, et al. Kernels for vector-valued functions: A review. Foundations and Trends® in Machine Learning. 2012;4(3):195-266.
Appendix A Summary of Neural Tangent Kernel
We introduce the Neural Tangent Kernel (NTK) in the specific context of neural networks minimizing the mean square error on the training dataset. Consider a set of input data (training set)
$$
\mathcal{D}=\{(x_{1},y_{1}),(x_{2},y_{2})\ldots(x_{n},y_{n})\}, \tag{22}
$$
where $x_{i}∈\mathbb{R}^{d}$ are the input data and $y_{i}∈\mathbb{R}$ the respective labels. Let $f(x;\theta)$ be the neural network regressor, whose weights $\theta$ are trained by solving the minimization problem
$$
\min_{\theta\in\mathbb{R}^{n_{\theta}}}L(\theta)=\sum_{i=1}^{n}\dfrac{1}{2}(f(x_{i};\theta)-y_{i})^{2}=\sum_{i=1}^{n}l(f(x_{i};\theta),y_{i}). \tag{23}
$$
We solve the optimization problem via gradient flow, which is the continuous counterpart of the usual full-batch gradient descent method used in machine learning,
$$
\partial_{\tau}\theta(\tau)=-\partial_{\theta}L(\theta(\tau))=-\sum_{i=1}^{n}\partial_{\theta}f(x_{i};\theta(\tau))(f(x_{i};\theta(\tau))-y_{i}), \tag{24}
$$
where $\tau$ is a fictitious time variable accounting for the iteration progress. Therefore,
$$
\begin{split}\partial_{\tau}f(x_{j};\theta(\tau))&=\partial_{\theta}f(x_{j};\theta(\tau))\partial_{\tau}\theta(\tau)\\
&=-\sum_{i=1}^{n}\partial_{\theta}f(x_{j};\theta(\tau))^{T}\partial_{\theta}f(x_{i};\theta(\tau))(f(x_{i};\theta(\tau))-y_{i}),\end{split} \tag{25}
$$
and we define the Neural Tangent Kernel $K:\mathbb{R}^{d}×\mathbb{R}^{d}→\mathbb{R}$ as
$$
K_{\tau}(x_{i},x_{j}):=\langle\partial_{\theta}f(x_{j};\theta(\tau)),\,\partial_{\theta}f(x_{i};\theta(\tau))\rangle. \tag{26}
$$
This kernel is symmetric and positive semi-definite by construction. We remark that the NTK depends on the specific topology of the considered neural network and on the choice of the activation function. Considering wider neural networks, it can be proven [39, 40] that
$$
K_{\tau}(x_{i},x_{j})\underset{\mathbb{P}}{\rightarrow}K^{\infty}(x_{i},x_{j}), \tag{27}
$$
where in the infinite-width limit $K^{\infty}(x_{i},x_{j})\approx K_{0}(x_{i},x_{j})$, meaning that the limiting NTK remains close to the NTK computed with weights and biases at initialization. Therefore, during the training of wide neural networks, the trajectories of the cost functional ruled by $K_{\tau}$ stay close to the approximated linearized ones, ruled by $K^{\infty}$.
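As an illustration of definition (26), the empirical NTK is simply the Gram matrix of the parameter gradients. A minimal numpy sketch follows; the toy one-hidden-layer network and the finite-difference gradients are illustrative assumptions, not the architecture used in this work:

```python
import numpy as np

def empirical_ntk(f, theta, xs, eps=1e-6):
    """K_tau(x_i, x_j) = <d_theta f(x_i), d_theta f(x_j)>, cf. eq. (26),
    with parameter gradients approximated by central finite differences."""
    def grad_theta(x):
        g = np.zeros_like(theta)
        for k in range(theta.size):
            e = np.zeros_like(theta)
            e[k] = eps
            g[k] = (f(x, theta + e) - f(x, theta - e)) / (2 * eps)
        return g
    J = np.stack([grad_theta(x) for x in xs])  # (n_samples, n_theta) Jacobian
    return J @ J.T                             # Gram matrix of gradients

# Toy scalar regressor: one hidden tanh layer (illustrative only).
rng = np.random.default_rng(0)
d, width = 2, 16
theta = rng.normal(size=width * d + width)     # flattened [W, a]

def f(x, th):
    W = th[: width * d].reshape(width, d)
    a = th[width * d:]
    return a @ np.tanh(W @ x) / np.sqrt(width)

xs = rng.normal(size=(4, d))
K = empirical_ntk(f, theta, xs)
# Symmetric and positive semi-definite by construction (K = J J^T).
assert np.allclose(K, K.T, atol=1e-8)
assert np.min(np.linalg.eigvalsh(K)) >= -1e-8
```

Repeating the computation for increasing `width` gives a numerical feel for the convergence in (27): the kernel entries stabilize as the network widens.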
Appendix B KOL: algebraic derivation
We set the learning problem in the Reproducing Kernel Hilbert Space (RKHS) framework, so as to employ the tools of kernel regression theory, thus enhancing the computational efficiency and capitalizing on the linearity of the resulting functional framework. Let $\mathcal{U}$ be an RKHS of functions $u:\Omega→\mathbb{R}$ with kernel $Q:\Omega×\Omega→\mathbb{R}$, and let $\mathcal{V}$ be an RKHS of functions $v:D→\mathbb{R}$ endowed with the kernel $K:D× D→\mathbb{R}$. Then, $\psi$ and $\chi$ are defined as the optimal recovery maps
$$
\begin{split}\psi(U)&:=\operatorname*{arg\,min}_{w\in\mathcal{U}}\|w\|_{Q}\;\;\mathrm{s.t.}\;\;\phi(w)=U,\\
\chi(V)&:=\operatorname*{arg\,min}_{w\in\mathcal{V}}\|w\|_{K}\;\;\mathrm{s.t.}\;\;\varphi(w)=V,\\
\end{split} \tag{28}
$$
where $\|w\|_{Q}=\sqrt{\langle\langle w,Q(\cdot,t)\rangle,\langle w,Q(\cdot,t)\rangle\rangle}$ and $\|w\|_{K}=\sqrt{\langle\langle w,K(\cdot,t)\rangle,\langle w,K(\cdot,t)\rangle\rangle}$ denote the norms defined in the respective Hilbert spaces, exploiting the kernel reproducing property. Assuming that $\phi$ and $\varphi$ are pointwise evaluations at specific collocation points, as previously discussed, the optimal recovery maps have explicit closed forms deriving from kernel interpolation theory [41]:
$$
\psi(U)(t)=Q(t,T)Q(T,T)^{-1}U,\;\;\chi(V)(t)=K(t,T)K(T,T)^{-1}V, \tag{29}
$$
where $Q({T},{T})$ and $K(T,T)$ are symmetric positive definite matrices such that $Q({T},{T})_{ij}=Q({T}_{i},{T}_{j})$ and $K(T,T)_{ij}=K(T_{i},T_{j})$, whilst $Q({t},{T})_{i}=Q({t},{T}_{i})$ and $K(t,T)_{i}=K(t,T_{i})$ represent row vectors. Following the kernel-based approach proposed in [17], the operator learning scheme reduces to determining an approximation of the mapping between two finite-dimensional Euclidean spaces, $f^{\dagger}:\mathbb{R}^{n}→\mathbb{R}^{n}$, defined as
$$
f^{\dagger}:=\varphi\circ\mathcal{G}\circ\psi, \tag{30}
$$
cf. Figure 1, where the reconstruction maps $\psi:\mathbb{R}^{n}→\mathcal{U}$ and $\chi:\mathbb{R}^{n}→\mathcal{V}$ need to be properly defined.
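The closed forms in (29) are plain kernel interpolants. A minimal numpy sketch of the recovery map $\chi$ follows; the Gaussian kernel, its length scale, and the collocation grid are illustrative assumptions:

```python
import numpy as np

def rbf(a, b, ell=0.1):
    """Gaussian kernel K evaluated between 1-D point arrays a and b."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell**2))

# Collocation points T and nodal values V of a toy target function.
T = np.linspace(0.0, 1.0, 8)
V = np.sin(2 * np.pi * T)

def chi(t, jitter=1e-12):
    """chi(V)(t) = K(t, T) K(T, T)^{-1} V, cf. equation (29)."""
    KTT = rbf(T, T) + jitter * np.eye(T.size)  # tiny jitter for conditioning
    return rbf(np.atleast_1d(t), T) @ np.linalg.solve(KTT, V)

# The optimal recovery map interpolates the data at the collocation points.
assert np.allclose(chi(T), V, atol=1e-6)
```

Between collocation points, `chi` returns the minimum-norm element of the RKHS consistent with the nodal data, which is exactly the optimal recovery property in (28).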
Finally, we aim at approximating the function $f^{\dagger}$ through a vector-valued kernel. Indeed, consider the matrix-valued kernel $\Gamma:\mathbb{R}^{n}×\mathbb{R}^{n}→\mathcal{L}(\mathbb{R}^{n})$, following the notation of [42]. We denote by $\mathcal{H}_{\Gamma}$ the RKHS induced by this kernel, with induced norm $\|\cdot\|_{\mathcal{H}_{\Gamma}}$. Hence, $f^{\dagger}$ can be approximated by the map $\bar{f}$ solving the following optimization problem:
$$
\bar{f}:=\operatorname*{arg\,min}_{f\in\mathcal{H}_{\Gamma}}\|f\|_{\Gamma}\;\;\mathrm{s.t.}\;\;f(\phi(u_{i}))=\varphi(v_{i}),\;i=1,2\ldots N. \tag{31}
$$
We introduce the following block vectors U and V as
$$
\textbf{U}=\begin{pmatrix}U_{1}\\
U_{2}\\
\vdots\\
U_{N-1}\\
U_{N}\end{pmatrix}\in\mathbb{R}^{n\,N},\;\textbf{V}=\begin{pmatrix}V_{1}\\
V_{2}\\
\vdots\\
V_{N-1}\\
V_{N}\end{pmatrix}\in\mathbb{R}^{n\,N}, \tag{32}
$$
where $\{U_{i}\}_{i}:=\{\phi(u_{i})\}_{i},\,∀\,u_{i}∈\mathcal{U}$ and $\{V_{i}\}_{i}:=\{\varphi(v_{i})\}_{i},\,∀\,v_{i}∈\mathcal{V}$ . Then, we define with a slight abuse of notation the matrix $\Gamma:\mathbb{R}^{nN}×\mathbb{R}^{nN}→\mathbb{R}^{nN× nN}$ as
$$
\Gamma(\textbf{U},\textbf{U})=\begin{bmatrix}\Gamma(U_{1},U_{1})&\Gamma(U_{1},U_{2})&\ldots&\Gamma(U_{1},U_{N})\\
\Gamma(U_{2},U_{1})&\Gamma(U_{2},U_{2})&\ldots&\Gamma(U_{2},U_{N})\\
\vdots&\vdots&&\vdots\\
\Gamma(U_{N},U_{1})&\Gamma(U_{N},U_{2})&\ldots&\Gamma(U_{N},U_{N})\\
\end{bmatrix}, \tag{33}
$$
where each $\Gamma(U_{i},U_{j})∈\mathbb{R}^{n× n},\;∀\,i,j=1,2\ldots N$ is an independent block, and the following matrix
$$
\Gamma(U,\textbf{U})=\begin{bmatrix}\Gamma(U,U_{1})&\Gamma(U,U_{2})&\ldots&\Gamma(U,U_{N})\end{bmatrix}\in\mathbb{R}^{n\times nN}. \tag{34}
$$
In this work we assume that the input samples are uncorrelated; therefore, the analysis can be simplified by relying on diagonal kernels for $\Gamma$. In particular, let $S:\mathbb{R}^{n}×\mathbb{R}^{n}→\mathbb{R}$ be a scalar kernel (see Section 2.1, where we compare different practical choices for $S$). Thus, each block in (33) is diagonal, i.e.
$$
\Gamma(U_{i},U_{j})=S(U_{i},U_{j})I=\begin{bmatrix}S(U_{i},U_{j})&0&\ldots&0\\
0&S(U_{i},U_{j})&\ldots&0\\
\vdots&\vdots&&\vdots\\
0&0&\ldots&S(U_{i},U_{j})\\
\end{bmatrix}. \tag{35}
$$
Hence, the problem of learning the operator from prescribed input-output couples can be recast as an optimal recovery problem:
$$
\bar{f}_{j}:=\operatorname*{arg\,min}_{g\in\mathcal{H}_{S}}\|g\|_{S}\;\mathrm{s.t.}\;g(\phi(u_{i}))=\varphi(v_{i})_{j},\,\forall\,i=1,2\ldots N,\,j=1,2\ldots n, \tag{36}
$$
where the RKHS endowed with the kernel $S$ is denoted $(\mathcal{H}_{S},\|\cdot\|_{S})$. In this finite-dimensional case, it is possible to employ the fundamental result from kernel theory known as the representer theorem [41]. Therefore, each component of the discrete representation can be written explicitly as
$$
\bar{f}_{j}(U)=S(U,\textbf{U})S(\textbf{U},\textbf{U})^{-1}\textbf{V}_{\cdot,j}, \tag{37}
$$
where $S(U,\textbf{U}):\mathbb{R}^{n}×\mathbb{R}^{nN}→\mathbb{R}^{N}$ is a row vector, $S(\textbf{U},\textbf{U}):\mathbb{R}^{nN}×\mathbb{R}^{nN}→\mathbb{R}^{N× N}$, and $\textbf{V}_{\cdot,j}=[[V_{1}]_{j},[V_{2}]_{j}\ldots[V_{N}]_{j}]^{T}∈\mathbb{R}^{N}$. Equation (37) can be rewritten in the alternative kernel-methods-like form
$$
\bar{f}(U)=\sum_{j=1}^{N}\underbrace{S(U,U_{j})}_{\in\mathbb{R}}\underbrace{\alpha_{j}}_{\in\mathbb{R}^{n}}. \tag{38}
$$
Indeed, by the representer theorem it holds that
$$
V(U)=\Gamma(U,\textbf{U})\Gamma(\textbf{U},\textbf{U})^{-1}\textbf{V}=\Gamma(U,\textbf{U})\boldsymbol{\alpha}=\sum_{j=1}^{N}\underbrace{\Gamma(U,U_{j})}_{\in\mathbb{R}^{n\times n}}\alpha_{j}=\sum_{j=1}^{N}\underbrace{S(U,U_{j})}_{\in\mathbb{R}}\alpha_{j}, \tag{39}
$$
where
$$
\boldsymbol{\alpha}=\begin{pmatrix}\alpha_{1}\\
\alpha_{2}\\
\vdots\\
\alpha_{N}\end{pmatrix}=\begin{pmatrix}\alpha_{1}^{1}\\
\vdots\\
\alpha_{1}^{n}\\
\alpha_{2}^{1}\\
\vdots\\
\alpha_{2}^{n}\\
\vdots\\
\alpha_{N}^{1}\\
\vdots\\
\alpha_{N}^{n}\end{pmatrix}\in\mathbb{R}^{nN},\;\text{with }\alpha_{i}\in\mathbb{R}^{n},\;\mathrm{s.t.}\;\Gamma(\textbf{U},\textbf{U})\boldsymbol{\alpha}=\textbf{V}. \tag{40}
$$
The matrix of the system in (40) can be written explicitly, with $I_{n}$ denoting the $n\times n$ identity matrix, as
$$
\Gamma(\textbf{U},\textbf{U})=\begin{bmatrix}S(U_{1},U_{1})I_{n}&S(U_{1},U_{2})I_{n}&\ldots&S(U_{1},U_{N})I_{n}\\
S(U_{2},U_{1})I_{n}&S(U_{2},U_{2})I_{n}&\ldots&S(U_{2},U_{N})I_{n}\\
\vdots&\vdots&&\vdots\\
S(U_{N},U_{1})I_{n}&S(U_{N},U_{2})I_{n}&\ldots&S(U_{N},U_{N})I_{n}\\
\end{bmatrix}. \tag{41}
$$
Hence, restricting for instance to the first component ($j=1$), we can reorder the system as
$$
\begin{cases}S(U_{1},U_{1})\alpha_{1}^{1}+S(U_{1},U_{2})\alpha_{2}^{1}+S(U_{1},U_{3})\alpha_{3}^{1}+\ldots+S(U_{1},U_{N})\alpha_{N}^{1}=V_{1}^{1}\\
S(U_{2},U_{1})\alpha_{1}^{1}+S(U_{2},U_{2})\alpha_{2}^{1}+S(U_{2},U_{3})\alpha_{3}^{1}+\ldots+S(U_{2},U_{N})\alpha_{N}^{1}=V_{2}^{1}\\
S(U_{3},U_{1})\alpha_{1}^{1}+S(U_{3},U_{2})\alpha_{2}^{1}+S(U_{3},U_{3})\alpha_{3}^{1}+\ldots+S(U_{3},U_{N})\alpha_{N}^{1}=V_{3}^{1}\\
\vdots\\
S(U_{N},U_{1})\alpha_{1}^{1}+S(U_{N},U_{2})\alpha_{2}^{1}+S(U_{N},U_{3})\alpha_{3}^{1}+\ldots+S(U_{N},U_{N})\alpha_{N}^{1}=V_{N}^{1}\\
\end{cases}, \tag{42}
$$
and we can define the matrix
$$
\textbf{S}(\textbf{U},\textbf{U})=\begin{bmatrix}S(U_{1},U_{1})&S(U_{1},U_{2})%
&S(U_{1},U_{3})&\ldots&S(U_{1},U_{N})\\
S(U_{2},U_{1})&S(U_{2},U_{2})&S(U_{2},U_{3})&\ldots&S(U_{2},U_{N})\\
\vdots&&&&\vdots\\
S(U_{N},U_{1})&S(U_{N},U_{2})&S(U_{N},U_{3})&\ldots&S(U_{N},U_{N})\\
\end{bmatrix}\in\mathbb{R}^{N\times N}. \tag{43}
$$
Solving, for each component $j=1,2\ldots n$, the system with matrix (43), we get that
$$
\alpha_{i}=\begin{pmatrix}[\textbf{S}(\textbf{U},\textbf{U})^{-1}\textbf{V}_{%
\cdot,1}]_{i}\\
[\textbf{S}(\textbf{U},\textbf{U})^{-1}\textbf{V}_{\cdot,2}]_{i}\\
\vdots\\
[\textbf{S}(\textbf{U},\textbf{U})^{-1}\textbf{V}_{\cdot,n}]_{i}\\
\end{pmatrix},\;\forall i=1,2\ldots N. \tag{44}
$$
Finally, we rewrite equation (39) as
$$
\bar{f}_{j}(U)=V(U)_{j}=\left[\sum_{i=1}^{N}S(U,U_{i})\alpha_{i}\right]_{j}=\left[\sum_{i=1}^{N}S(U,U_{i})\begin{pmatrix}[\textbf{S}(\textbf{U},\textbf{U})^{-1}\textbf{V}_{\cdot,1}]_{i}\\
[\textbf{S}(\textbf{U},\textbf{U})^{-1}\textbf{V}_{\cdot,2}]_{i}\\
\vdots\\
[\textbf{S}(\textbf{U},\textbf{U})^{-1}\textbf{V}_{\cdot,n}]_{i}\\
\end{pmatrix}\right]_{j}, \tag{45}
$$
which, by rearranging the terms, is exactly equation (37).
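The reordering argument in (41)-(44) states that the $nN\times nN$ system (40) decouples into $n$ independent copies of the $N\times N$ system with matrix (43); equivalently, for diagonal blocks (35) the Gram matrix $\Gamma(\textbf{U},\textbf{U})$ is the Kronecker product of $\textbf{S}(\textbf{U},\textbf{U})$ with the $n\times n$ identity. A short numerical check of this equivalence (synthetic data, Gaussian kernel; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 8, 3
U = rng.standard_normal((N, n))
V = rng.standard_normal((N, n))

# scalar kernel matrix S(U, U), cf. (43)
sq = np.sum(U**2, 1)[:, None] + np.sum(U**2, 1)[None, :] - 2 * U @ U.T
S_UU = np.exp(-sq)

# diagonal blocks (35) make Gamma(U, U) a Kronecker product, cf. (41)
Gamma = np.kron(S_UU, np.eye(n))

# solve the full nN x nN system (40), with V stacked as (V_1; ...; V_N)
alpha_big = np.linalg.solve(Gamma, V.reshape(-1))

# solve the N x N system once per component j, cf. (44)
alpha_small = np.linalg.solve(S_UU, V)    # row i is alpha_i^T

print(np.allclose(alpha_big, alpha_small.reshape(-1)))   # → True
```

This is why the training cost depends on $N$ and not on $nN$: one Cholesky (or LU) factorization of the scalar kernel matrix serves all $n$ output components.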
Combining equations (29) and (37) we obtain the approximation operator as
$$
\bar{\mathcal{G}}:=\chi\circ\bar{f}\circ\phi. \tag{46}
$$
Finally, recalling the definition of $\bar{f}$ in equation (38), $\bar{\mathcal{G}}:\mathcal{U}\to\mathcal{V}$ reads
$$
\bar{\mathcal{G}}(u)=\chi\left(\sum_{j=1}^{N}S(\phi(u),U_{j})\alpha_{j}\right). \tag{47}
$$
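As a concrete illustration of the composition (46)-(47), the sketch below takes $\phi$ to be pointwise sampling of a continuous input on a uniform grid and $\chi$ to be piecewise-linear reconstruction; these two choices, the Gaussian kernel, and the synthetic training map are assumptions made here for illustration only.

```python
import numpy as np

# Hypothetical encoder/decoder: phi samples a continuous input on a uniform
# grid, chi reconstructs a continuous output by linear interpolation.
n, N = 16, 30
grid = np.linspace(0.0, 1.0, n)
phi = lambda u: u(grid)                                   # function -> R^n
chi = lambda Vvec: (lambda t: np.interp(t, grid, Vvec))   # R^n -> function

rng = np.random.default_rng(2)
U = rng.uniform(size=(N, n))          # training inputs phi(u_i)
V = np.cumsum(U, axis=1) / n          # synthetic training outputs varphi(v_i)

# Gaussian scalar kernel and stacked coefficients, cf. (40) and (43)
sq = np.sum(U**2, 1)[:, None] + np.sum(U**2, 1)[None, :] - 2 * U @ U.T
alpha = np.linalg.solve(np.exp(-5.0 * sq), V)

def G_bar(u):
    # G_bar(u) = chi( sum_j S(phi(u), U_j) alpha_j ), cf. (47)
    Uq = phi(u)
    s = np.exp(-5.0 * np.sum((U - Uq) ** 2, axis=1))
    return chi(s @ alpha)

v = G_bar(lambda t: np.sin(2 * np.pi * t))   # a continuous output trajectory
print(float(v(0.5)))                          # query it at any time
```

The surrogate $\bar{\mathcal{G}}$ thus maps a continuous input strategy to a continuous output trajectory, even though all the learning happens on the finite-dimensional representations $U_{i}$, $V_{i}$.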