Abstract
In this work, the efficient robust global optimization (ERGO) method is revisited with the aim of enhancing and expanding its existing capabilities. The original objective of ERGO was to address the computational challenges associated with optimization-under-uncertainty through the use of Bayesian optimization (BO). ERGO tackles robust optimization problems which are characterized by sensitivity in the objective function due to stochasticity in the design space. It does this by concurrently minimizing the mean and variance of the objective in a multi-objective setting. To handle the computational complexity arising from the uncertainty propagation, ERGO exploits the analytical expression of the surrogate model underlying BO. In this study, ERGO is extended to accommodate multiple objectives, incorporate an improved predictive error estimation approach, investigate the treatment of failed function evaluations, and explore the handling of stochastic parameters next to stochastic design variables. To evaluate the effectiveness of these improvements, the enhanced ERGO scheme is compared with the original method using an analytical test problem with varying dimensionality. Additionally, the novel optimization technique is applied to an aerodynamic design problem to validate its performance.
1 Introduction
Driven by the goal of reducing the computational burden associated with conventional approaches to optimization-under-uncertainty (OUU), recent years have witnessed the emergence of various surrogate-assisted (SA) techniques aimed at addressing this challenge [1–3]. This paper focuses on addressing this computational challenge within the context of robust design optimization (RDO), which seeks to find designs that exhibit robustness against small variations in uncertain parameters [4]. More specifically, the performance of the efficient robust global optimization (ERGO) scheme, which is positioned within the OUU research field below, is enhanced.
The existing SA-RDO approaches can be categorized in two ways: first based on the manner by which the surrogate is introduced and second on the formulation of the RDO problem. Persson and Ölvander divided the former into two classes based on whether the surrogate is updated during the optimization or not [5]. In reference to the latter, Kanno distinguished three RDO formulations: worst-case optimization (minimization of the maximum objective evaluation), discrepancy optimization (minimization of the difference between objective value evaluations), and variance minimization (minimization of the moments of the objective probability density functions) [6]. Persson and Ölvander’s study illustrated the superior performance of the actively updating approach [5]. Ryan et al. put a multi-objective (MO) reformulation of the variance minimization forward as the most competitive formulation of the RDO problem [7].
Among the various surrogate-modeling techniques available, Gaussian process interpolation (GPI), also known as Kriging, has received significant attention [8]. This can be attributed to GPI’s flexibility as a parameter-free method, as well as its capability to quantify prediction uncertainty. Bayesian optimization (BO) builds on GPI through the introduction of an acquisition function that is typically formulated in such a way that the design space is explored and the objective is exploited. The best-known framework is efficient global optimization (EGO), in which the acquisition function corresponds to the expected improvement (EI) [9,10].
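To make the EI acquisition concrete, the snippet below gives a minimal sketch of the closed-form expected improvement for minimization, assuming the posterior mean and standard deviation at candidate points have already been obtained from some fitted GP surrogate (the function and variable names are illustrative only):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for minimization, given the GP posterior mean `mu`,
    posterior standard deviation `sigma`, and best observed objective `f_best`."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (f_best - mu) / sigma
        ei = (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    return np.where(sigma > 0.0, ei, 0.0)  # no improvement possible where sigma = 0

# Two candidate points with posterior mean/std taken from a fitted surrogate
print(expected_improvement(mu=[0.2, 0.8], sigma=[0.1, 0.3], f_best=0.5))
```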
The MO–BO approach to RDO was pursued by Keane, in which the uncertainty propagation was performed by sampling the surrogate to obtain the quantities of interest, which were then optimized by means of NSGA-II [11,12]. This approach was also followed by Cheng et al. [13]. The approach of Ribaud et al. [14] relies on a Taylor expansion to propagate mean and variance through the surrogate. However, none of these methods make active use of the underlying model uncertainty to ensure objective space exploration, and therefore the end result might strongly depend on the initial set of function evaluations on which the surrogate is built. Most recently, the ERGO scheme was devised to account for this problem by formulating an acquisition function that actively leverages the surrogate-induced uncertainty [15]. The ERGO scheme builds on the EGO algorithm and further increases the efficiency of solving RDO problems by performing the uncertainty propagation (UP) to assess the robustness measures without additional experiments or sampling. In this paper, the ERGO scheme is improved upon by permitting the treatment of multiple objectives, by improving the assessment of the predictive uncertainty of the stochastic measures, by treating function evaluation failures, and by treating the occurrence of stochastic parameters along with stochastic design variables. The contributions are summarized as follows:
The original ERGO scheme relied on a reformulation of Keane’s Euclidean EI [16] as the acquisition function that drives the selection of designs to be evaluated. However, the Euclidean EI has been proven to be sensitive to magnitude variations of the objectives and is furthermore limited to two objectives [17], corresponding to the mean and variance of a single function output. As such, the hypervolume EI is introduced in the ERGO framework to account for both shortcomings [18].
The original ERGO scheme made use of a first-order second moment (FOSM) approach [19] to assess the predictive variance (epistemic uncertainty [20]) of the predictions of the mean and variance of the objective due to input variability (aleatoric uncertainty [20]). This low-order approach drove the acquisition function to underemphasize certain regions in the objective space. To account for this shortcoming, a second-order second moment (SOSM) approach is introduced to assess the predictive variance.
The original ERGO scheme did not treat the possible occurrence of objective function failures. The SAMURAI framework [21], which incorporated the ERGO scheme in the context of asynchronous reliability-based robust design optimization, accounted for this particularity by adding the prediction of failed evaluations to the training set, mimicking Forrester et al.’s missing-at-random (MAR) approach [22]. The introduction of the MAR approach in the original scheme can introduce excessive non-linearities in the objective space when the predictive uncertainty of the surrogate is substantial. This can lead to the prediction of non-existent multi-modal objective spaces and regions with high predictive uncertainty. Consequently, the acquisition function may remain elevated throughout the optimization process. To account for this, a classifier is introduced to predict the feasible domain and guide the acquisition function, following the approach of Backov et al. [23].
The original ERGO scheme only permitted stochastic design variables. As such, the impact of parameter variability on the objective could not be directly assessed. To account for this defect, the treatment of stochastic parameters is examined in the form of a two-step acquisition function optimization.
The remainder of this paper is organized as follows. In Sec. 2, a brief overview of the notation used throughout this work is introduced. In Sec. 3, the ERGO framework is briefly summarized, after which the four improvements upon the strategy are presented in Sec. 4, namely the novel acquisition function which relies on the hypervolume expected improvement, the higher-order predictive variance assessment of the stochastic objective measures, the treatment of crash constraints by means of Gaussian process classification, and the treatment of stochastic parameters. In Sec. 5, the aforementioned improvements are brought together, making up ERGO-II, and compared with the original scheme on an analytical test problem. In Sec. 6, the improved approach is examined on an aerodynamic test case inspired by the aerodynamic design optimization discussion group [24]. Concluding remarks and outlooks are formulated in Sec. 7.
2 Notation
2.1 Sub- and Superscripts.
refers to the sample of () with the number of elements in set , with a proper subset of vector space and a subset of : . is used to denote the dimensionality of the vector space, such that refers to the component () of design vector . Furthermore, with the vector is denoted from which the component has been omitted.
The superscripts and are used to differentiate the interpolator from the classifier. Similarly, the subscripts and are used to differentiate the design variables, entities that influence the model and that the engineer can change within a specified range during the design optimization study, from the operating parameters, entities that influence the model and might vary during operation but cannot be changed by the engineer.
2.2 Operators.
The concept of vectorizing a matrix, denoted as , involves the process of stacking its columns. Similarly, the matricization of a tensor, expressed as , entails the stacking of matrices. In the case of a rank 3 tensor, , this operation yields , while for a rank 4 tensor, , it results in .
Furthermore, the double dot product, denoted as “:”, serves to represent the Frobenius inner product: $A : B = \operatorname{tr}(A^{\top} B)$, with $\operatorname{tr}(\cdot)$ signifying the trace operator of a matrix. Additionally, a noteworthy observation is made: $A : B = \operatorname{tr}(A B)$ holds true when both $A$ and $B$ are symmetric. The Hadamard (elementwise) product is denoted by “$\circ$”, and a corresponding notation is introduced to represent the Hadamard power. The Kronecker product is represented by “$\otimes$”. Moreover, an “outer power” of a vector is introduced to simplify notation, and an “outer power” of a matrix is defined analogously. Based on these definitions, equivalent expressions are formulated, which are employed interchangeably for the sake of brevity and clarity.
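For concreteness, these operators map directly onto standard numerical linear algebra primitives; the short NumPy sketch below illustrates them (the “outer power” of a vector is shown in one natural reading, the outer product of the vector with itself, purely for illustration):

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)
B = np.ones((2, 3))
x = np.array([1.0, 2.0, 3.0])

vec_A = A.flatten(order="F")            # vec(A): stack the columns of A
frob = np.trace(A.T @ B)                # Frobenius inner product A : B = tr(A^T B)
assert np.isclose(frob, np.sum(A * B))  # equivalent elementwise formulation
hadamard = A * B                        # Hadamard (elementwise) product
hadamard_pow = A ** 2                   # Hadamard power (elementwise square)
kron = np.kron(A, B)                    # Kronecker product
outer_pow = np.outer(x, x)              # one reading of the "outer power" of a vector
```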
Finally, use is made of $\phi$ and $\Phi$ to denote the standard normal probability density function and cumulative distribution function, respectively. A shorthand for both is introduced in the context of Gaussian processes.
2.3 Fonts.
Epistemic uncertainty denotes non-deterministic behavior arising from a degree of ignorance or a deficiency in knowledge concerning the system or the environment and is depicted in this work using the calligraphic font, symbolized as . In contrast, aleatory uncertainty pertains to the inherent variability linked to the physical system or environment under investigation, which is in this work represented by the Fraktur typeface, denoted as [20].
3 Efficient Robust Global Optimization
A closed-form expression of the robustness measures, the mean and the variance of the objective under input uncertainty, is not readily available; therefore, approximations are resorted to. In the original framework, Girard et al.’s approach is followed for the universal Kriging interpolator [15]: the trends are obtained by means of Isserlis’ theorem and the processes are estimated under the assumption that they are again Gaussian, appropriately referred to as the Gaussian approximation [25]. Furthermore, the covariance is locally approximated by a second-order polynomial.
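Such analytical propagation can be cross-checked against a brute-force Monte Carlo reference obtained by sampling the input distribution and pushing the samples through the surrogate’s posterior mean. The sketch below, with `gp_mean` a placeholder for the predictive mean of a fitted interpolator, illustrates that reference computation:

```python
import numpy as np

def mc_robust_measures(gp_mean, x_mean, x_cov, n_samples=10_000, seed=0):
    """Monte Carlo estimates of the mean and variance of the surrogate prediction
    under input uncertainty X ~ N(x_mean, x_cov); a reference against which the
    analytical (Gaussian-approximation) propagation can be checked."""
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal(x_mean, x_cov, size=n_samples)
    y = np.array([gp_mean(x) for x in X])
    return y.mean(), y.var()

# `gp_mean` stands in for the posterior mean of a fitted GP interpolator.
print(mc_robust_measures(lambda x: np.sin(x[0]) + x[1] ** 2,
                         x_mean=np.zeros(2), x_cov=0.05 * np.eye(2)))
```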
Plugged into EGO (cf. Appendix), the ERGO framework is obtained (Algorithm 1). Notice in line 3 of Algorithm 1 that an additional argument is added; this substitution indicates that a maximum likelihood estimation (MLE) approach is adopted instead of a fully Bayesian approach to evaluate the interpolator. Furthermore, an additional constant is introduced that serves as a stopping criterion: if no further significant improvement can be made, the optimization is stopped.
4 Improved Efficient Robust Global Optimization
4.1 Robust Hypervolume Expected Improvement.
The initial ERGO scheme depends on adapting Keane’s Euclidean EI [16] as the acquisition function guiding the selection of designs for evaluation. Nevertheless, the Euclidean EI has been shown to be sensitive to changes in the magnitudes of the objectives and is also constrained to handling only two objectives [17], representing the mean and variance of a singular function output. Consequently, the ERGO framework introduces the hypervolume EI to address both of these limitations [18].
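To make the underlying quantity concrete, the sketch below computes the deterministic hypervolume improvement of a candidate objective vector over a two-objective Pareto front (minimization); the hypervolume EI employed in ERGO-II is the expectation of this quantity under the GP posterior of the candidate, for which dedicated schemes exist [18]. The helper names are illustrative:

```python
import numpy as np

def hypervolume_2d(front, ref):
    """Hypervolume dominated by a two-objective front (minimization) w.r.t. `ref`."""
    pts = sorted(front)                        # sort by the first objective
    nd, best_f2 = [], np.inf
    for f1, f2 in pts:                         # keep only non-dominated points
        if f2 < best_f2:
            nd.append((f1, f2)); best_f2 = f2
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in nd:                          # accumulate the dominated rectangles
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

def hv_improvement(front, y, ref):
    """Hypervolume improvement of candidate objective vector `y` over `front`."""
    return hypervolume_2d(list(front) + [tuple(y)], ref) - hypervolume_2d(front, ref)

front = [(0.2, 0.8), (0.5, 0.4), (0.9, 0.1)]
print(hv_improvement(front, (0.3, 0.3), ref=(1.0, 1.0)))  # positive: y extends the front
```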
4.2 Improved Epistemic Uncertainty Propagation.
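As set out in the introduction, the original scheme relied on a FOSM estimate of the predictive variance of the robustness measures, whereas ERGO-II adopts a SOSM estimate. The sketch below contrasts the two using the generic Taylor-based moment formulas for a Gaussian input; these are the textbook expressions (the second-order variance term following from Isserlis’ theorem), not necessarily the exact expressions derived in this work:

```python
import numpy as np

def fosm(f, grad, mu, Sigma):
    """First-order second-moment estimates of E[f(X)] and Var[f(X)], X ~ N(mu, Sigma)."""
    g = grad(mu)
    return f(mu), g @ Sigma @ g

def sosm(f, grad, hess, mu, Sigma):
    """Second-order second-moment estimates for a Gaussian input (Isserlis' theorem)."""
    g, H = grad(mu), hess(mu)
    mean = f(mu) + 0.5 * np.trace(H @ Sigma)
    var = g @ Sigma @ g + 0.5 * np.trace(H @ Sigma @ H @ Sigma)
    return mean, var

# Toy example: f(x) = x1^2 + x2 with X ~ N(0, 0.01 I); the SOSM variance recovers the
# exact value 2*sigma^4 + sigma^2, which the FOSM estimate misses.
f = lambda x: x[0] ** 2 + x[1]
grad = lambda x: np.array([2.0 * x[0], 1.0])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 0.0]])
mu, Sigma = np.zeros(2), 0.01 * np.eye(2)
print(fosm(f, grad, mu, Sigma), sosm(f, grad, hess, mu, Sigma))
```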
4.3 Crash Constraint Treatment.
The original ERGO scheme did not treat the possible occurrence of objective function failures. The SAMURAI framework, which incorporated the ERGO scheme in the context of reliability-based robust design optimization, accounted for this particularity by means of adding the prediction of failed evaluations to the training set. However, this strongly affects the convergence of the scheme [21].
By storing information on whether or not a design vector could be evaluated, a binary classification problem is obtained. In this work, use is made of a discriminative Gaussian process classifier (GPC) to model the feasible space directly. Rasmussen and Williams’ textbook is drawn from extensively to describe the theory behind GPC and its implementation [27]. The approach corresponds to turning the output of a regression model into a class probability using a response function, guaranteeing a valid probabilistic interpretation. As such, the regression model becomes a latent or hidden process.
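As an illustration of the response-function idea, the snippet below squashes the latent Gaussian posterior at a test point through a probit response; averaging the probit over a Gaussian latent posterior yields this closed-form class probability (a standard result, cf. Rasmussen and Williams [27]). The sketch shows one possible implementation, not the exact one used here:

```python
import numpy as np
from scipy.stats import norm

def feasibility_probability(latent_mean, latent_var):
    """Predictive class probability from a probit response function averaged over a
    Gaussian posterior on the latent process: E[Phi(f)] = Phi(mu / sqrt(1 + var))."""
    return norm.cdf(np.asarray(latent_mean) / np.sqrt(1.0 + np.asarray(latent_var)))

# A candidate with an uncertain latent prediction is pulled towards p = 0.5, which in
# turn tempers the acquisition function in poorly explored regions.
print(feasibility_probability(latent_mean=[1.5, -0.5], latent_var=[0.2, 4.0]))
```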
The construction of the Gaussian process classifier is performed using the publicly available matlab® implementation of Rasmussen and Williams’ work using a conjugate gradient approach to maximize the concentrated log marginal likelihood function [27].
4.4 Stochastic Parameter Treatment.
4.5 Algorithmic Development.
If these different aspects are brought together, the improved ERGO-II algorithm is obtained (Algorithm 2 and Fig. 5). Upon inspection of the new algorithm, it can be observed that, in comparison with the original ERGO scheme (or EGO upon which it was built, as presented in Algorithm 3 in the Appendix), multiple GP interpolators are built, one for each of the function output sets, along with a GP classifier to model the evaluable region (lines 2–4). Once it is assessed that the expected contribution to the robust Pareto front of the most promising design is significant enough (lines 6 and 7), the design is evaluated (line 10). When the design evaluation is successful, the output is added to the sets (lines 11–14). This process is subsequently repeated until the computational budget is depleted (line 1) or until the improvement decreases below a threshold value (lines 7 and 16).
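The control flow described above can be summarized in the following sketch, in which the callables are placeholders for the actual ERGO-II components (GP fitting, GP classification, acquisition maximization, and the expensive simulation); only the loop structure is meant to mirror Algorithm 2:

```python
def ergo2_loop(fit_gp, fit_gpc, maximize_acquisition, evaluate, X, Y, feasible,
               budget=200, eps=1e-3):
    """Loop skeleton of ERGO-II; all callables are user-supplied stand-ins."""
    while len(X) < budget:                         # line 1: computational budget
        gps = [fit_gp(X, y_k) for y_k in Y]        # lines 2-3: one interpolator per output set
        gpc = fit_gpc(X, feasible)                 # line 4: classifier of the evaluable region
        x_new, rei = maximize_acquisition(gps, gpc)
        if rei < eps:                              # lines 6-7: no significant improvement left
            break
        result = evaluate(x_new)                   # line 10: expensive function evaluation
        X.append(x_new)
        feasible.append(result is not None)
        if result is not None:                     # lines 11-14: extend the training sets
            for y_k, r_k in zip(Y, result):
                y_k.append(r_k)
    return X, Y, feasible
```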
5 Performance Assessment
5.1 Problem Description.
5.2 Methodology.
The evaluation of the performance of the multi-objective optimization algorithms is conducted using the hypervolume metric [29]. The inherent stochastic nature of the optimization process, influenced by the random seed of the optimizer, can result in variable performance. To account for this aspect, the optimization is repeated ten times, after which the average, minimum, and maximum in each iteration after the design of experiments (DoE) are computed.
The determination of the appropriate size for the DoE adheres to Jones et al.’s recommendation, which suggests scaling the number of samples with the dimensionality of the input space [10]. For the DoE, a Latin hypercube sampling approach [30] is employed, and Morris and Mitchell’s maximin criterion is utilized to quantify the space-filling property. This criterion aims to maximize the distances between pairs of points in ascending order while simultaneously minimizing the number of corresponding pairs [31]. The stopping criteria employed are a maximum of 200 function evaluations after the DoE or the termination of the process if the normalized expected improvement falls below 0.1% (corresponding to the stopping constant introduced in Sec. 3). The optimization of the REI is accomplished using a genetic algorithm.
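For illustration, the snippet below draws a space-filling DoE by generating several Latin hypercube designs and retaining the one with the largest minimum pairwise distance; this plain maximin score is a simplified stand-in for the full Morris and Mitchell criterion, which additionally ranks the higher-order distance counts:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import qmc

def maximin_lhs(n, d, n_candidates=50, seed=0):
    """Return the Latin hypercube design (n points in [0, 1]^d) with the largest
    minimum pairwise distance among `n_candidates` random candidates."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_candidates):
        X = qmc.LatinHypercube(d=d, seed=rng).random(n)
        score = pdist(X).min()
        if score > best_score:
            best, best_score = X, score
    return best

doe = maximin_lhs(n=40, d=4)
print(doe.shape, pdist(doe).min())
```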
5.3 Results and Discussion.
Figure 6 illustrates the ERGO versus ERGO-II performance assessment as a function of the number of iterations, for increasing dimensionality over the different plots from top to bottom, with the hypervolume indicator on the left and the normalized maximum expected improvement on the right. The full line denotes the mean values, while the shaded areas denote the minimum and maximum values of the respective performance measures. Optimal performance of an optimizer would be characterized by an early climb of the hypervolume indicator and a decrease of the REI to zero once the hypervolume stagnates. The latter ensures that the optimizer terminates if no further improvement can be obtained.
A notable observation is the enhanced performance of ERGO-II compared to the original scheme as the dimensionality increases. This can be observed from the strong early increase in the hypervolume indicator and the decrease of the REI to zero once it has stagnated. ERGO maintains a high REI during the optimization, without further improvement of the hypervolume indicator. The significant performance fluctuations of the original scheme are attributed to the treatment of crash constraints, since this approach might lead to the constant generation of regions of high predictive variance, as discussed in the introduction.
At lower dimensionality, ERGO-II also performs better than ERGO, but fails to stop once the hypervolume indicator stagnates. The reason might be attributed to the size of the DoE, which is taken to be proportional to the dimensionality of the input space.
6 Validation
To illustrate the method in a more practical context, the original ERGO scheme is compared with the improved ERGO-II methodology by applying them to a computational fluid dynamics (CFD) design problem. More precisely, the stochastic reformulation of the second benchmark case introduced by the AIAA Aerodynamic Design Optimization Discussion Group (ADODG) is solved [24].
6.1 Problem Description.
6.2 Parameterization.
Production tolerances are translated into stochastic design variables such that the design vector is normally distributed around the parameterization of the airfoil with a coefficient of variation equal to 0.1.
Self-intersecting airfoils that would appear in the DoE are accounted for by the crash constraint treatment presented in Sec. 4.3.
6.3 Methodology.
The treatment of constraints is performed through a conventional approach [36]. To model the transonic flow around the airfoil, CFD simulations are applied. The governing equations for the flow correspond to the unsteady Reynolds-averaged Navier–Stokes equations.
6.3.1 Turbulence Modeling.
Introducing Reynolds’ decomposition of the flow variables into a mean and a fluctuating part in the Navier–Stokes equations and ensemble-averaging the result leads to the RANS equations, which are much cheaper to solve. However, the decomposition and averaging introduce an additional term in the equations, which has the form of a stress term and is typically referred to as the Reynolds stress, so that the set of equations is no longer closed. Different manners exist by which this stress term is modeled, such as relating the Reynolds stress to the mean strain rate through an eddy viscosity, building on the Boussinesq eddy viscosity assumption. In turn, different approaches exist to determine the eddy viscosity; the SST model, which is used in this work, does so by means of two additional transport equations, for the turbulent kinetic energy and the specific turbulence dissipation rate, and by assuming a linear relation in the boundary layer between the turbulent shear stress and the turbulent kinetic energy [37].
6.3.2 Solver-Specific Settings.
To solve the set of equations presented in the previous sections, a pressure-based solver with a fully coupled approach was applied. A second-order scheme was used for the pressure, and second-order upwind discretization was applied for density and momentum, while turbulent kinetic energy, specific dissipation rate, and energy were discretized using a first-order upwind scheme. A least-squares cell-based approach is employed to calculate the gradients; this implies that a least-squares method is used to solve the system of equations relating the cell gradient to the difference in cell values. All residuals of the equations, including the continuity equation, were enforced to drop below a set threshold. The residuals of the transport equations are expressed as the ratio of the sum of the absolute errors in each cell to the sum of the absolute values of the flow variable calculated in each cell; the residual of the continuity equation is expressed as the ratio of the sum of the rate of mass creation in each cell to the largest absolute value of this sum in the first five iterations.
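Written out, and using $\varphi$ for a generic transport variable, $e_{\varphi,c}$ for the corresponding error in cell $c$, and $\dot{m}_{c}$ for the rate of mass creation in cell $c$ (symbols introduced here for illustration only), these residual definitions read

$$R_{\varphi} \;=\; \frac{\sum_{c}\,\lvert e_{\varphi,c}\rvert}{\sum_{c}\,\lvert \varphi_{c}\rvert}, \qquad R_{\mathrm{cont}}^{(n)} \;=\; \frac{\sum_{c}\,\dot{m}_{c}^{(n)}}{\max_{1\le m\le 5}\,\bigl\lvert \sum_{c}\,\dot{m}_{c}^{(m)}\bigr\rvert},$$

where the superscript $(n)$ denotes the iteration index.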
6.3.3 Mesh Settings.
A C-shaped grid is constructed around the airfoil. A grid independence study was conducted to determine the grid and domain sizing in accordance with the approach in [38]. The results correspond to a computational domain extending 20 chord lengths in front of, above, and below the airfoil and 40 chord lengths behind it, as presented in Fig. 7. Furthermore, a maximum wall-normal expansion ratio of 1.1 is imposed on 120 layers surrounding the airfoil to obtain a $y^{+}$ of at most 1 near the stagnation point and of 0.35 on average, and 200 nodes are placed on the chord.
6.3.4 Boundary Conditions.
Surrounding the computational domain a pressure far field boundary condition is imposed with a Mach number determined by the ERGO-II scheme. The values of the turbulent intensity and the turbulent viscosity ratio are respectively set to and . The wall is modeled as a no-slip boundary.
6.3.5 Computational Solver.
All calculations are performed using the CFD code ansys® fluent® 23.1.0. The computational effort of a single simulation using five cores of an Intel Xeon Gold 6240 2.6 GHz CPU amounted to roughly 1 min.
6.4 Results and Discussion.
The resulting Pareto fronts of the design optimization-under-uncertainty as presented by ERGO and ERGO-II are visualized in Fig. 8. In light and dark gray, the feasible designs of the DoE and of the infills, respectively, are presented. In black, the Pareto fronts as predicted by the surrogates of the output after the final iteration are visualized.
Inspection shows that the Pareto front obtained by ERGO-II extends deeper into the bottom left than the one obtained by ERGO, in line with the conclusions drawn in the previous section. Additionally, the number of feasible infills is larger when using ERGO-II, as can be seen from the larger number of dark gray dots in the bottom plot. This can again be attributed to the effect of the crash constraint treatment, as pinpointed in the introduction and elaborated in Sec. 4.3: using the MAR approach might lead to distorted surrogates if the prediction is excessive or might mislead the optimizer if the MAR approach predicts the infeasible region to be the optimal one. This behavior is avoided when using the GP classifier approach as presented in this work.
Another observation that can be made is the fact that the default prediction is different in the top and bottom plots. This can be attributed to the fact that the uncertainty propagation relies on the GPI. Theoretically, if the output function displays the smoothness characteristics enforced by the GPI and UP approach, the GP-assisted UP will converge to the exact solution. However, this is prohibited by the infills added using the MAR approach.
The two designs on the two Pareto fronts are selected for closer inspection. Their corresponding CST-parameterization values are given in Table 1 and visualized, along with their characteristics for the considered Ma and Re as obtained using the CFD modeling approach, in Fig. 9. A first observation is that ERGO’s design 2 (in yellow in the top plots) has a smaller area than the default design (in blue), even though this constraint was enforced (Eq. (24)). This is once more a consequence of the MAR approach, which has added this infeasible design to the design set with a prediction that outperformed the other evaluated designs. A possible approach to avoid this is to drive up the contribution of the model uncertainty in the MAR prediction. However, experience has shown that this destabilizes the surrogate training set, as noted above.
On the other hand, the designs presented by ERGO-II meet the area constraint. It can be noted that the optimal designs have a more rounded pressure side near the leading edge and a more rounded suction side near the trailing edge. Inspection of the characteristics indicates a higher stall angle for the optimal designs, but a lower lift-curve gradient in the linear region in comparison with the baseline design. Second, lower drag and moment coefficients are observed.
Regarding the computational cost, ERGO and ERGO-II are built on the EGO algorithm, a form of BO as discussed in the introduction, which is known to be a data-efficient method and is therefore used specifically for problems where the computational cost of a function evaluation is far more substantial than the cost of training the GPs and finding the next infill. Both methods required roughly the same computational budget, corresponding to 250 function calls, of which 150 succeeded. This added up to approximately 4–5 h for each optimization. If an alternative surrogate-free approach had been attempted, the computational cost of the uncertainty quantification would have prohibited solving the problem due to the cost of the CFD calls; for example, performing a single UP of a design by means of a sampling approach, a Taylor-based method, or a spectral method such as polynomial chaos would amount to the same computational cost as the whole ERGO-II optimization.
The successful outcome of the validation case using ERGO-II proves the effectiveness of the improved scheme.
7 Conclusion and Outlook
In this paper, improvements upon ERGO, a surrogate-assisted optimization-under-uncertainty methodology for robust design optimization, were presented. This was realized in four steps:
The hypervolume expected improvement was introduced in the framework as an alternative to the Euclidean EI. The Euclidean EI exhibited sensitivity to variations in objective magnitudes and was restricted to only two objectives. By incorporating the hypervolume EI, these limitations were mitigated, enabling more robust and versatile design selection based on the acquisition function.
A more advanced approach for assessing predictive variance was introduced. The initial implementation relied on a FOSM technique to estimate the predictive uncertainty of the stochastic objective measures. However, this lower-order approach led to the underemphasis of certain regions in the objective space. To address this drawback, the adoption of a SOSM approach was proposed, enabling a more accurate assessment of the predictive variance within the acquisition function.
A classifier to model the feasible space was introduced. In the original ERGO scheme, the consideration of objective function failures was neglected. To address this issue, the utilization of a Gaussian process classifier was proposed, which, when combined with the acquisition function, minimizes the disruption to the convergence of the optimization process.
The treatment of stochastic parameters was examined. The original ERGO scheme had a limitation in that it only allowed for stochastic design variables. Consequently, the direct assessment of the impact of parameter variability on the objective was not possible. In order to address this limitation, a two-step acquisition function optimization approach was undertaken to specifically handle stochastic parameters.
The scheme was compared with the original ERGO approach using one of the ZDT test functions for varying dimensionality and validated on the second ADODG problem, the drag minimization of an RAE2822 airfoil in viscous transonic flow while being subjected to stochasticity in both the operating conditions and the design variables. The outcome illustrated the success of the scheme in obtaining a robust solution within an affordable computational budget.
Future research tracks are currently focused on two aspects of the proposed ERGO-II scheme. First, alternative approaches to the uncertainty propagation routine are explored. For highly non-linear objective spaces, the second-order approximation might fall short. Therefore, exact propagation methods are explored for specific combinations of covariance functions and input distributions. A second research track is focused on dealing with the limitations inherent to Bayesian optimization, namely the assumption of a smooth output space and a limited number of variables. Some recent advances in the machine learning community, namely manifold GPs, are currently being explored.
Acknowledgment
The author gratefully acknowledges the funding by the Research Foundation-Flanders (FWO), Belgium, through the Junior Post-Doctoral fellowship of Jolan Wauters. The computational resources (Stevin Supercomputer Infrastructure) and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by Ghent University, Belgium, FWO and the Flemish Government—department EWI, Belgium.
Conflict of Interest
There are no conflicts of interest.
Data Availability Statement
The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.
Nomenclature
Greek Symbols
- = angle of attack
- = GP predictive mean coefficients
- = polynomial coefficients
- = covariance function hyperparameters
- = hypervolume cell
- = Matérn smoothness parameter
- = process variance
- = input space covariance matrix
- = Gaussian process covariance matrix
- = chord length
- = dimensionality
- = Bernstein coefficients
- = vector of objective function evaluations
- = model matrix
- = Pareto front
- = training set
- = deterministic/mean design vector
- = stochastic design vector
- = Gaussian process of the objective
- = design set
- = design space
- = drag coefficient
- = lift coefficient
- = moment coefficient
- Ma = Mach number
- Re = Reynolds number
Subscripts
Superscripts
Operators
- = activation function
- = standard normal probability density function
- = standard normal cumulative distribution function
- = covariance function
- = expected value operator
- = predictive mean
- = trend vector function
- = hypervolume measure
- = Bessel function
- = likelihood function
- = normal distribution
- = predictive variance
- = variance operator
- = objective vector function
- = Gaussian process
Abbreviations
- ADODG = aerodynamic design optimization discussion group
- BO = Bayesian optimization
- DoE = design of experiments
- E(R)GO = efficient (robust) global optimization
- ((S)R)EI = ((safe) robust) expected improvement
- EP = expectation propagation
- (F/S)OFM = (first/second)-order first moment
- (F/S)OSM = (first/second)-order second moment
- GA = genetic algorithm
- GPI/C = Gaussian process interpolator/classifier
- KL = Kullback–Leibler
- LHS = Latin hypercube sampling
- MAR = missing-at-random
- MC(I/S) = Monte Carlo (integration/sampling)
- MLE = maximum likelihood estimation
- NSGA-II = non-dominated sorting genetic algorithm II
- OUU = optimization-under-uncertainty
- PDF = probability density function
- RANS = Reynolds-averaged Navier–Stokes
- RDO = robust design optimization
- SA = surrogate-assisted
- (S/M)OO = (single/multi) objective optimization
- SQP = sequential quadratic programming
- U(Q/P) = uncertainty (quantification/propagation)
Appendix: Bayesian Optimization
This appendix presents a concise introduction to GPI and BO, extended with a number of metrics essential to the ERGO framework, making use of the notation introduced in Sec. 2. For more information on GPI and BO, the reader is referred to Refs. [8,27].
The GPI is the surrogate model commonly used in BO. From a function-space perspective, a Gaussian process can be defined as a distribution over functions, where the values of the function at any set of points follow a joint Gaussian distribution. A Gaussian process is fully characterized by its second-order statistics: its mean function and its covariance function, the latter fully described by its hyperparameters. Different interpolators can be obtained based on the formulation of the mean function. When a constant value is used, the method is known as “simple Kriging.” Moreover, if a multivariate polynomial is employed, the interpolator is referred to as “universal Kriging.”
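For reference, the universal Kriging predictive mean and the commonly used (trend-uncertainty-neglecting) predictive variance take the familiar textbook form

$$\hat{y}(\mathbf{x}_{*}) \;=\; \mathbf{f}(\mathbf{x}_{*})^{\top}\hat{\boldsymbol{\beta}} \;+\; \mathbf{k}_{*}^{\top}\mathbf{K}^{-1}\bigl(\mathbf{y}-\mathbf{F}\hat{\boldsymbol{\beta}}\bigr), \qquad \hat{s}^{2}(\mathbf{x}_{*}) \;=\; k(\mathbf{x}_{*},\mathbf{x}_{*}) \;-\; \mathbf{k}_{*}^{\top}\mathbf{K}^{-1}\mathbf{k}_{*},$$

with $\mathbf{K}$ the covariance matrix of the training inputs, $\mathbf{k}_{*}$ the vector of covariances between the prediction point and the training inputs, $\mathbf{F}$ the model matrix of trend evaluations, $\mathbf{y}$ the vector of objective function evaluations, and $\hat{\boldsymbol{\beta}}$ the generalized least-squares estimate of the trend coefficients. The notation here is the standard textbook one and may differ from the symbols used in the body of this paper; the full universal Kriging variance contains an additional term accounting for the uncertainty in the estimated trend [8,27].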