Model calibration and validation are two activities in system model development, and both of them make use of test data. Limited testing budget creates the challenge of test resource allocation, i.e., how to optimize the number of calibration and validation tests to be conducted. Test resource allocation is conducted before any actual test is performed, and therefore needs to use synthetic data. This paper develops a test resource allocation methodology to make the system response prediction “robust” to test outcome, i.e., insensitive to the variability in test outcome; therefore, consistent system response predictions can be achieved under different test outcomes. This paper analyzes the uncertainty sources in the generation of synthetic data regarding different test conditions, and concludes that the robustness objective can be achieved if the contribution of model parameter uncertainty in the synthetic data can be maximized. Global sensitivity analysis (Sobol’ index) is used to assess this contribution, and to formulate an optimization problem to achieve the desired consistent system response prediction. A simulated annealing algorithm is applied to solve this optimization problem. The proposed method is suitable either when only model calibration tests are considered or when both calibration and validation tests are considered. Two numerical examples are provided to demonstrate the proposed approach.
Introduction
In engineering applications, it is often required to estimate the system response under untested conditions using available computational models and test data at different conditions. The computational model aims to describe the physics of the system and can be denoted as $y = G(x; \theta)$, where $y$ is the system response, $x$ is the set of model inputs, and $\theta$ is the set of model parameters. The uncertainty in an input $x_i$ can be described by a probability distribution $f_{X_i}(x_i)$. In the actual test, in some cases we can control an input at a nominal value but the control is not perfect; thus, $f_{X_i}(x_i)$ characterizes this imperfect control. In some other cases, an input such as outdoor temperature cannot be controlled but only measured; thus, $f_{X_i}(x_i)$ characterizes the natural variability in $x_i$. The model parameters $\theta$ have fixed but unknown values in all tests on the same specimen. The uncertainty regarding the values of $\theta$ is epistemic uncertainty due to lack of information, which can be reduced using test data. (In some problems, the model parameters may not be physical quantities but simply artifacts of modeling, in which case the concept of a true value may not be applicable; such cases are not considered in this paper. Also, in some problems, the model parameters could be input-dependent; this paper does not consider such cases either.)
Two important questions in system response prediction are: (1) how to quantify and reduce the uncertainty in $\theta$ and (2) how to validate the agreement of the computational model with the true physics, or quantify their difference. These two questions are resolved by model calibration and model validation, respectively. Usually, model calibration is conducted first to quantify the values of $\theta$ or reduce the uncertainty about their values, and then model validation follows. Various approaches to model calibration and validation have been studied in the literature. Consider an example of model calibration using Bayesian inference. While some researchers directly use the computational model and calibrate $\theta$, others [1] use a model discrepancy term $\delta(x)$ to correct the computational model and calibrate both $\theta$ and $\delta(x)$. Consider another example regarding the use of test data. Some researchers treat all the data as calibration data and use the calibrated model parameters in predicting the system response [2,3]; others integrate the results of model calibration and model validation (each done with different sets of data) in predicting the system response [4–6].
No matter what approaches are pursued, model calibration and validation always require test data. Due to the variability in test outcomes, two sets of test data of the same size may lead to two distinct system response predictions (after calibration and/or validation) even if the same computational model and the same framework of model calibration/validation are used. Here, “test outcome” is defined as the value of test data, i.e., the measurements of test inputs and outputs. The variability in the test outcome is due to the following reasons: (1) the input is controlled at a nominal value but the control is imperfect; (2) the input has natural variability, which means that the input cannot be controlled; and (3) there is measurement error in the input and output data.
If a single data point is used in model calibration/validation, the calibration/validation result will be affected significantly by the value of this data point. However, as more data points are used, the calibration/validation result will converge; thus, the consequent system response prediction will also converge. Thus, as the number of tests increases, the model prediction uncertainty becomes less and less sensitive to variability in the test outcomes. This raises the following questions when the test budget is limited: (1) is it possible to organize the test campaign so that the system response prediction is robust to variability in test outcomes, and (2) how many tests of each type are necessary to achieve this robustness objective? Note that in this paper, the term “test type” refers to two attributes: (1) whether the test data are for calibration or validation and (2) the physical quantity measured in the test. For example, if three quantities are measured in tests and all data are used for calibration, we have three types of tests; but if part of the data is used for calibration and the remaining data are used for validation, then we have 3 × 2 = 6 types of tests. The focus of this paper is to develop an optimization approach to answer these questions, assuming the computational model and the framework of model calibration/validation are given. The design variables of this optimization are the numbers of each type of test, denoted as $\{N_1, N_2, \ldots, N_q\}$ if $q$ types of tests are available; the objective function and constraints will be discussed later. Note that (1) this optimization needs to be solved before any actual test is conducted [4] and (2) this optimization needs to consider test outcome uncertainty, due to which the subsequent system response prediction is also uncertain.
Several approaches for test resource allocation have been studied in the literature [4,7–11], and the main difference among these approaches is the choice of the objective function. Note that model calibration aims to reduce the uncertainty in the model parameters, and thus reduce the uncertainty in the subsequent system response prediction. Thus, in the case that only model calibration is considered in system response prediction, the objective of test resource allocation optimization is generally to minimize the system response prediction uncertainty subject to a limited budget. Several quantities have been used to represent system response prediction uncertainty; the first is variance. Sankararaman et al. [4] minimized the expected variance $E_D[\mathrm{Var}(Y)]$, where $\mathrm{Var}(Y)$ is the variance of the system response prediction at given numbers of each type of test, and $E_D[\cdot]$ denotes the average over different synthetic data sets. Similarly, Vanlier et al. [8] defined the variance reduction of $Y$ via model calibration as the difference between the prediction variance using the prior distribution and that using the posterior distribution, and maximized this reduction. Entropy measures have also been used to represent system response prediction uncertainty. In Ref. [9], the authors maximized the relative entropy (Kullback–Leibler divergence) from the system response prediction using the prior distribution to the system response prediction using the posterior distribution, while in Refs. [10] and [11], the authors maximized the mutual information, i.e., the change in entropy from the prior-based prediction to the posterior-based prediction.
The previously mentioned approaches, which select only calibration tests to minimize the uncertainty in the system response prediction, are not applicable when model validation is also incorporated in the system response prediction. The reason is that model validation may indicate that the calibrated model is not exactly valid; accounting for this result increases the uncertainty in the system response prediction. Thus, the earlier optimization formulations would lead to the conclusion that model validation is not necessary. Mullins et al. [12] proposed a method considering both model calibration and model validation, in which model calibration is via Bayesian inference, and model validation is via a stochastic model reliability metric, i.e., describing model validity through a probability distribution. In this method, the objective regarding model validation tests was to minimize the spread in the family of system response predictions that results from the uncertainty in model validity: the inner term is the system response prediction mean at a given synthetic data set and a given value of model validity, the spread of this mean is computed over the distribution of model validity, and the result is averaged over the different data sets. The objective regarding model calibration tests is still to minimize the variance of the system response prediction: the prediction variance is computed for a given synthetic data set and a given value of model validity, then averaged over the distribution of model validity, and finally averaged over different synthetic data sets.
In this paper, the proposed concept of “test resource allocation for system response prediction robustness” means that the system response prediction becomes insensitive to the variability in test outcomes; thus, at the optimal value of the design variables (i.e., the number of tests of each type), different test outcomes result in consistent system response predictions. This concept and the required objective function are explained in Sec. 2. The approach is suitable both when only model calibration tests are considered and when both calibration and validation tests are considered. Note that the proposed methodology only selects the number of each type of test; it does not design the actual tests, i.e., select the inputs for the tests. Experimental design is a step subsequent to test selection; we focus only on test selection.
The constraint in the optimization of test resource allocation is generally the budget. Note that the constraint and objective are interchangeable, i.e., the optimization may have two alternative formats: (1) subject to the budget constraint, optimize the design variables (the number of each type of test) to reach the most robust system response prediction; or (2) subject to the robustness requirement on the system response prediction, find the number of each type of test that minimizes the cost. The proposed concept can be realized with either formulation.
In addition, it is important to note that the data considered in test resource allocation analysis have to be synthetic, since the analysis is done before any actual test. Actual physical test data are obtained by: (1) selecting the values of the inputs $x$; (2) applying $x$ to the physical test configuration, where the model parameters $\theta$ are at their true but unknown values; and (3) recording the input–output data, where both the input and output measurements may be subject to measurement errors. In actual tests where the values of $x$ have been decided, the test outcome uncertainty arises only from experimental variability, including measurement errors. The generation of synthetic data is a simulation of the three steps mentioned earlier, with the physical test configuration replaced by a computational model and the model parameters being unknown. Thus, two additional uncertainty sources are introduced in the synthetic data: (1) uncertainty regarding the values of $\theta$ and (2) model discrepancy, i.e., the difference between the computational model and the actual physics. In a Bayesian framework, the first one can be represented by the prior distribution of $\theta$ based on available knowledge. But no information on the model discrepancy is available before any testing.
In addition, compared to the actual test, the physical meaning of the input distribution $f_X(x)$ may change in generating the synthetic data. As explained at the beginning of Sec. 1, for an actual test, the uncertainty characterized by $f_X(x)$ is due to the following sources: (1) imperfect control over the true value, (2) natural variability of the input, and (3) measurement errors. In generating the synthetic data, $f_X(x)$ accounts for the same uncertainty sources in the case that the test conditions are known (for example, the nominal values of the inputs are known). But in the case of unknown test conditions, $f_X(x)$ mainly accounts for the uncertainty about which experimental conditions will subsequently be selected. For example, if the tester only mentions that the possible nominal value of an input is between 5 and 10, then we may use a uniform distribution over this range to represent the uncertainty in the nominal value. In this case, the uncertainty in $x$ is epistemic. The proposed method is versatile and able to handle both cases. It is possible for the decision-maker to apply the proposed method both before and after knowing the test conditions, and different answers can be obtained due to the changed availability of knowledge.
In summary, the objectives of this paper are to: (1) find the optimal number of each type of test such that different data sets result in consistent system response predictions; (2) develop solutions for both formats of the optimization problem; and (3) adapt to different cases when only model calibration tests are considered or when both calibration and validation tests are considered. The rest of this paper is organized as follows: Section 2 proposes the objective in the optimization of robust test allocation. Section 3 analyzes the uncertainty sources in the synthetic data and the use of Sobol’ indices to assess their contributions toward the uncertainty in the system response prediction. Section 4 develops a flexible approach for test resource allocation optimization. Section 5 uses two numerical examples to illustrate the proposed approach.
Global Sensitivity Analysis of Uncertainty in Synthetic Test Data
Objective of Robust Test Resource Allocation.
The objective of the proposed test resource allocation optimization can be visually represented as in Fig. 1, which shows the families of system response prediction probability density functions (PDFs) at different values of the design variables $\{N_1, \ldots, N_q\}$. Within a sub-figure, the variation between the PDFs is caused by the test outcome variability among different data sets. From Figs. 1(a)–1(c), this variation becomes smaller and the system response predictions reveal stronger consistency due to: (1) the decreased variability of the mean values across the PDFs, meaning that the centroids of the family members are closer; and (2) the decreased variability of the variance across the PDFs, meaning that the ranges of values covered by the PDFs are similar. In other words, at the optimal value of the design variables in Fig. 1(c), the effects of test outcome uncertainty on the prediction mean $\mu_Y$ and the prediction variance $\sigma_Y^2$ are small, so that consistent system response predictions can be obtained with different sets of test data. Note that this paper is only concerned with the mean value and variance of the system response prediction, not the exact shape of the PDFs in Fig. 1. Here, the “variability” of $\mu_Y$ and $\sigma_Y^2$ is captured by their variances across different data sets, i.e., $\mathrm{Var}(\mu_Y)$ and $\mathrm{Var}(\sigma_Y^2)$.
Therefore, this paper defines the objective for robust test resource allocation as: minimize the contribution of test outcome uncertainty toward the variability (i.e., variance) in the system response prediction mean $\mu_Y$ and the system response prediction variance $\sigma_Y^2$.
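To make this objective concrete, the following minimal sketch estimates $\mathrm{Var}(\mu_Y)$ and $\mathrm{Var}(\sigma_Y^2)$ across repeated synthetic data sets for several numbers of tests. The linear model, the conjugate-Gaussian calibration, and all distributions below are illustrative assumptions, not the paper's example; the point is only that the variability of the prediction statistics across data sets shrinks as the number of tests grows.

```python
import numpy as np

rng = np.random.default_rng(0)
SIG = 0.5  # assumed (known) output measurement noise standard deviation


def predict_after_calibration(x, y, n_post=2000):
    """Toy calibration + prediction: Bayesian update of a scalar theta with a
    N(0, 1) prior and model y = theta * x + noise (illustrative assumption)."""
    post_var = 1.0 / (1.0 + np.sum(x**2) / SIG**2)
    post_mu = post_var * np.sum(x * y) / SIG**2
    theta_post = rng.normal(post_mu, np.sqrt(post_var), n_post)
    x_pred = rng.uniform(1.0, 2.0, n_post)        # prediction-condition input
    y_pred = theta_post * x_pred
    return y_pred.mean(), y_pred.var()            # mu_Y, sigma_Y^2


def robustness_metrics(n_tests, n_datasets=200):
    """Variability of mu_Y and sigma_Y^2 across synthetic data sets."""
    mus, variances = [], []
    for _ in range(n_datasets):
        theta_true = rng.normal(0.0, 1.0)                   # epistemic draw per data set
        x = rng.uniform(0.0, 1.0, n_tests)                  # test inputs
        y = theta_true * x + rng.normal(0.0, SIG, n_tests)  # noisy outputs
        mu, var = predict_after_calibration(x, y)
        mus.append(mu)
        variances.append(var)
    return np.var(mus), np.var(variances)         # Var(mu_Y), Var(sigma_Y^2)


for n in (2, 5, 20):
    print(n, robustness_metrics(n))
```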
Global sensitivity analysis (GSA) using Sobol’ indices is a prominent approach [13–15] to quantify the contributions of input uncertainty toward the variance in the output. A brief introduction to Sobol’ indices is given in Sec. 2.2. One challenge is to establish a deterministic function required by the Sobol’ indices computation, in mapping the test outcome uncertainty to the system response prediction uncertainty. This challenge will be analyzed and overcome in Sec. 3.
Sobol’ Indices.
For a model $Y = g(x)$ with independent inputs $x$, the first-order Sobol' index of a subset of inputs $x_A$ is defined as $S_A = \mathrm{Var}_{x_A}[E_{x_{\sim A}}(Y \mid x_A)] / \mathrm{Var}(Y)$. $S_A$ is a combined measure of the individual contributions of the components of $x_A$ and of the interactions among them.
The total effect index of $x_A$ is defined as $S^T_A = 1 - \mathrm{Var}_{x_{\sim A}}[E_{x_A}(Y \mid x_{\sim A})] / \mathrm{Var}(Y)$, where $x_{\sim A}$ is the complementary subset of $x_A$. $S^T_A$ is a combined measure of the individual contributions of the components of $x_A$, the interactions among them, and the interactions between $x_A$ and $x_{\sim A}$.
The direct computation of Sobol' indices requires double-loop Monte Carlo simulation and is thus expensive. Taking the first-order index $S_A$ as an example, we need: (1) an inner loop to compute the mean value of $Y$ using random samples of $x_{\sim A}$ at a fixed value of $x_A$ and (2) an outer loop to compute the variance of this conditional mean by repeating the inner loop at different values of $x_A$. In addition, further Monte Carlo samples are required to compute $\mathrm{Var}(Y)$. Various algorithms have been developed in the literature to reduce the computational cost [21–25]. Any one of them can be used to compute the Sobol' indices in this paper. Several illustrative examples of computing Sobol' indices can be found in Ref. [19].
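As an illustration of this double-loop estimator, the sketch below computes a first-order index for a simple analytical function; the function, input distributions, and sample sizes are arbitrary choices for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(1)


def g(x1, x2):
    """Toy model with two independent inputs (illustrative assumption)."""
    return np.sin(x1) + 0.3 * x2**2


def first_order_index_double_loop(n_outer=500, n_inner=500):
    """Brute-force estimate of S_1 = Var_x1[ E_x2(Y | x1) ] / Var(Y)."""
    x1_outer = rng.uniform(0.0, np.pi, n_outer)
    cond_means = np.empty(n_outer)
    for i, x1 in enumerate(x1_outer):            # outer loop over x1
        x2 = rng.uniform(-1.0, 1.0, n_inner)     # inner loop over the complement
        cond_means[i] = g(x1, x2).mean()         # E_x2[Y | x1]
    # separate crude Monte Carlo run for the total variance Var(Y)
    y_all = g(rng.uniform(0.0, np.pi, 100000), rng.uniform(-1.0, 1.0, 100000))
    return cond_means.var() / y_all.var()


print(first_order_index_double_loop())
```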
The Sobol' index computation requires (1) a deterministic input–output function and (2) the representation of all the inputs by uncorrelated continuous probability distributions. These two requirements need to be satisfied before applying Sobol' indices in the proposed approach for test resource allocation. Section 3 analyzes the uncertainty sources in test outcomes and develops an approach to achieve both requirements.
Uncertainty Sources in Test Outcomes
Recall that all the data considered in test resource allocation analysis have to be synthetic since the analysis is done before any actual test. The uncertainty in the synthetic data depends on the specific test conditions, including: (1) the possible values of the inputs $x$; (2) the number of test types; and (3) whether a single test specimen or multiple specimens are used for each type of test.
Regarding the first condition, this paper assumes that a distribution of $x$ is provided by the testing personnel or assumed based on some information. For example, for a single model input $x_i$, we may have $x_i \sim U(a_i, b_i)$, where $a_i$ is the lower bound and $b_i$ is the upper bound. We can also use other types of distributions, such as a Gaussian distribution, to capture the uncertainty in $x_i$ if additional information is available.
This section analyzes the uncertainty sources in the synthetic data with respect to the second and third conditions; the deterministic function required by the Sobol' indices varies accordingly. The rest of this section starts from the simplest case of one type of test and a single specimen, and subsequently extends it to multiple types of tests and multiple test specimens.
Single Type of Test and Single Test Specimen.
If only one type of test is available and all tests are conducted on a single specimen, the actual test data are a set of data points obtained from the same specimen. Figure 2 shows the generation and usage of the synthetic data in this case. As shown in the left part of Fig. 2, to generate a data set of $N$ synthetic data points, four steps should be followed: (1) select and fix the values of $\theta = \{\theta_1, \ldots, \theta_m\}$, where $m$ is the dimension of the model parameters; (2) generate $N$ samples of the model inputs $x = \{x_1, \ldots, x_n\}$, where $n$ is the dimension of the model inputs; (3) propagate $x$ and $\theta$ through the computational model $G(x; \theta)$; and (4) record the model inputs and outputs with measurement errors added. The resultant data set contains $N$ pairwise data points $(x_i + \epsilon_{x_i},\, G(x_i; \theta) + \epsilon_{y_i})$ for $i = 1, \ldots, N$, where $\epsilon_x$ is the model input measurement error and $\epsilon_y$ is the model output measurement error. If the model input measurement error is ignored, the recorded input is simply $x_i$.
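These four steps can be sketched as follows; the model $G(x;\theta)$, the prior, the input distribution, and the error standard deviations below are placeholders chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)


def G(x, theta):
    """Placeholder computational model G(x; theta) (illustrative assumption)."""
    return theta[0] * x + theta[1] * x**2


def generate_synthetic_dataset(n_points, sig_eps_x=0.02, sig_eps_y=0.05):
    """Single specimen, single test type:
    (1) draw theta once from its prior and fix it for the whole data set,
    (2) draw n_points input values from their distribution,
    (3) run the computational model,
    (4) add input/output measurement errors and record the pairs."""
    theta = rng.normal([1.0, 0.5], [0.2, 0.1])          # step 1: fixed within the data set
    x = rng.uniform(0.0, 1.0, n_points)                 # step 2: test inputs
    y = G(x, theta)                                     # step 3: model outputs
    x_meas = x + rng.normal(0.0, sig_eps_x, n_points)   # step 4: measurement errors
    y_meas = y + rng.normal(0.0, sig_eps_y, n_points)
    return np.column_stack([x_meas, y_meas]), theta


data, theta_used = generate_synthetic_dataset(5)
print(data)
```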
A crucial point in the generation of synthetic data concerns the model parameters $\theta$. For a single specimen, $\theta$ have true but unknown values, meaning that the uncertainty in $\theta$ is epistemic. Thus, the uncertainty caused by $\theta$ is the uncertainty in selecting the values of $\theta$ before generating a synthetic data set; once selected, the values of $\theta$ are fixed within the synthetic data set. This uncertainty in $\theta$ only exists in the synthetic data; actual tests fix $\theta$ at the true values.
The four steps mentioned earlier indicate three uncertainty sources in generating a pairwise synthetic data point:
- (1)
Uncertainty regarding the values of the model parameters $\theta$, which can be represented by their prior distribution based on available knowledge before conducting any physical test. This uncertainty is epistemic since $\theta$ have unknown but fixed true values.
- (2)
Uncertainty regarding the possible values of the inputs $x$ to be used in the tests. As mentioned earlier, a distribution of $x$ has been provided or assumed. This uncertainty is also epistemic if the values of $x$ are unknown during test selection analysis, but the values will be decided by the test personnel in actual tests.
- (3)
Uncertainty regarding the input measurement error $\epsilon_x$ and the output measurement error $\epsilon_y$. Usually, measurement error is assumed to have a zero-mean Gaussian distribution; thus, $\epsilon_x \sim N(0, \sigma_{\epsilon_x}^2)$ and $\epsilon_y \sim N(0, \sigma_{\epsilon_y}^2)$. The uncertainty in $\epsilon_x$ and $\epsilon_y$ is aleatory if the values of $\sigma_{\epsilon_x}$ and $\sigma_{\epsilon_y}$ are known; additional epistemic uncertainty regarding $\epsilon_x$ and $\epsilon_y$ is introduced if these values are unknown.
These uncertainty sources lead to the deterministic function required for the Sobol' index computation, shown in Eq. (6): $\{\mu_Y, \sigma_Y^2\} = F(\theta, v_1, v_2, \ldots, v_N)$, where $v_i = \{x_i, \epsilon_{x_i}, \epsilon_{y_i}\}$ for $i = 1, \ldots, N$ represents the uncertainty sources in generating a single pairwise data point, $N$ is the number of pairwise data points, and $F(\cdot)$ represents the entire process shown in Fig. 2, including both synthetic data generation and the model calibration/validation analyses before predicting the system response.
In Eq. (6), the uncertainty in $\{v_1, \ldots, v_N\}$ represents the variability in the actual test outcomes, while the epistemic uncertainty in $\theta$ only exists in the synthetic data, not in actual test data. To minimize the sensitivity of the system response prediction to the variability in the test outcomes, we would need to minimize the sensitivity index of $\{v_1, \ldots, v_N\}$ in Eq. (6), so that $\mu_Y$ and $\sigma_Y^2$ are insensitive to the variability in test outcomes and consistent system response prediction distributions can be achieved under different actual test outcomes. However, this minimization drives the sensitivity index toward zero, and numerical accuracy is always a challenge for small sensitivity indices.
Instead, this paper chooses to maximize the sensitivity index of $\theta$. If that is achieved, the epistemic uncertainty in $\theta$ becomes the dominant contributor to the uncertainty in the system response prediction mean $\mu_Y$ and the system response prediction variance $\sigma_Y^2$ (based on synthetic data). In the system response prediction using actual test data, where $\theta$ are fixed at their true values, this dominant uncertainty contribution to $\mu_Y$ and $\sigma_Y^2$ is removed. Therefore, the uncertainty in $\mu_Y$ and $\sigma_Y^2$ caused by test outcome uncertainty will be reduced significantly, and consistent system response prediction distributions can be achieved under different actual test outcomes. In sum, the basic idea of the proposed approach is to maximize the contribution of the epistemic uncertainty regarding the model parameters in the synthetic data.
Note that the proposed approach guarantees consistent system response predictions regardless of the true values of $\theta$, since the Sobol' index is a global sensitivity measure and considers the entire distribution of $\theta$.
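The sketch below illustrates this idea on a toy problem (the same conjugate-Gaussian calibration assumption used earlier, not the paper's model): it estimates the total-effect index of $\theta$ in the map from $(\theta, v_1, \ldots, v_N)$ to $\mu_Y$ for several numbers of tests, and the index grows toward one as more tests are added, i.e., the test-outcome variables matter less and less.

```python
import numpy as np

rng = np.random.default_rng(3)
SIG = 0.5  # assumed (known) output measurement noise standard deviation


def prediction_mean(theta, x, eps):
    """Deterministic map F: (theta, test inputs, noise draws) -> mu_Y,
    using a toy conjugate-Gaussian calibration (illustrative assumption)."""
    y = theta * x + eps                               # synthetic observations
    post_var = 1.0 / (1.0 + np.sum(x**2) / SIG**2)
    post_mu = post_var * np.sum(x * y) / SIG**2
    return 1.5 * post_mu                              # predicted mean at x = 1.5


def total_index_of_theta(n_tests, n=5000):
    """Jansen pick-freeze estimate of the total-effect index of theta in F."""
    theta_a = rng.normal(0.0, 1.0, n)
    theta_b = rng.normal(0.0, 1.0, n)                 # theta resampled ("pick-freeze")
    x = rng.uniform(0.0, 1.0, (n, n_tests))           # test-outcome variables kept fixed
    eps = rng.normal(0.0, SIG, (n, n_tests))
    f_a = np.array([prediction_mean(t, xi, ei) for t, xi, ei in zip(theta_a, x, eps)])
    f_ab = np.array([prediction_mean(t, xi, ei) for t, xi, ei in zip(theta_b, x, eps)])
    return 0.5 * np.mean((f_a - f_ab) ** 2) / np.var(f_a)


for n_tests in (1, 5, 20):
    print(n_tests, round(total_index_of_theta(n_tests), 3))
```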
Single Type of Test and Multiple Test Specimens.
For a single type of test, multiple test specimens are required if the test is destructive, so that each specimen can be used only once. Two examples of destructive tests are the fatigue test and the tensile strength test. The true value of a model parameter $\theta_j$ is fixed for a single specimen, but varies across different specimens. This variability of $\theta_j$ may be represented by a probability distribution $f(\theta_j \mid P_j)$, where $P_j$ are the distribution parameters of $\theta_j$. For example, if $\theta_j$ has a Gaussian distribution, then $P_j$ consists of its mean value and standard deviation. The entire set of distribution parameters for all components of $\theta$ is denoted as $P = \{P_1, \ldots, P_m\}$. In this case, $P$ have unknown true values; thus, the uncertainty in $P$ is epistemic, and it can be represented by a prior distribution based on available knowledge. Thus, model calibration aims to quantify the uncertainty in $P$, instead of $\theta$. (Note that $\theta$ have both aleatory and epistemic uncertainty, whereas the uncertainty in $P$ is epistemic.)
In the case of a single type of test and multiple test specimens, the steps in the generation and usage of a synthetic data set of $N$ data points are similar to those in Fig. 2, but the box “model parameters $\theta$” should be replaced by “$\theta^{(i)}$ for $i = 1, \ldots, N$,” where $\theta^{(i)}$ is the value of $\theta$ generated for the $i$th specimen (i.e., the $i$th test). Compared to Fig. 2, the values of $P$ are now selected before generating a synthetic data set; once selected, the values of $P$ are fixed within the synthetic data set. The values of the model parameters for each of the $N$ specimens are generated from the conditional distribution $f(\theta \mid P)$.
It seems natural to replace $\theta$ in Eq. (6) with $P$ and build new functions for the Sobol' index computation. However, the new functions would not be deterministic functions as required by the Sobol' indices. A specific realization of $P$ does not determine the values of $\theta^{(i)}$ but only their distribution $f(\theta \mid P)$; thus, $\theta^{(i)}$ are still stochastic at a given $P$. Only deterministic values of $\theta^{(i)}$ and $v_i$ ($i = 1, \ldots, N$) can determine the subsequent system response prediction distribution and its mean value $\mu_Y$ and variance $\sigma_Y^2$. In sum, an approach to establish a deterministic relationship from $P$ to $\theta^{(i)}$ is needed.
This deterministic relationship is established through an auxiliary variable $u$, as shown in Eq. (7): $\theta = F_\theta^{-1}(u \mid P)$, where $F_\theta^{-1}(\cdot \mid P)$ is the inverse CDF of $\theta$ at a given $P$. Note that $u$ has the standard uniform distribution $U(0, 1)$. Equation (7) indicates three steps: (1) generate the values of $P$ from their prior distribution to produce the conditional distribution $f(\theta \mid P)$; (2) generate the value of $u$ from $U(0, 1)$; and (3) substitute $u$ into the inverse CDF to obtain a unique value of $\theta$.
The uncertainty in a model parameter $\theta_j$ consists of two components: (1) the epistemic uncertainty in the distribution parameters $P_j$, represented by their prior distribution; and (2) the aleatory uncertainty in $\theta_j$ at a given $P_j$, represented by the conditional distribution $f(\theta_j \mid P_j)$. These two parts are coupled since $\theta_j$ depends on the value of $P_j$. The introduced auxiliary variable $u$ captures the aleatory uncertainty, and also helps to decouple the aleatory and epistemic uncertainties [26] since the distribution of $u$ does not depend on $P$.
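A minimal sketch of this auxiliary-variable construction is given below, assuming for illustration that a scalar $\theta$ is Gaussian across specimens with unknown mean and standard deviation $P = (\mu, \sigma)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)


def theta_from_auxiliary(P, u):
    """Deterministic map theta = F^{-1}(u | P): given distribution parameters P
    and auxiliary variables u ~ U(0, 1), return unique theta values."""
    mu, sd = P
    return stats.norm.ppf(u, loc=mu, scale=sd)   # inverse CDF of theta

# Step 1: draw P from its prior (epistemic uncertainty; the prior is an assumption)
P = (rng.normal(10.0, 1.0), abs(rng.normal(0.5, 0.1)))
# Step 2: one auxiliary variable per specimen (aleatory uncertainty)
u = rng.uniform(size=4)
# Step 3: deterministic theta values for the four specimens
print(theta_from_auxiliary(P, u))
```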
As explained earlier, the basic idea of the proposed approach in the case of a single specimen is to maximize the contribution of the epistemic uncertainty of $\theta$ in the synthetic data. In the case of multiple specimens, we need the contribution of $P$ to be dominant in the context of Eq. (8). If that is achieved, in the system response prediction using actual test data, where $P$ are fixed at their true values, the most dominant uncertainty contribution to $\mu_Y$ and $\sigma_Y^2$ will be removed. Therefore, the uncertainty in $\mu_Y$ and $\sigma_Y^2$ caused by test outcome uncertainty will be reduced significantly, and different actual test outcomes will lead to consistent system response predictions.
Multiple Types of Tests and Single Test Specimen.
In the case that different types of tests are to be considered and each type uses only one specimen (nondestructive tests), Fig. 2 expands to Fig. 3, and Eq. (6) expands to Eq. (9): $\{\mu_Y, \sigma_Y^2\} = F(\theta, v_1^1, \ldots, v_{N_1}^1, \ldots, v_1^q, \ldots, v_{N_q}^q)$.
Equation (9) gives the required deterministic functions for the Sobol' index computation. In Eq. (9), $v_j^i = \{x_j^i, \epsilon_{x_j}^i, \epsilon_{y_j}^i\}$ represents the uncertainty regarding the inputs and measurement errors in generating the synthetic data for the $i$th type of test, where $i = 1, \ldots, q$; $j = 1, \ldots, N_i$ is the test number and $N_i$ is the total number of tests of the $i$th type. Note that $\theta$ here is the vector of the model parameters in all types of tests, and test type refers to calibration test versus validation test and the output quantities measured, as explained in Sec. 1.
Similar to the earlier discussion, in the test resource allocation optimization regarding Eq. (9), we need the contribution of the epistemic uncertainty in $\theta$ toward the uncertainty in $\mu_Y$ and $\sigma_Y^2$ to be dominant. For the case of multiple types of tests and a single test specimen, an example with a framework considering only model calibration is given in Sec. 5.1; another example with a framework incorporating both model calibration and model validation is given in Sec. 5.2.
Multiple Types of Tests and Multiple Test Specimens.
Similarly, in the test resource allocation optimization regarding Eq. (10), we need the contribution of the epistemic uncertainty in $P$ toward the uncertainty in $\mu_Y$ and $\sigma_Y^2$ to be dominant.
Selection of Sobol’ Indices.
Thus far, deterministic functions for the Sobol' index computation under different test conditions have been established. Robust test resource allocation can be achieved by maximizing the contribution of the epistemic uncertainty regarding either $\theta$ (single specimen) or $P$ (multiple specimens). This epistemic uncertainty is represented by a set of random variables ($\theta$ in Eqs. (6) and (9); $P$ in Eqs. (8) and (10)). The total effect sensitivity index considers the interactions between a subset of random variables and its complement; thus, to be more comprehensive, the optimization in this paper uses the total effect index for the subset representing the epistemic uncertainty (either $\theta$ or $P$). In the rest of the paper, “Sobol' index” refers to this total effect index. The computed Sobol' indices are denoted as $S_\mu$ for $\mu_Y$ and $S_{\sigma^2}$ for $\sigma_Y^2$. In the case of a single specimen, $S_\mu$ and $S_{\sigma^2}$ are the Sobol' indices of $\theta$; in the case of multiple specimens, $S_\mu$ and $S_{\sigma^2}$ are the Sobol' indices of $P$.
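A sketch of estimating the total effect index of a subset of inputs with a pick-freeze (Jansen-type) estimator is shown below; the toy model, input distributions, and sample size are arbitrary choices, and any of the efficient estimators cited in Sec. 2.2 could be substituted.

```python
import numpy as np

rng = np.random.default_rng(5)


def total_effect_index(func, sampler, subset, n=20000):
    """Jansen pick-freeze estimator of the total effect Sobol' index of a
    subset of inputs. `sampler(n)` returns an (n, d) array of independent
    input samples; `subset` lists the column indices of the group."""
    A, B = sampler(n), sampler(n)
    AB = A.copy()
    AB[:, subset] = B[:, subset]          # resample only the subset of interest
    yA, yAB, yB = func(A), func(AB), func(B)
    var_y = np.var(np.concatenate([yA, yB]))
    return 0.5 * np.mean((yA - yAB) ** 2) / var_y


def toy_model(X):
    """Illustrative response: dominated by the first two inputs."""
    return X[:, 0] * X[:, 1] + 0.1 * X[:, 2]


sampler = lambda n: rng.uniform(-1.0, 1.0, size=(n, 3))
print(total_effect_index(toy_model, sampler, subset=[0, 1]))
```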
Optimum Test Resource Allocation
Formulation.
The first format of the optimization maximizes the robustness objective subject to the budget constraint, as shown in Eq. (11): maximize $w_1 S_\mu + w_2 S_{\sigma^2}$ over $\{N_1, \ldots, N_q\}$ subject to $\sum_{i=1}^{q} c_i N_i \le C_0$, where $c_i$ is the unit cost of the $i$th type of test and $N_i$ is the number of tests of the $i$th type, $C_0$ is the budget constraint, and $w_1$ and $w_2$ are user-defined positive constant weight coefficients.
The second format minimizes the testing cost subject to robustness constraints, as shown in Eq. (12): minimize $\sum_{i=1}^{q} c_i N_i$ over $\{N_1, \ldots, N_q\}$ subject to $S_\mu \ge s_\mu$ and $S_{\sigma^2} \ge s_{\sigma^2}$, where $s_\mu$ and $s_{\sigma^2}$ are the desired lower bounds of the Sobol' indices for $\mu_Y$ and $\sigma_Y^2$, respectively.
Equations (11) and (12) are both integer optimization problems since the decision variables $N_i$ are integers. Sometimes, integer optimization is solved using a relaxation approach [28], where the integer constraint is first relaxed and the integers nearest to the resultant optimal solution are used as the solution of the original (unrelaxed) problem. Unfortunately, this approach is not applicable here because the synthetic data to be used in model calibration/validation can be generated only if the $N_i$ are integers. It is not possible to generate test data for a noninteger number of tests.
Solution Algorithm.
A simulated annealing algorithm [29] is used for the solution of Eqs. (11) and (12) because it can handle stochastic discrete optimization problems without requiring relaxation. For discrete optimization problems such as Eqs. (11) and (12), this algorithm aims to minimize an objective function $h(N)$, where $N$ is a vector of integers and its feasible region is $\Omega$. If the objective is to maximize $h(N)$, as in Eq. (11), $-h(N)$ is minimized.
As shown in Fig. 4, the simulated annealing algorithm starts from an initial value $N^{(0)}$. If $N^*$ is the currently accepted solution in an iteration, a new candidate $N'$ is randomly selected within the neighborhood of $N^*$. This neighborhood can be defined by different proposal density functions; this paper defines it as the set of integer vectors whose $i$th component differs from $N_i^*$ by at most a user-defined positive integer, for $i = 1, \ldots, q$. In one iteration, if $h(N') \le h(N^*)$, the new value is accepted as the new solution; otherwise, $N'$ is accepted with a probability governed by the objective increase $h(N') - h(N^*)$ and the current annealing temperature $T$.
Here, $T_0$ is the user-defined starting value of the temperature $T$, $k$ is the current iteration number, $K$ is the total number of iterations allowed, and a user-defined exponent determines the rate of decrease of $T$ over the iterations. The iterations proceed until the total allowed number of iterations is expended.
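The sketch below implements this annealing loop for integer test counts under a budget constraint. The cooling schedule $T = T_0(1 - k/K)^\beta$, the neighborhood size, the Metropolis acceptance rule, and the smooth surrogate objective are illustrative assumptions, not the exact settings of the paper; in practice $h(N)$ would evaluate the negative of the Sobol' index objective built on the functions of Sec. 3.

```python
import numpy as np

rng = np.random.default_rng(6)


def simulated_annealing(h, n0, costs, budget, step=2, n_iter=500, T0=1.0, beta=2.0):
    """Minimize h(N) over integer test counts N subject to cost(N) <= budget."""
    n_cur = np.array(n0, dtype=int)
    h_cur = h(n_cur)
    n_best, h_best = n_cur.copy(), h_cur
    for k in range(n_iter):
        T = T0 * (1.0 - k / n_iter) ** beta                 # assumed cooling schedule
        cand = n_cur + rng.integers(-step, step + 1, size=n_cur.size)
        cand = np.maximum(cand, 0)                          # nonnegative test counts
        if np.dot(costs, cand) > budget:                    # enforce the budget
            continue
        h_cand = h(cand)
        accept = (h_cand <= h_cur) or (rng.random() < np.exp(-(h_cand - h_cur) / max(T, 1e-12)))
        if accept:
            n_cur, h_cur = cand, h_cand
            if h_cur < h_best:
                n_best, h_best = n_cur.copy(), h_cur
    return n_best, h_best


# Surrogate objective standing in for -(S_mu + S_sigma2): a placeholder, not the real index
h = lambda N: -((1 - np.exp(-0.2 * N[0])) + (1 - np.exp(-0.1 * N[1])))
print(simulated_annealing(h, n0=[1, 1], costs=np.array([4, 1]), budget=40))
```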
Summary.
This section proposed formulations for test resource allocation optimization, considering two formats: (1) maximizing the Sobol' index of the epistemic uncertainty in $\theta$ or $P$ subject to the budget constraint and (2) minimizing the cost subject to Sobol' index thresholds. Both formats are applicable to the cases of single or multiple specimens and single or multiple types of tests. As a result, the system response predictions become insensitive to the variability in test outcomes. The decision variables (numbers of tests) are discrete variables, and a simulated annealing algorithm is used to solve this discrete optimization. In this optimization, the Sobol' index of the epistemic uncertainty in $\theta$ or $P$ is computed by the method discussed in Sec. 2.
Numerical Examples
This section uses two examples to illustrate the proposed method. The first example is a mathematical problem and the second example is a structural dynamics problem. Regarding the types of tests, specimen, and calibration/validation, the first example considers: (1) multiple types of tests, (2) model calibration only, and (3) both the cases of single and multiple specimens. The second example considers: (1) multiple types of tests, (2) both model calibration and validation, and (3) single specimen only.
Mathematical Example.
The inputs $x_1$ and $x_2$ are assumed to be independent random variables; the uncertainty regarding their values in tests is represented by uniform distributions based on ranges obtained from the test personnel.
Two types of tests are available. Test type I measures one output quantity of the model with measurement error, and test type II measures a second output quantity with measurement error. The resultant synthetic data are pairwise input–output data for each test type. Assume that the unit cost of a type I test is 4 and the unit cost of a type II test is 1.
Two cases are considered in this example: a single test specimen versus multiple test specimens. In case 1 (single specimen), the model parameters have true but unknown values to be calibrated. In case 2 (multiple specimens), the model parameters follow normal distributions across specimens, and the quantities to be calibrated are the distribution parameters (the means and standard deviations) of those normal distributions.
The process to realize the system response prediction, i.e., the framework of model calibration/validation with the synthetic data, is shown in Fig. 5, where the posterior distributions of the calibration parameters, together with the known distributions of the inputs, are propagated through the computational model in Eq. (15) to obtain the distribution of the system response. Note that model validation is not considered in this example; only calibration is considered. The proposed test resource allocation approach can also handle model validation, as shown in the next numerical example.
Case 1: Single Test Specimen.
Optimization formulation 1.
where $N_1$ is the number of type I tests and $N_2$ is the number of type II tests; $N_1$ and $N_2$ are the decision variables, i.e., we need to decide the number of replications of each type of test.
The simulated annealing algorithm is used to solve Eq. (16), and Fig. 6 records the optimization process. Figure 6(a) shows the path from the initial design point to the optimal solution. Figure 6(b) shows that only some of the random walks are accepted; the maximized Sobol' index sum is 1.89. The feasible region in Fig. 6(a) covers the combinations of $N_1$ and $N_2$ that satisfy the constraint in Eq. (16). Note that (1) this feasible region is obtained by extra computation and (2) it is shown only to help visualize the result and is not needed in the optimization.

Fig. 6: Optimization of the mathematical example based on Eq. (16): (a) history of accepted random walks and (b) history of the Sobol' indices sum
As discussed in Sec. 3.1, since the robustness objective is maximized, the optimal solution of Eq. (16) should lead to consistent system response predictions regardless of the true values of the model parameters. Three steps are pursued to verify this: (1) assume “true” values of the model parameters; (2) generate multiple sets of data of the size given by the optimal solution, based on the assumed values from step 1; and (3) plot the family of system response prediction PDFs using the data sets in step 2 and observe whether they are consistent. Although the data are still synthetic, this is a simulation of the system response prediction using actual test data, since the model parameters are fixed at the same values across different data sets; in contrast, in the synthetic data generation for test resource allocation shown in Fig. 2, the model parameters are fixed within a single data set but vary across different data sets. The results of this verification are shown in Fig. 7. Figure 7(a) indicates that the optimal solution leads to consistent system response predictions for the first assumed set of true parameter values; similarly, Figs. 7(b) and 7(c) show that consistent system response predictions are also obtained for the other two assumed sets.

Fig. 7: Family of system response prediction PDFs at the optimal solution of Eq. (16), for three assumed sets of true parameter values (a)–(c)
As a comparison, Fig. 8 shows the same results as Fig. 7 but at a suboptimal solution. This suboptimal solution spends the same cost as the optimal solution, but the enlarged variation across the PDFs in Fig. 8 indicates that the suboptimal solution cannot guarantee predictions as consistent as those of the optimal solution. To quantify this conclusion, Table 1 compares the “variance of the variance of the prediction” at the optimal and suboptimal solutions. The table clearly shows that the optimal solution always has smaller values of this variance at the different assumed true parameter values, which confirms that the optimal solution gives more consistent predictions.
Optimization formulation 2.
The simulated annealing algorithm is used to solve Eq. (17), and Fig. 9 records the optimization process. Figure 9(a) shows the path from the initial design point to the optimal solution. Figure 9(b) shows that only some of the random walks are accepted; the minimized cost is 19. The feasible region in Fig. 9(a) covers the combinations of $N_1$ and $N_2$ that satisfy both Sobol' index constraints in Eq. (17). Similar to Fig. 6, note that (1) this feasible region is obtained by extra computation and (2) it is shown only to help visualize the result and is not needed in the optimization.

Fig. 9: Optimization of the mathematical example based on Eq. (17): (a) history of accepted random walks and (b) history of cost
As discussed in Sec. 3.1, since the robustness constraints are satisfied, the optimal solution of Eq. (17) should lead to consistent system response predictions regardless of the true values of the model parameters. The same three steps used for Fig. 7 are pursued to verify this. The results are shown in Fig. 10. Figure 10(a) indicates that the optimal solution leads to consistent system response predictions for the first assumed set of true parameter values; similarly, Figs. 10(b) and 10(c) show that consistent system response predictions are also obtained for the other two assumed sets.

Fig. 10: Family of system response prediction PDFs at the optimal solution of Eq. (17), for three assumed sets of true parameter values (a)–(c)
Case 2: Multiple Test Specimens.
Optimization formulation 1.
The simulated annealing algorithm is used to solve Eq. (18), and Fig. 11 records the optimization process. Figure 11(a) shows the path from the initial design point to the optimal solution. Figure 11(b) shows that only some of the random walks are accepted; the maximized Sobol' index sum is 1.92.

Fig. 11: Optimization of the mathematical example based on Eq. (18): (a) history of accepted random walks and (b) history of the Sobol' indices sum
As discussed in Sec. 3.1, since the robustness objective is maximized, the optimal solution of Eq. (18) should lead to consistent system response predictions regardless of the true values of the distribution parameters. The results of this verification are shown in Fig. 12.

Fig. 12: Family of system response prediction PDFs at the optimal solution of Eq. (18), for three assumed sets of true parameter values (a)–(c)
As a comparison, Fig. 13 shows the same results as Fig. 12 but at a suboptimal solution. This suboptimal solution spends the same cost as the optimal solution, but the enlarged variation across the PDFs in Fig. 13 indicates that the suboptimal solution cannot guarantee predictions as consistent as those of the optimal solution. To quantify this conclusion, Table 2 compares the variance of the prediction variance at the optimal and suboptimal solutions. The table clearly shows that the optimal solution always has smaller values of this variance at the different assumed true parameter values, which confirms that the optimal solution gives more consistent predictions.
Optimization formulation 2.
The simulated annealing algorithm is used to solve Eq. (19), and Fig. 14 records the optimization process. Figure 14(a) shows the path from the initial design point to the optimal solution. Figure 14(b) shows that only some of the random walks are accepted; the minimized cost is 30.

Fig. 14: Optimization of the mathematical example based on Eq. (19): (a) history of accepted random walks and (b) history of cost
As discussed in Sec. 3.1, since the robustness constraints are satisfied, the optimal solution of Eq. (19) should lead to consistent system response predictions regardless of the true values of the distribution parameters. The results of this verification are shown in Fig. 15.

Fig. 15: Family of system response prediction PDFs at the optimal solution of Eq. (19), for three assumed sets of true parameter values (a)–(c)
Multilevel Problem.
The second numerical example is a multilevel structural dynamics challenge problem provided by Sandia National Laboratories [30]. In this example, we have four types of tests and a single specimen, as explained in Sec. 3.3. As shown in Fig. 16, this multilevel problem consists of three levels. Tests are available at level 1 and level 2, and it is required to predict the system response in level 3.
Fig. 16: Structural dynamics challenge problem [30]: (a) level 1 (for testing), (b) level 2 (for testing), and (c) level 3 (for system prediction)
Level 1: The three mass-spring-damper components are connected in series (Fig. 16(a)), and a sinusoidal force input is applied to the system. The observable quantity is the maximum acceleration at the top mass, and its measurement error distribution is known. The computational model for this response can be found in structural dynamics textbooks [31]; thus, synthetic data for this quantity can be generated.
Level 2: The mass-spring-damper system is mounted on a beam supported by a hinge at one end and a spring at the other end (Fig. 16(b)), and a sinusoidal force input is applied to the beam. The observable quantity is the maximum acceleration at the top mass, and its measurement error distribution is known. The computational model for this response, based on finite element analysis, is provided by Sandia National Laboratories [30]; thus, synthetic data for this quantity can be generated. Level 1 and level 2 are defined as the lower levels, and test data are assumed to be available only at the lower levels.
Level 3: This has the same configuration as level 2, but the input is a random process loading (indicating a difference in usage condition), as shown in Fig. 16(c). Level 3 is the prediction configuration of interest, and the response to be predicted is the maximum acceleration at the top mass at level 3. No test data are available at level 3. The computational model for this response is also provided by Sandia National Laboratories [30].
All three levels have the same model parameters, i.e., the three spring stiffnesses $k_1$, $k_2$, and $k_3$. This example assumes the case of a single test specimen; thus, $k_1$, $k_2$, and $k_3$ are the parameters to be calibrated. They are assumed to be deterministic but unknown, with independent prior distributions.
Four types of tests are available in this example:
- (1)
Type I test measures the level 1 response, and the resultant data set is used in model calibration;
- (2)
Type II test also measures the level 1 response, but the resultant data set is used in model validation;
- (3)
Type III test measures the level 2 response, and the resultant data set is used in model calibration;
- (4)
Type IV test also measures the level 2 response, but the resultant data set is used in model validation.
The unit costs of these four types of tests are denoted as $c_1$, $c_2$, $c_3$, and $c_4$, respectively, and the numbers of each type of test are denoted as $N_1$, $N_2$, $N_3$, and $N_4$, respectively.
The key step in predicting the level 3 response is to estimate the values of the model parameters $k$. A reasonable route is to quantify the model parameters using the lower-level calibration data of levels 1 and 2, and propagate the results through the computational model at the system level. However, the data from either level can be used to calibrate the same model parameters; thus, three calibration options are possible: (1) calibration using the level 1 data alone; (2) calibration using the level 2 data alone; and (3) calibration using the data of both levels. The challenge in such a multilevel problem is how to select from or combine these alternative calibration results. This paper uses the roll-up method developed in Refs. [5] and [32] to resolve this challenge. The roll-up method applies Bayesian model averaging to the various calibration results, and the weights for the averaging are obtained from model validation at each lower level. Thus, the framework of model calibration/validation for system response prediction considers both model calibration and validation. A brief introduction of this framework is given here:
- (1)
Model calibration by Bayesian inference to obtain the posterior distributions of $k$ using the level 1 data alone, the level 2 data alone, and the data of both levels, respectively.
- (2)
Model validation at the lower levels using the model reliability metric [5,33]. The resultant model validities at level 1 and level 2 are denoted as $P(G_1)$ and $P(G_2)$, respectively.
- (3) Obtain the integrated distribution of $k$ by the roll-up formula [5,32,34] in Eq. (20), which expresses the integrated distribution as $P(G_1)P(G_2)\,f''(k \mid D_1, D_2) + P(G_1)[1 - P(G_2)]\,f''(k \mid D_1) + [1 - P(G_1)]P(G_2)\,f''(k \mid D_2) + [1 - P(G_1)][1 - P(G_2)]\,f'(k)$, where $D_1$ and $D_2$ are the calibration data at levels 1 and 2, $f''(\cdot)$ denotes a posterior distribution, and $f'(k)$ denotes the prior distribution of $k$. In Eq. (20), the integrated distribution is a weighted average of four terms: in the first term, the posterior distribution uses the calibration data of both level 1 and level 2, and its weight is the probability that both models are valid; in the second and third terms, the posterior distribution uses the calibration data at one level alone, and its weight is the probability that the model at that level is valid while the model at the other level is invalid; in the last term, the weight of the prior distribution is the probability that both models are invalid. Recently, a more comprehensive approach incorporating the relevance between the lower levels and level 3 has been developed in Ref. [6]; the method proposed in this paper is also applicable to that new approach. (A sketch of sampling from this roll-up mixture is given after this list.)
- (4)
Propagate the integrated distribution of $k$ through the level 3 computational model to predict the distribution of the system response.
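The roll-up prediction in Eq. (20) can be sampled as a four-component mixture, as sketched below; the Gaussian posteriors/prior and the validity probabilities are placeholders for illustration, not the results of the actual calibration and validation analyses.

```python
import numpy as np

rng = np.random.default_rng(7)


def sample_rollup(n, pg1, pg2, post_12, post_1, post_2, prior):
    """Draw n samples of the model parameter from the roll-up mixture of Eq. (20).
    Each post_*/prior argument is a callable returning that many samples."""
    weights = np.array([pg1 * pg2,                # both lower-level models valid
                        pg1 * (1 - pg2),          # only the level 1 model valid
                        (1 - pg1) * pg2,          # only the level 2 model valid
                        (1 - pg1) * (1 - pg2)])   # neither valid -> prior
    components = [post_12, post_1, post_2, prior]
    idx = rng.choice(4, size=n, p=weights)
    return np.concatenate([components[i](np.sum(idx == i)) for i in range(4)])


# Illustrative 1D placeholders for the posteriors/prior of a stiffness parameter
post_12 = lambda n: rng.normal(5000.0, 50.0, n)
post_1 = lambda n: rng.normal(5100.0, 120.0, n)
post_2 = lambda n: rng.normal(4950.0, 100.0, n)
prior = lambda n: rng.normal(5000.0, 500.0, n)

k_samples = sample_rollup(10000, pg1=0.9, pg2=0.8, post_12=post_12,
                          post_1=post_1, post_2=post_2, prior=prior)
print(k_samples.mean(), k_samples.std())
```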
Since the computational models and the measurement errors are known, synthetic data for the four types of tests can be generated; and since the framework of model calibration/validation is known, the proposed test resource allocation approach can be used to optimize the number of each type of test.
Optimization Formulation 1.
The simulated annealing algorithm is used to solve Eq. (21). Starting from the initial design point, the random walks of 226 out of 500 iterations are accepted. Figure 17 shows the change of the Sobol' index sum over the iterations; the maximized index sum at the optimal solution is 1.88.
As discussed in Sec. 3.1, since the robustness objective is maximized, the optimal solution should result in consistent system response predictions regardless of the true values of the model parameters $k$. Similar to the mathematical example in Sec. 5.1, verification of this multilevel test allocation result is shown in Fig. 18, which indicates consistent system response predictions for three different assumed sets of true model parameter values.
Optimization Formulation 2.
The simulated annealing algorithm is used to solve Eq. (22). Starting from the initial design point, the random walks of 164 out of 500 iterations are accepted. Figure 19 shows the change of cost over the iterations; the minimized cost at the optimal solution is 66.
As discussed in Sec. 3.1, since the robustness constraints are satisfied, the optimal solution should lead to consistent system response predictions regardless of the true values of the model parameters $k$. Similar to the mathematical example in Sec. 5.1, verification of this multilevel test allocation result is shown in Fig. 20, which indicates consistent system response predictions for three different assumed sets of true model parameter values.
Summary
Test resource allocation aims to optimize the number of each type of test before any actual test is conducted. This paper focuses on the proposed robust test resource allocation, which means that the system response prediction is insensitive to the variability in the test outcomes so that consistent system response predictions can be achieved under different test outcomes.
The main challenge for the proposed approach is to quantify the contribution of test outcome uncertainty toward the uncertainty in the system response prediction. Since test resource allocation is needed before any actual test, this test outcome uncertainty is simulated by the uncertainty in the synthetic data. This paper analyzes the uncertainty sources in the synthetic data under different test conditions and concludes that consistent system response predictions will be achieved if the contribution of the epistemic uncertainty regarding the model parameters in the synthetic data can be maximized. This paper uses the global sensitivity analysis method of Sobol' indices to assess this contribution, so the desired consistent system response predictions can be guaranteed regardless of the true values of the parameters in the actual tests ($\theta$ for a single specimen and $P$ for multiple specimens).
Two cases of optimization are considered in this paper: (1) subject to the budget constraint, optimize the number of each type of test to reach the most robust design or (2) subject to the robustness requirement, find the number of each type of test to minimize the budget. In addition, the proposed approach can be applied in multiple situations: (1) only model calibration tests are performed or (2) both model calibration and model validation tests are performed. The method can also be applied to tests involving single or multiple specimens. The proposed method results in a discrete stochastic optimization problem, and a simulated annealing algorithm is used to solve this problem.
This paper assumes that the test inputs lie within a range of values and represents the uncertainty regarding the test inputs through uniform distributions. Note that this paper is only focused on choosing the number of experiments after the available physical tests are identified. To answer the question of how to choose the physical tests themselves, several factors should be considered, in particular the relevance and sensitivity of the experiments to the calibration quantity of interest. The assessment of relevance and sensitivity addressed in Ref. [1] may be useful in identifying useful physical test configurations. This paper only addresses the variability of the test data, focusing on optimizing the number of each type of test so that consistent predictions can be obtained under different test data outcomes.
This paper assumes that the quantity of interest to be predicted is a scalar, so its variance can be used directly as the uncertainty indicator and the variance-based Sobol' index applies. If the quantity of interest is a vector, another indicator instead of variance may be needed, along with a corresponding sensitivity index. Thus, further work is needed to extend the proposed method to vector and field outputs.
Another direction for further work is regarding test design. The context of the proposed method is during the stage of budget planning, and usually at this stage, details of the test design are not known or considered. Thus, this paper only focuses on optimizing the number of each type of the test. The extension of the proposed approach to include test design, i.e., deciding the specific test conditions, can be studied in future work such that the resultant system response prediction uncertainty can be further reduced. This can be addressed in two ways: (1) by simultaneously optimizing the number of tests and the test inputs or (2) by adaptively deciding the number of tests and their input conditions based on the observation data as the test campaign progresses.
Acknowledgment
The authors appreciate valuable discussions with Joshua Mullins from Sandia National Laboratories.
Funding Data
Sandia National Laboratories (Contract No. BG-7732).