## Abstract

The objective of this work is to reduce the cost of performing model-based sensitivity analysis for ultrasonic nondestructive testing systems by replacing the accurate physics-based model with machine learning (ML) algorithms, which allows Sobol’ indices to be computed quickly. The ML algorithms considered in this work are neural networks (NNs), convolutional NNs (CNNs), and deep Gaussian processes (DGPs). The performance of these algorithms is measured by the root-mean-squared error on a fixed number of testing points and by the number of high-fidelity samples required to reach a target accuracy. The algorithms are compared on three ultrasonic testing benchmark cases with three uncertainty parameters each, namely, a spherically void defect under a focused and a planar transducer and a spherical-inclusion defect under a focused transducer. The results show that NNs required 35, 100, and 35 samples for the three cases, respectively. CNNs required 35, 100, and 56, respectively, while DGPs required 84, 84, and 56, respectively.

## 1 Introduction

Nondestructive testing (NDT) [1,2] is the process of evaluating, testing, or inspecting assemblies or components for discontinuities or damage without affecting the serviceability of the part. NDT can be performed either during manufacturing or while the component is in service. This is essential to ensure product integrity and reliability, to lower production cost, and to maintain a uniform level of product quality. NDT has been successfully used in various applications such as aircraft damage estimation [3,4], weld defect inspection [5,6], and flaw characterization [7,8].

Nondestructive testing measurements have traditionally relied on experimental methods. These methods, however, can be both time-consuming and costly to perform. To reduce this time and cost, various physics-based NDT models [9–11] have been developed. These include numerical methods such as finite element methods [12,13] and boundary element methods [14,15], as well as approximation methods such as Gaussian beam superposition methods [16,17] and ray tracing methods [18,19]. The aforementioned physics-based models can be cheaper, both financially and computationally, than experimental methods, and can also have high accuracy. Replacing experimental methods with physics-based models can therefore reduce the need for empirical data.

Sensitivity analysis [20,21] is an approach to quantify the effect of each individual input parameter or combinations of input parameters on the system response. It can be classified as either a local [22,23] or a global sensitivity analysis [24,25]. In local sensitivity analysis, small perturbations in the input parameter space are used to quantify their effects on the system response. For global sensitivity analysis, the variance of the system response due to the input parameter variability is quantified.

In this study, global variance-based sensitivity analysis with Sobol’ indices [26,27] is used to quantify the effects of input parameter variability on the output of physics-based ultrasonic testing (UT) simulations. The main elements of the model-based sensitivity analysis are: (1) identifying the important input variability parameters and their corresponding probability distributions, (2) propagating these input parameters through the physics-based model, called uncertainty propagation, and (3) performing sensitivity analysis using Sobol’ indices.

The key challenges of model-based sensitivity analysis include (1) each physics-based model can be computationally expensive to solve, (2) a large number of variability parameters can exist for the NDT systems, and (3) multiple and repetitive physics-based model evaluations are required for sensitivity analysis. This results in problems that are challenging to solve in a reasonable amount of time.

To alleviate this computational cost, various metamodeling methods [28,29] have been developed. Metamodeling methods replace the time-consuming and accurate physics-based models with a computationally efficient metamodel (also called surrogate model). Metamodeling methods can be broadly classified as either data-fit methods [28,30] or multifidelity methods [31,32]. Data-fit methods are constructed by fitting a response surface through evaluated model responses at sampled high-fidelity data points. Examples of data-fit methods include polynomial chaos expansions (PCE) [33], Kriging [34], and support vector machines [35]. In multifidelity methods, on the other hand, low-fidelity data are used to enhance the prediction capabilities of a data-fit model constructed from a limited number of high-fidelity data points. Examples of such methods include Cokriging [36] and manifold mapping [37].

Metamodeling methods have been utilized for various NDT applications. Bilicz et al. [38,39] used Kriging [34] for both forward and inverse problems using eddy current NDT. Support vector regression [35] was used by Miorelli et al. [40] to perform both sensitivity analysis and probability of detection for eddy current NDT systems. Du et al. [41] performed sensitivity analysis and probability of detection for UT NDT systems using PCE [42] and Kriging [34]. Du and Leifsson [43] developed the polynomial chaos-based Cokriging multifidelity method [43] and used it for model-based probability of detection in UT NDT systems.

This paper introduces and applies three different machine learning algorithms for sensitivity analysis of UT NDT systems. In particular, neural networks (NNs) [44,45], convolutional neural networks (CNNs) [46,47], and deep Gaussian processes (DGPs) [48,49] are introduced to the global variance-based sensitivity analysis of UT NDT systems. Note that for these applications, the machine learning algorithms used can be classified as data-fit metamodeling methods. The UT NDT cases considered in this study are three benchmark cases developed by the World Federation of Nondestructive Evaluation Centers (WFNDEC). The benchmark cases are (1) spherically void defect under a focused transducer, (2) spherically void defect under a planar transducer, and (3) spherical-inclusion defect under a focused transducer. The sensitivity analysis results from the machine learning algorithms are compared to those obtained from directly evaluating the high-fidelity physics-based model. In this work, the analytical UT model by Thompson and Gray [50,51] is used as the high-fidelity physics-based model.

The remainder of this paper is organized as follows. The following section describes the methods used in this paper to train the machine learning algorithms and perform sensitivity analysis. The next section presents the result of the application of these algorithms to the benchmark cases. This paper ends with the conclusion of this study and provides suggestions for future work.

## 2 Methods

The methods used in this work are described in this section. The workflow of the model-based sensitivity analysis is described first, followed by the sampling plan, a detailed description of the machine learning algorithms, their validation, and finally, the sensitivity analysis with Sobol’ indices.

### 2.1 Workflow.

A flowchart of the model-based sensitivity analysis is shown in Fig. 1. The process begins by sampling the input parameter space. Two separate sampling plans are created: one for training and another for testing. The high-fidelity physics-based model is then evaluated at the points of those sampling plans to gather the output responses. The training data are then used to train the machine learning algorithms, and the accuracy of these algorithms is tested using the testing data. If this accuracy, measured in terms of the root-mean-squared error (RMSE), does not meet the threshold of $1\%$ of the standard deviation of the testing points ($\sigma_t$), a new training set with a higher number of sample points is created and the previous steps are repeated. Once the machine learning-based models are accurate enough, sensitivity analysis with Sobol’ indices [26] is performed.
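The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `high_fidelity_model` is a hypothetical one-dimensional stand-in for the physics-based model, the surrogate is a simple piecewise-linear interpolant rather than a NN, CNN, or DGP, the training plan is evenly spaced instead of LHS, and the training set size is doubled at each pass.

```python
import math
import random
import statistics


def high_fidelity_model(x):
    """Hypothetical stand-in for the expensive physics-based model."""
    return math.sin(x)


def fit_surrogate(xs, ys):
    """Return a piecewise-linear interpolant through the training data."""
    pts = sorted(zip(xs, ys))

    def predict(x):
        # Clamp to the training range, then interpolate in the matching interval.
        x = min(max(x, pts[0][0]), pts[-1][0])
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        return pts[-1][1]

    return predict


def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def build_accurate_surrogate(lo=0.0, hi=math.pi, n_start=5, n_test=1000, seed=0):
    rng = random.Random(seed)
    # Fixed testing plan drawn by plain Monte Carlo sampling.
    x_test = [rng.uniform(lo, hi) for _ in range(n_test)]
    y_test = [high_fidelity_model(x) for x in x_test]
    threshold = 0.01 * statistics.pstdev(y_test)  # 1% of sigma_t

    n_train = n_start
    while True:
        # Evenly spaced training plan (an assumption; the paper uses LHS).
        x_train = [lo + (hi - lo) * i / (n_train - 1) for i in range(n_train)]
        y_train = [high_fidelity_model(x) for x in x_train]
        surrogate = fit_surrogate(x_train, y_train)
        error = rmse([surrogate(x) for x in x_test], y_test)
        if error <= threshold:
            return surrogate, n_train, error, threshold
        n_train *= 2  # enlarge the training set and repeat


surrogate, n_train, error, threshold = build_accurate_surrogate()
print(n_train, error, threshold)
```

The testing plan is generated once and reused at every pass, so the RMSE values obtained for different training set sizes are directly comparable.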

Fig. 1

### 2.2 Sampling Plan.

The process of selecting discrete samples in the input variability space is known as sampling. It is an iteration-based process in which the input variability parameters are randomly drawn from probability distributions assigned to the parameters. In this work, Latin Hypercube sampling (LHS) [52] is used to generate the training plan, while Monte Carlo sampling (MCS) [53] is used to generate the testing plan. MCS is also used to generate the sampling plan for the Sobol’ indices [26] calculation.

Monte Carlo sampling [53] starts by generating random numbers within the [0,1] interval with replacement. These random numbers are used as probabilities of the associated cumulative distribution functions of the variability parameters. The corresponding parameter values can then be obtained using quantile functions. LHS [52,54], on the other hand, aims at sampling the variability parameters more effectively than MCS. This is done by stratifying the probability distributions into equal intervals in the [0,1] range. One random sample is then selected from each interval, which prevents the clustering of the generated numbers. The remaining steps are the same as for MCS.
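As an illustration of the two sampling schemes, the following standard-library sketch draws probabilities by MCS and by LHS and maps them to parameter values through quantile functions. The distributions are those listed for Case 1 in Table 1, reading the normal distribution of the probe angle as having a standard deviation of 0.5 deg (an assumption); all function names are hypothetical.

```python
import random
from statistics import NormalDist


def monte_carlo_uniform(n, rng):
    """Draw n independent probabilities in [0, 1)."""
    return [rng.random() for _ in range(n)]


def latin_hypercube_uniform(n, rng):
    """Draw one probability from each of n equal-width strata, then shuffle."""
    probs = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(probs)
    return probs


def to_parameters(p_theta, p_x, p_f):
    """Map probabilities to parameter values through quantile functions."""
    # Guard against a probability of exactly 0, where the normal quantile is undefined.
    p_theta = min(max(p_theta, 1e-12), 1.0 - 1e-12)
    theta = NormalDist(mu=0.0, sigma=0.5).inv_cdf(p_theta)  # probe angle (deg)
    x = 0.0 + (1.0 - 0.0) * p_x                              # x ~ U(0, 1) mm
    f_number = 13.0 + (15.0 - 13.0) * p_f                    # F ~ U(13, 15)
    return theta, x, f_number


def sample_plan(n, sampler, seed=0):
    """Build a sampling plan of n points for the three variability parameters."""
    rng = random.Random(seed)
    columns = [sampler(n, rng) for _ in range(3)]
    return [to_parameters(*row) for row in zip(*columns)]


training_plan = sample_plan(35, latin_hypercube_uniform)
testing_plan = sample_plan(1000, monte_carlo_uniform, seed=1)
```

Shuffling each LHS column independently pairs the strata of different parameters at random, so every stratum of every parameter is visited exactly once.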

### 2.3 Neural Networks.

NN [46] methods can be classified as a subclass of data-fit metamodeling methods. Any complex function can be approximated through a hierarchy of features. NNs [45] have multiple steps in the hierarchy, which starts with an input and ends with an output. Each step is known as a layer. The layers in between the inputs and the outputs are called hidden layers [44]. Figure 2 shows a schematic of a NN architecture with two hidden layers. The input and output layer sizes depend on the dimensionality of the input and output for a given problem. For the three UT benchmark cases, the input has three features, while the output has a size of one. Each hidden layer in a NN consists of neurons, which are the fundamental units of computation in a NN [46]. Each neuron calculates a weighted sum of the outputs from the previous hidden or input layer and outputs a nonlinear transformation of it. This nonlinear transformation is termed activation. In Fig. 2, each hidden layer has eight neurons. By changing the number of hidden layers as well as the number of neurons in each hidden layer, NNs can approximate functions of arbitrary complexity [46].

Fig. 2
The output of each neuron in a given hidden layer, $L$, is given by [46]
$z_j^{(L)} = a\left(\sum_{i=1}^{N^{(L-1)}} \omega_{ji}^{(L)} z_i^{(L-1)} + b^{(L-1)}\right)$
(1)
where $a$ is the activation function, $\omega_{ji}^{(L)}$ is the weight between the $i$th neuron in the $L-1$ hidden layer and the $j$th neuron in the $L$ hidden layer, and $b^{(L-1)}$ is the bias unit in the $L-1$ hidden layer. The weights and biases together are termed the parameters of the NN and are tuned using a gradient-based optimizer [46]. Here, the maximum value of $i$ is $N^{(L-1)}$, that is, the number of neurons in the $L-1$ hidden layer. $z_j^{(L)}$ is the output of the $j$th neuron in the $L$ hidden layer, and $z_i^{(L-1)}$ is the output of the $i$th neuron in the $L-1$ hidden layer.
A loss function, $\mathcal{L}$, is defined to capture the mismatch between the training data observations, given by $y$, and the predicted value of the NN, given by $\hat{y}$ [46]. The loss function chosen in this study is the mean-squared error between $y$ and $\hat{y}$, averaged over all the training data. The loss function is written as
$\mathcal{L} = \frac{\sum_{l=1}^{N_{tr}} \sum_{m=1}^{N_o} \left(\hat{y}_m^{(l)} - y_m^{(l)}\right)^2}{N_{tr} N_o}$
(2)
where $N_o$ is the size of the output layer, $N_{tr}$ is the number of training data sets, and $m$ is the index of the neuron in the output layer. In practice, the loss function is generally averaged over a subset of the training data sets, known as a mini-batch [46].
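Equations (1) and (2) can be illustrated with a small forward pass written in plain Python. This is a sketch under assumptions rather than the Keras model used in the study: the weights are random and untrained, the output layer is taken to be linear, the input and target values are invented for illustration, and all names are hypothetical.

```python
import math
import random


def dense_layer(inputs, weights, bias, activation):
    """Eq. (1): activation of the weighted sum of the previous layer plus a bias."""
    return [
        activation(sum(w * z for w, z in zip(row, inputs)) + bias)
        for row in weights
    ]


def forward(x, params):
    """One tanh hidden layer followed by a linear output layer (assumed)."""
    hidden = dense_layer(x, params["w_hidden"], params["b_hidden"], math.tanh)
    return dense_layer(hidden, params["w_out"], params["b_out"], lambda s: s)


def mse_loss(predictions, targets):
    """Eq. (2): squared error averaged over samples and output neurons."""
    n_tr, n_o = len(targets), len(targets[0])
    total = sum(
        (p - t) ** 2
        for pred, targ in zip(predictions, targets)
        for p, t in zip(pred, targ)
    )
    return total / (n_tr * n_o)


rng = random.Random(0)
n_inputs, n_hidden, n_outputs = 3, 50, 1
params = {
    "w_hidden": [[rng.gauss(0, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)],
    "b_hidden": 0.0,  # one bias per layer, as written in Eq. (1)
    "w_out": [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_outputs)],
    "b_out": 0.0,
}

inputs = [[0.1, 0.5, 14.0], [-0.3, 0.2, 13.5]]  # illustrative (theta, x, F) values
targets = [[0.8], [0.6]]                         # illustrative responses
predictions = [forward(x, params) for x in inputs]
loss = mse_loss(predictions, targets)
```

Training would then adjust `params` to reduce `loss`; the gradient computation by backpropagation is omitted here.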

Training the NN involves solving an optimization problem in which the loss function is the objective function and is minimized to improve the prediction capabilities of the NN. The gradient of this objective function with respect to the parameters of the NN is calculated efficiently using the backpropagation algorithm [55]. The optimizer used is the Adaptive Moments (ADAM) algorithm [56], which has the following steps:

1. Update the biased first moment estimate:
$V_k \leftarrow \beta_1 V_{k-1} + (1-\beta_1) G_k$
(3)
where $G_k$ is the gradient of the loss function with respect to the parameters at a given iteration $k$, $V_k$ is the exponential moving average of the gradients (also called the biased first moment estimate), and $\beta_1$ is the exponential decay rate for $V_k$. The recommended value for $\beta_1$ is 0.9 [56] and is used in this study.
2. Update the biased second moment estimate:
$S_k \leftarrow \beta_2 S_{k-1} + (1-\beta_2) G_k^2$
(4)
where $S_k$ is the exponential moving average of the squared gradients (also called the biased second moment estimate) and $\beta_2$ is the exponential decay rate for $S_k$. The recommended value for $\beta_2$ is 0.999 [56] and is used in this study.
3. Correct the bias in the first moment:
$\hat{V}_k \leftarrow \frac{V_k}{1-\beta_1^k}$
(5)
where $\hat{V}_k$ corrects the bias introduced by setting $V_0$ to zero.
4. Correct the bias in the second moment:
$\hat{S}_k \leftarrow \frac{S_k}{1-\beta_2^k}$
(6)
where $\hat{S}_k$ corrects the bias introduced by setting $S_0$ to zero.
5. Update the parameters:
$\theta_{k+1} \leftarrow \theta_k - \alpha_k \frac{\hat{V}_k}{\sqrt{\hat{S}_k} + \eta}$
(7)
where $\alpha_k$ is the learning rate, $\theta_k$ are the NN parameter values, and $\eta$ is a small value that is specified in order to prevent the denominator from being zero.
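The five steps can be written compactly in Python. The sketch below applies them to a simple two-parameter quadratic instead of a NN loss, so the gradient is available in closed form; the test function, the iteration count, and the value of $\eta$ are assumptions chosen for illustration. The parameter update subtracts the scaled step, which is the usual convention for minimization [56].

```python
import math


def adam_minimize(grad, theta0, alpha=0.01, beta1=0.9, beta2=0.999,
                  eta=1e-8, n_iter=5000):
    """Minimize a function with the five ADAM steps, given its gradient."""
    theta = list(theta0)
    v = [0.0] * len(theta)  # biased first moment estimate, V_0 = 0
    s = [0.0] * len(theta)  # biased second moment estimate, S_0 = 0
    for k in range(1, n_iter + 1):
        g = grad(theta)
        for i, g_i in enumerate(g):
            v[i] = beta1 * v[i] + (1 - beta1) * g_i               # step 1
            s[i] = beta2 * s[i] + (1 - beta2) * g_i ** 2          # step 2
            v_hat = v[i] / (1 - beta1 ** k)                       # step 3
            s_hat = s[i] / (1 - beta2 ** k)                       # step 4
            theta[i] -= alpha * v_hat / (math.sqrt(s_hat) + eta)  # step 5
    return theta


def quadratic_grad(theta):
    """Gradient of (theta_0 - 3)^2 + (theta_1 + 1)^2, minimized at (3, -1)."""
    return [2 * (theta[0] - 3.0), 2 * (theta[1] + 1.0)]


theta_min = adam_minimize(quadratic_grad, [0.0, 0.0])
```

Because the step is normalized by the square root of the second moment, the first iterations move each parameter by roughly the learning rate regardless of the gradient magnitude.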

The hyperparameters of a NN include the number of hidden layers, the number of neurons in each hidden layer, the mini-batch size, the maximum number of epochs, the learning rate, and the activation function. Rigorous rules for selecting the hyperparameter settings for a NN do not exist. In this work, various hyperparameter settings were tested iteratively, and the settings that resulted in the lowest RMSE, as described in Sec. 2.6, were chosen. The NN architecture used in this work includes one hidden layer with 50 neurons. The mini-batch size selected is 10. The maximum number of epochs is set to 10,000. One epoch refers to one iteration over an entire training data set [44]. The learning rate selected is 0.01. The activation function chosen for this study is the hyperbolic tangent function [46]. To construct the NN, the Keras wrapper with Tensorflow [57] is used in this study.

### 2.4 Convolutional Neural Networks.

CNNs [46] are a special type of NN used to process data with a grid-like topology [46], such as images. In images, each grid location contains a pixel value, and the pixels combined together form the final image. For this study, the variability parameters are used in place of the pixels. CNNs are named after the mathematical operation they employ, called convolution [46]. Any NN with at least one convolutional layer is termed a CNN. For image recognition tasks, the number of parameters in a NN can grow rapidly, as it depends on the number of input features. For images, the size of the image defines the number of input features. In CNNs, the number of parameters is independent of the number of input features and is dependent on the size and number of filters of a convolutional kernel [46]. This reduces the number of parameters to be tuned and prevents overfitting in the presence of limited data.

Figure 3 shows a schematic of a CNN architecture that contains one convolutional layer, followed by one max-pooling layer, two fully connected layers, and an output layer. The output of a convolutional layer or max-pooling layer is termed a feature map. Feature maps are similar to images and also have a grid-like topology with pixel values. The number of channels, also known as the depth of a feature map, depends on the hyperparameter called the number of filters. In Fig. 3, this value is four. To convert a feature map to a fully connected layer, the flatten operation is performed. Flatten converts the three-dimensional feature map into a one-dimensional fully connected layer. Similar to a NN, the number of these layers can vary.

Fig. 3

The input image in Fig. 3 has one channel (gray image) and has 32 pixels (grids) in each of the horizontal and vertical directions. Note that colored images have three channels, namely, red, green, and blue. The kernel or filter, which contains the parameters to be tuned, has five grids in each of the horizontal and vertical directions, that is, a 5 × 5 kernel. In a convolutional operation, an element-wise product between the kernel parameter values and the image pixel values is performed and the products are summed. This process continues by moving the kernel over the image, from left to right and top to bottom. The hyperparameter stride defines how many pixels the kernel moves in the horizontal and vertical directions after performing one convolutional operation.

The output of a convolutional operation is given by
$c_{i,j}^{(L_c)} = a\left(\sum_{m=1}^{N_{c,v}} \sum_{n=1}^{N_{c,h}} f_{m,n}\, c_{i+m-1,j+n-1}^{(L_c-1)} + b^{(L_c-1)}\right)$
(8)
where $a$ is the activation function, $N_{c,h}$ and $N_{c,v}$ refer to the number of grids in the convolutional kernel in the horizontal and vertical directions, respectively, $f_{m,n}$ is the weight of the kernel at the $m$th row and $n$th column of the grid in the kernel, $c^{(L_c)}$ is the output of the convolution in the $L_c$ convolutional layer, and $c^{(L_c-1)}$ is the input into the $L_c$ convolutional layer. This input could come from either a max-pooling layer or a convolutional layer. The output of a max-pooling layer is given by
$p_{i,j}^{(L_p)} = \max_{1 \le m \le N_{p,v},\, 1 \le n \le N_{p,h}} \left(p_{i+m-1,j+n-1}^{(L_p-1)}\right)$
(9)
where $N_{p,h}$ and $N_{p,v}$ refer to the number of grids in the max-pooling kernel in the horizontal and vertical directions, respectively, $p^{(L_p)}$ is the output of the $L_p$ max-pooling layer, and $p^{(L_p-1)}$ is the input into the $L_p$ max-pooling layer. The max-pooling kernel selects the maximum pixel value and, unlike the convolutional kernel, has no parameters. Max-pooling is performed to reduce the number of parameters in a CNN [46].
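A minimal pure-Python sketch of Eqs. (8) and (9) is given below for a single channel. It assumes a stride of one and no padding, which is how the index offsets in the two equations read; the example arrays and function names are hypothetical, and an identity activation is used in the usage lines so that the numbers are easy to check by hand.

```python
import math


def convolve2d(image, kernel, bias=0.0, activation=math.tanh):
    """Eq. (8) with stride one and no padding."""
    n_v, n_h = len(kernel), len(kernel[0])
    rows = len(image) - n_v + 1
    cols = len(image[0]) - n_h + 1
    return [
        [
            activation(
                sum(
                    kernel[m][n] * image[i + m][j + n]
                    for m in range(n_v)
                    for n in range(n_h)
                )
                + bias
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def max_pool2d(feature_map, n_v, n_h):
    """Eq. (9): maximum over each n_v x n_h window, moved one cell at a time."""
    rows = len(feature_map) - n_v + 1
    cols = len(feature_map[0]) - n_h + 1
    return [
        [
            max(
                feature_map[i + m][j + n]
                for m in range(n_v)
                for n in range(n_h)
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]


image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
kernel = [[1.0, 0.0],
          [0.0, -1.0]]

feature_map = convolve2d(image, kernel, activation=lambda s: s)
pooled = max_pool2d(image, 2, 2)

# A 3 x 1 input grid with a 1 x 1 kernel: each parameter is scaled independently.
parameters = [[0.2], [0.5], [14.0]]
scaled = convolve2d(parameters, [[0.5]], activation=lambda s: s)
```

The last two lines mirror the setting used in this work, where three variability parameters form a 3 × 1 grid and the kernel has size 1 × 1, so the convolution reduces to a shared scaling of each input.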

Training the CNN is similar to training the NN. The loss function described in Eq. (2) is used as the objective function for training the CNN. The gradients are calculated using the backpropagation algorithm [55] and the ADAM [56] optimizer is used to minimize the loss function.

The hyperparameters of the CNN are the number of convolutional and max-pooling layers and the convolutional and max-pooling kernel sizes. The number of filters of each convolutional kernel and the stride of each kernel are considered as well. Other hyperparameters include the number of fully connected layers, the number of neurons in each layer, the mini-batch size, the maximum number of epochs, the learning rate, and the activation function. The CNN used in this work includes one convolutional layer, one fully connected layer, and no max-pooling layer. Only one convolutional kernel of size 1 × 1 is selected, with a stride of one. The mini-batch size is 10, while the maximum number of epochs is set to 10,000. The number of neurons in the fully connected layer is 100 and the learning rate of the ADAM [56] optimizer is 0.01. The hyperbolic tangent function [46] is selected as the activation function for this study. The process of selecting these hyperparameters is similar to that used for the NN. To construct the CNN, the Keras wrapper with Tensorflow [57] is used in this study.

Note that, while images are not used in this work, each variability parameter is assigned to one grid location and the convolutional operation is performed on the resulting grid. The input grid therefore has a size of 3 × 1. While this grid is smaller than most images, CNNs can be easily extended to NDT cases with a higher number of variability parameters.

### 2.5 Deep Gaussian Processes.

DGPs [48] are multi-layer generalizations of Gaussian processes (GPs), where the inputs to one GP are the outputs of another. The architecture is similar to a NN (Fig. 2); however, the activation functions in the neurons are replaced by GP mappings. DGPs have been shown to overcome the limitations of single-layer GPs while retaining their advantages [49]. DGPs have also been shown to work well with a limited amount of data [48], which is useful for NDT sensitivity analysis.

Consider a GP mapping given by
$z_i = f_i(X) + \epsilon_i$
(10)
where $X \in \mathbb{R}^m$ is the vector of m-dimensional input variability parameters, $f$ is a zero-mean GP mapping, $f \sim \mathcal{GP}(0, k(X, X))$, $\epsilon$ is the independently and identically distributed Gaussian noise ($\mathcal{N}(0, \sigma_\epsilon^2)$), and $z_i$ is the output of the $i$th neuron in the hidden layer. The covariance function, $k$, uses a Gaussian correlation [58] function (also known as the automatic relevance determination correlation function [48]) and is given by
$k(X, X') = \sigma^2 \exp\left[-\sum_{k=1}^{l} \left(\frac{X_k - X_k'}{h_k}\right)^2\right]$
(11)
where $\sigma$ and $h_k$ are hyperparameters that need to be tuned, and $l$ is the number of inputs to the neuron in the hidden layer. Another covariance function, the Matern-5/2 [59] correlation function, given by
$k(X, X') = \sigma^2 \left[1 + \sqrt{5} \sqrt{\sum_{k=1}^{l} \left(\frac{X_k - X_k'}{h_k}\right)^2} + \frac{5}{3} \sum_{k=1}^{l} \left(\frac{X_k - X_k'}{h_k}\right)^2\right] \times \exp\left[-\sqrt{5} \sqrt{\sum_{k=1}^{l} \left(\frac{X_k - X_k'}{h_k}\right)^2}\right]$
(12)
is used in this work.
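For reference, the two covariance functions can be evaluated with a few lines of Python. The sketch assumes the standard Matern-5/2 form, in which the square root of the scaled squared distance enters the linear and exponential terms; the function names and the example hyperparameter values are hypothetical.

```python
import math


def scaled_squared_distance(x, x_prime, h):
    """Sum over inputs of ((X_k - X_k') / h_k)^2."""
    return sum(((a - b) / h_k) ** 2 for a, b, h_k in zip(x, x_prime, h))


def gaussian_kernel(x, x_prime, sigma, h):
    """Gaussian correlation of Eq. (11)."""
    return sigma ** 2 * math.exp(-scaled_squared_distance(x, x_prime, h))


def matern52_kernel(x, x_prime, sigma, h):
    """Matern-5/2 correlation of Eq. (12), assuming its standard form."""
    r2 = scaled_squared_distance(x, x_prime, h)
    r = math.sqrt(r2)
    return (sigma ** 2
            * (1 + math.sqrt(5) * r + 5 * r2 / 3)
            * math.exp(-math.sqrt(5) * r))


x_a = [0.0, 0.5, 14.0]
x_b = [0.2, 0.4, 13.5]
length_scales = [0.5, 1.0, 2.0]  # illustrative h_k values
k_gauss = gaussian_kernel(x_a, x_b, sigma=1.0, h=length_scales)
k_matern = matern52_kernel(x_a, x_b, sigma=1.0, h=length_scales)
```

Both kernels equal $\sigma^2$ when the two inputs coincide and decay as the scaled distance grows; a separate $h_k$ for each input lets the fitted model weight the inputs differently.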

Damianou and Lawrence [48] developed a Bayesian training framework to train all the parameters of the DGP. This framework is used to construct the DGP in the present work. The details of this framework are beyond the scope of this paper and can be found in Ref. [48].

The hyperparameters of the DGP include the number of hidden layers, the number of GP mappings in each hidden layer, the correlation function used in the GP mapping, and the total number of training iterations. Similar to the NNs and CNNs, the hyperparameters that resulted in the lowest RMSE were chosen. For this study, the DGP had one hidden layer, with a Matern-5/2 correlation [59] function for the hidden and output layers. The total number of iterations was set to 1500.

### 2.6 Validation.

The global accuracy of the machine learning algorithms used in this work is measured using the RMSE given by
$\mathrm{RMSE} = \sqrt{\sum_{i=1}^{N_t} \left(\hat{y}_t^{(i)} - y_t^{(i)}\right)^2 / N_t}$
(13)
and the normalized RMSE (NRMSE), given by
$\mathrm{NRMSE} = \mathrm{RMSE} / (\max(y_t) - \min(y_t))$
(14)
where $N_t$ is the total number of testing data, and $\hat{y}_t^{(i)}$ and $y_t^{(i)}$ are the machine learning estimation and high-fidelity observation of the $i$th testing point, respectively. The maximum and minimum values of the high-fidelity physics-based model observations of the testing points are $\max(y_t)$ and $\min(y_t)$, respectively. An RMSE less than or equal to $1\%\sigma_t$ is considered an acceptable global accuracy criterion in this work.
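The two error measures and the acceptance criterion translate directly into code. In the sketch below, $\sigma_t$ is computed as the population standard deviation of the testing observations, which is an assumption since the text does not state which estimator is used; the function names and example values are hypothetical.

```python
import math
import statistics


def rmse(predicted, observed):
    """Eq. (13): root-mean-squared error over the testing points."""
    n_t = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n_t)


def nrmse(predicted, observed):
    """Eq. (14): RMSE normalized by the range of the observations."""
    return rmse(predicted, observed) / (max(observed) - min(observed))


def meets_accuracy_target(predicted, observed, fraction=0.01):
    """True when the RMSE is at most the given fraction of sigma_t."""
    sigma_t = statistics.pstdev(observed)
    return rmse(predicted, observed) <= fraction * sigma_t


y_observed = [1.0, 2.0, 5.0]
y_predicted = [1.0, 2.0, 3.0]
print(rmse(y_predicted, y_observed), nrmse(y_predicted, y_observed))
print(meets_accuracy_target(y_predicted, y_observed))
```

The RMSE carries the units of the model response, while the NRMSE is dimensionless, which makes it the more convenient measure when results are compared across different response magnitudes.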

### 2.7 Model-Based Sensitivity Analysis.

Sensitivity analysis with Sobol’ indices [27] is used to quantify the effect of each input variability parameter, as well as of combinations of input parameters, on the output model response. A black-box model
$M(X) = f(X)$
(15)
where $X$ is the random input vector of $m$ variables, can be decomposed as [26]
$M(X) = f_0 + \sum_{i=1}^{m} f_i(X_i) + \sum_{i<j}^{m} f_{i,j}(X_i, X_j) + \cdots + f_{1,2,\ldots,m}(X_1, X_2, \ldots, X_m)$
(16)
where $f_0$ is a constant, $f_i$ is a function of $X_i$, and so on. One condition of this functional decomposition is that all the terms are orthogonal. The terms can then be expressed through conditional expected values as
$f_0 = E(M(X))$
(17)
$f_i(X_i) = E(M(X)|X_i) - f_0$
(18)
and
$f_{i,j}(X_i, X_j) = E(M(X)|X_i, X_j) - f_0 - f_i(X_i) - f_j(X_j)$
(19)
and so on. Here, $f_i$ refers to the effect of varying the individual input parameter $X_i$ alone. This is termed the main effect of $X_i$. $f_{i,j}$ is the effect of varying $X_i$ and $X_j$ simultaneously and is called the second-order interaction. Higher-order interactions have analogous definitions. The variance of Eq. (16) is then
$\mathrm{Var}(M(X)) = \sum_{i=1}^{m} V_i + \sum_{i<j}^{m} V_{i,j} + \cdots + V_{1,2,\ldots,m}$
(20)
where
$V_i = \mathrm{Var}_{X_i}\left(E_{X_{\sim i}}(M(X)|X_i)\right)$
(21)
$V_{i,j} = \mathrm{Var}_{X_{i,j}}\left(E_{X_{\sim i,j}}(M(X)|X_i, X_j)\right) - V_i - V_j$
(22)
and so on. The $X_{\sim i}$ notation denotes the set of all variables except $X_i$. $V_i$ refers to the variance of the output due to $X_i$ alone, while $V_{i,j}$ is the variance of the output due to second-order interactions.
The main effect indices, given by the first-order Sobol’ indices [26], are
$S_i = \frac{V_i}{\mathrm{Var}(M(X))}$
(23)
where $S_i$ measures the contribution of each individual $X_i$ to the output variance. The total-order indices, given by the total-effect Sobol’ indices [26], are
$S_{T_i} = \frac{E_{X_{\sim i}}\left(\mathrm{Var}_{X_i}(M(X)|X_{\sim i})\right)}{\mathrm{Var}(M(X))} = 1 - \frac{\mathrm{Var}_{X_{\sim i}}\left(E_{X_i}(M(X)|X_{\sim i})\right)}{\mathrm{Var}(M(X))}$
(24)
where $S_{T_i}$ measures the output variance due to $X_i$ alone as well as due to the interaction of $X_i$ with other input parameters.
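In practice, the first-order and total-order indices are estimated from Monte Carlo samples. The text does not state which estimators are used, so the sketch below is an assumption: it uses the common pair of a Saltelli-type estimator for the first-order indices and the Jansen estimator for the total-order indices, applied to a hypothetical additive test function whose indices are known analytically (0.2, 0.8, and 0).

```python
import random
import statistics


def sobol_indices(model, sample_inputs, n_inputs, n_samples=20000, seed=0):
    """Estimate first-order and total-order Sobol' indices by Monte Carlo."""
    rng = random.Random(seed)
    a = [sample_inputs(rng) for _ in range(n_samples)]
    b = [sample_inputs(rng) for _ in range(n_samples)]
    f_a = [model(x) for x in a]
    f_b = [model(x) for x in b]
    # Centering the responses reduces the variance of the first-order estimator.
    mean = statistics.fmean(f_a + f_b)
    f_a = [f - mean for f in f_a]
    f_b = [f - mean for f in f_b]
    variance = statistics.pvariance(f_a + f_b)

    first_order, total_order = [], []
    for i in range(n_inputs):
        # Matrix A with its i-th column replaced by the i-th column of B.
        f_ab = [model(xa[:i] + [xb[i]] + xa[i + 1:]) - mean
                for xa, xb in zip(a, b)]
        v_i = sum(fb * (fab - fa)
                  for fa, fb, fab in zip(f_a, f_b, f_ab)) / n_samples
        vt_i = sum((fa - fab) ** 2
                   for fa, fab in zip(f_a, f_ab)) / (2 * n_samples)
        first_order.append(v_i / variance)
        total_order.append(vt_i / variance)
    return first_order, total_order


def toy_model(x):
    """Additive test function; the third input has no effect."""
    return x[0] + 2.0 * x[1] + 0.0 * x[2]


def sample_unit_cube(rng):
    """Three independent U(0, 1) inputs."""
    return [rng.random() for _ in range(3)]


first_order, total_order = sobol_indices(toy_model, sample_unit_cube, n_inputs=3)
```

Each index requires one additional set of model evaluations, so the total cost is $(m + 2)$ times the number of samples. This is the reason a fast surrogate is substituted for the physics-based model in `model`.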

## 3 Numerical Examples

In this section, the three machine learning algorithms are applied to three UT benchmark cases developed by the WFNDEC. The sensitivity analysis results are compared to those obtained from directly sampling the high-fidelity physics-based models.

### 3.1 Description of the Benchmark Cases.

The three UT benchmark cases considered are the spherically void defect under a focused transducer (Case 1), the spherically void defect under a planar transducer (Case 2), and the spherical-inclusion defect under a focused transducer (Case 3). The setups for these cases are shown in Fig. 4. Each of these cases has three variability parameters as inputs. Cases 1 and 3 have the probe angle $(\theta)$, the x-coordinate (x), and the frequency-related F-number (F) as variability parameters. For Case 2, the F-number is replaced with the y-coordinate (y). Table 1 lists the variability parameters along with their input distributions. The output response is the voltage waveform at the receiver.

Fig. 4

Table 1 The variability parameters used in the numerical examples

| Parameters | Case 1 | Case 2 | Case 3 |
| --- | --- | --- | --- |
| $\theta$ (deg) | $N(0, 0.5^2)$ | $N(0, 0.5^2)$ | $N(0, 0.5^2)$ |
| x (mm) | U(0, 1) | U(0, 1) | U(0, 1) |
| y (mm) | N/A | U(0, 1) | N/A |
| F | U(13, 15) | N/A | U(8, 10) |

The Thompson–Gray analytical model [50] is used as the high-fidelity physics-based model in this study. The center frequency of the transducer is 5 MHz. The density of the fused quartz block is 2,000 kg/m$^3$. The longitudinal wave speed is 5969.4 m/s, while the shear wave speed is 3774.1 m/s. A detailed description of the UT model can be found in Schmerr and Song [51], while the validation of this model with experimental data can be found in Du et al. [41]. Figure 5 compares the time-domain waveform obtained from the high-fidelity physics-based model with experimental data for Case 2. The physics-based model matches the experimental results well.

Fig. 5

### 3.2 Results.

As discussed in Sec. 2, the machine learning algorithms are required to have a global accuracy, measured in terms of RMSE, of at most $1\%\sigma_t$ before sensitivity analysis is performed. For the three UT cases, the convergence of the RMSE with increasing number of high-fidelity training points is evaluated at a single defect radius (a) of 0.5 mm and is shown in Figs. 6–8. For each machine learning algorithm and for each high-fidelity sample size, ten different LHS plans are generated to account for the variation in the input variability space. The RMSE plots in Figs. 6–8 show both the mean and the standard deviation of the RMSE arising from the different input samples generated. In all these cases, the number of testing points is fixed at 1,000 high-fidelity MCS points. Figures 6–8 also show the $10\%\sigma_t$ and $1\%\sigma_t$ values.

Fig. 6

Fig. 7

Fig. 8

In Figs. 6–8, the RMSE decreases with increasing number of high-fidelity sample points. For Case 1, NNs and CNNs perform similarly and require around 35 high-fidelity sample points to reach the target global accuracy. The DGP for this case requires around 84 high-fidelity samples to reach the same threshold. For Case 2, the DGP requires 84 high-fidelity samples and outperforms both NNs and CNNs, which require 100 high-fidelity samples each. Finally, in Case 3, the NN outperforms both the CNN and the DGP. The NN requires 35 high-fidelity samples, while the other two machine learning algorithms require 56 samples each. Using the same number of high-fidelity samples as those required to reach the global accuracy, the machine learning algorithms were then trained for different defect sizes ranging from 0.1 mm to 0.5 mm, with the error measured using the NRMSE.

Figures 9–11 show the variation of the NRMSE with the defect size for the three UT cases. For all these cases, the NRMSE is nearly constant and within the $1\%\sigma_t$ global accuracy threshold, indicating that these machine learning algorithms are robust under varying defect sizes. Note that, similar to the RMSE convergence plots, the NRMSE plots are generated using ten different samples at each defect size and for each machine learning algorithm. Figures 9–11 show both the mean and the standard deviation of the NRMSE.

Fig. 9

Fig. 10

Fig. 11

To perform the sensitivity analysis with Sobol’ indices, ten different sets containing 75,000 MCS points each were generated for each of the machine learning algorithms and for each UT case. This analysis was only performed at a defect size of a = 0.5 mm. The trained machine learning algorithms were used to provide the model response for all the generated MCS points. These model responses were then used to calculate the Sobol’ indices. The same number of MCS points was also generated to directly evaluate the physics-based models in order to perform sensitivity analysis. Figures 12–14 show the first-order Sobol’ indices for the three UT cases, while Figs. 15–17 show the total-order Sobol’ indices for these UT cases. The black lines in these plots show the standard deviations due to the ten different sample sets chosen. For Case 1, the F-number has a negligible effect on the model response, as seen in Figs. 12 and 15. The same is true for Case 3, shown in Figs. 14 and 17. In Case 2, the y-coordinate has a small effect on the model response, as seen in Figs. 13 and 16. For all the cases, the Sobol’ indices from the machine learning algorithms match well with those from the physics-based models.

Fig. 12

Fig. 13

Fig. 14

Fig. 15

Fig. 16

Fig. 17

## 4 Conclusion

Three different machine learning algorithms, namely, neural networks, convolutional neural networks, and deep Gaussian processes, were used to perform model-based sensitivity analysis for ultrasonic testing systems using three benchmark cases developed by the WFNDEC. First, the global accuracy of these algorithms was measured and the number of high-fidelity samples required to reach the desired global accuracy was noted. These globally accurate algorithms were then used to generate model responses in order to perform sensitivity analysis using Sobol’ indices. The sensitivity analysis results also matched well with those obtained by directly using the physics-based model.

This study shows that NN, CNN, and DGP machine learning algorithms can be used to provide fast and accurate sensitivity analysis results. Performing sensitivity analysis can assist in deciding which variability parameters need to be considered while performing physical experiments, which can reduce both the cost and the time of the experiments. Future work will include cases with non-spherical defects as well as cases with a larger number of variability parameters.

## Acknowledgment

The authors are supported in part by NSF Award No. 1846862 and the Iowa State University Center for Nondestructive Evaluation Industry-University Research Program.

## Conflict of Interest

There are no conflicts of interest.

## References

1. Cawley, P., 2001, “Non-Destructive Testing—Current Capabilities and Future Directions,” J. Mater. Des. Appl., 215(4), pp. 213–223.
2. Sharma, A., and Sinha, A. K., 2018, “Ultrasonic Testing for Mechanical Engineering Domain: Present and Future Perspective,” Int. J. Res. Indust. Eng., 7(2), pp. 243–253.
3. Capriotti, M., Kim, E. H., Scalea, L. F., and Kim, H., 2017, “Non-Destructive Inspection of Impact Damage in Composite Aircraft Panels by Ultrasonic Guided Waves and Statistical Processing,” Materials, 10(6), pp. 1–12.
4. Poudel, A., Strycek, J., and Chu, T. P., 2013, “Air-Coupled Ultrasonic Testing of Carbon-Carbon Composite Aircraft Brake Disks,” Mater. Eval., 71(8), pp. 987–994.
5. Choi, S.-N., Hong, S.-Y., and Hwang, W.-G., 2012, “Performance Demonstration for an Automated Ultrasonic Testing System for Piping Welds,” J. Nuclear Sci. Technol., 49(5), pp. 562–570.
6. , K., and Lacki, P., 2017, “Assessment of Aluminum FSW Joints Using Ultrasonic Testing,” Arc. Metall. Mater., 62(4), pp. 2399–2404.
7. Bai, L., Velichko, A., and Drinkwater, B. W., 2018, “Ultrasonic Defect Characterization – Use of Amplitude, Phase and Frequency Information,” J. Acoustic Soc. Amer., 143(1), pp. 349–360.
8. Ahmed, A., , K. S., Noor Ahmed, R., and Gajendra, G., 2015, “Development of Ultrasonic Reference Standards for Defect Characterization in Carbon Fiber Composites,” Int. Res. J. Eng. Technol. (IRJET), 2(7), pp. 840–844.
9. Ginzel, E., 2007, “NDT Modelling An Overview,” Technical report, Materials Research Institute, Waterloo, Ontario, Canada.
10. Darmon, M., Chatillon, S., Mahaut, S., Calmon, P., , L., and Zernov, V., 2011, “Recent Advances in Semi-Analytical Scattering Models for NDT Simulation,” J. Phys., 269(1), pp. 1–12.
11. Kolkoori, S., 2014, “Quantitative Evaluation of Ultrasonic Wave Propagation in Inhomogeneous Anisotropic Austenitic Welds Using 3D Ray Tracing Method: Numerical and Experimental Validation,” Ph.D. thesis, Technical University Berlin, Berlin, Germany.
12. Wagner, D., Cavalieri, F. J., Bathias, C., and Ranc, N., 2012, “Ultrasonic Fatigue Tests at High Temperature on an Austenitic Steel,” Propulsion Power Res., 1(1), pp. 29–35.
13.
Subair
,
S.
,
Balasubramaniam
,
K.
,
Rajagopal
,
P.
,
Kumar
,
A.
,
Rao
,
B. P.
, and
Jayakumar
,
T.
,
2014
, “
Finite Element Simulations to Predict Probability of Detection (PoD) Curves for Ultrasonic Inspection of Nuclear Components
,”
Procedia. Eng.
,
86
, pp.
461
468
.
14.
Zhang
,
C.
, and
Gross
,
D.
,
2002
, “
A 2D Hyper Singular Time-Domain Traction BEM for Transient Elastodynamic Crack Analysis
,”
Wave Motion
,
35
(
1
), pp.
17
40
.
15.
Westlund
,
J.
,
2011
, “
On the Propagation of Ultrasonic Testing Using Boundary Integral Equation Method
,” Ph.D. thesis,
Chalmers University of Technology
,
Gothenburg, Sweden
.
16.
Jeong
,
H.
, and
Schmerr
,
L. W.
,
2007
, “
Ultrasonic Beam Propagation in Highly Anisotropic Materials Simulated by Multi Gaussian Beams
,”
J. Mech. Sci. Technol.
,
21
(
8
), pp.
1184
1190
.
17.
Ye
,
J.
,
Kim
,
H. J.
,
Song
,
S. J.
,
Kang
,
S. S.
,
Kim
,
K.
, and
Song
,
M. H.
,
2011
, “
Model Based Simulation of Focused Beam Fields Produced by a Phased Array Ultrasonic Transducer in Dissimilar Meta Welds
,”
NDT/E Int.
,
44
(
3
), pp.
290
296
.
18.
Nam
,
Y.-H.
,
2001
, “
Modeling of Ultrasonic Testing in Butt Joint by Ray Tracing
,”
J. Mech. Sci. Technol.
,
15
(
4
), pp.
441
447
.
19.
Liu
,
Q.
,
,
G.
, and
Wirdelius
,
H.
,
2014
, “
A Receiver Model for Ultrasonic Ray Tracing in an Inhomogeneous Anistropic Weld
,”
J. Modern Phys.
,
5
(
13
), pp.
1186
1201
.
20.
Ferretti
,
F.
,
Saltelli
,
A.
, and
Tarantola
,
S.
,
2016
, “
Trends in Sensitivity Analysis Practice in the Last Decades
,”
Sci. Total. Environ.
,
568
, pp.
666
670
.
21.
Staelen
,
R. H. D.
, and
Beddek
,
K.
,
2015
, “
Sensitivity Analysis and Variance Reduction in a Stochastic NDT Problem
,”
Int. J. Comput. Math.
,
92
(
9
), pp.
1874
1882
.
22.
Ghanem
,
R.
,
Higdon
,
D.
, and
,
H.
, eds.,
2017
,
Handbook of Uncertainty Quantification
, 1st ed.,
Springer International Publishing
,
Switzerland
, pp.
1
20
.
23.
Sher
,
A.
,
Wang
,
K.
,
Wathen
,
A.
,
Maybank
,
P.
,
Mirams
,
G.
,
Abramson
,
D.
,
Noble
,
D.
, and
Gavaghan
,
D.
,
2011
, “
A Local Sensitivity Analysis Method for Developing Biological Models with Identifiable Parameters: Application to Cardiac Ionic Channel Modelling
,”
Future Generat. Comput. Syst.
,
29
(
2
), pp.
591
598
.
24.
Homma
,
T.
, and
Saltelli
,
A.
,
1996
, “
Importance Measures in Global Sensitivity Analysis of Nonlinear Models
,”
Reliab. Eng. Syst. Safety
,
52
(
1
), pp.
1
17
.
25.
Morio
,
J.
,
2011
, “
Global and Local Sensitivity Analysis Methods for a Physical System
,”
Euro. J. Phys.
,
32
(
6
), pp.
1
9
.
26.
Sobol’
,
I.
, and
Kucherekoand
,
S.
,
1993
, “
Sensitivity Estimates for Nonlinear Mathematical Models
,”
Math. Model. Comput. Exper.
,
1
(
4
), pp.
407
414
.
27.
Sobol’
,
I.
,
2001
, “
Global Sensitivity Indices for Nonlinear Mathematical Models and Their Monte Carlo Estimates
,”
Math. Comput. Simulat.
,
55
(
1–3
), pp.
271
280
.
28.
Forrester
,
A. I. J.
, and
Keane
,
A. J.
,
2009
, “
,”
Prog. Aerosp. Sci.
,
45
(
1–3
), pp.
50
79
.
29.
Queipo
,
N. V.
,
Haftka
,
R. T.
,
Shyy
,
W.
,
Goel
,
T.
,
Vaidyanathan
,
R.
, and
Tucker
,
P. K.
,
2005
, “
Surrogate-Based Analysis and Optimization
,”
Prog. Aerosp. Sci.
,
41
(
1
), pp.
1
28
.
30.
Forrester
,
A.
,
Sobester
,
A.
, and
Keane
,
A.
,
2008
,
Engineering Design Via Surrogate Modelling: A Practical Guide
,
John Wiley and Sons, Ltd.
,
UK
.
31.
Peherstorfer
,
B.
,
Willcox
,
K.
, and
Gunzburger
,
M.
,
2018
, “
Survey of Multifidelity Methods in Uncertainty Propagation, Inference, and Optimization
,”
Soc. Indust. Appl. Math.
,
60
(
3
), pp.
550
591
.
32.
Forrester
,
I. J. A.
,
Sobester
,
A.
, and
Keane
,
J. A.
,
2007
, “
Multi-Fidelity Optimization Via Surrogate Modelling
,”
Proc. R. Soc. A Math. Phys. Eng. Sci.
,
463
(
2088
), pp.
3251
3269
.
33.
Wiener
,
N.
,
1938
, “
The Homogeneous Chaos
,”
Am. J. Math.
,
60
(
4
), pp.
897
936
.
34.
Krige
,
D. G.
,
1951
, “
A Statistical Approach to Some Basic Mine Valuation Problems on the Witwatersrand
,”
J. Chem. Metall. Mining Eng. Soc. South Africa
,
52
(
6
), pp.
119
139
.
35.
Li
,
D.
,
Wilson
,
P. A.
, and
Jiong
,
Z.
,
2015
, “
An Improved Support Vector Regression and Its Modelling of Manoeuvring Performance in Multidisciplinary Ship Design Optimization
,”
Int. J. Model. Simulat.
,
35
(
3–4
), pp.
122
128
.
36.
Kennedy
,
C. M.
, and
O’Hagan
,
A.
,
2000
, “
Predicting the Output From a Complex Computer Code When Fast Approximations Are Available
,”
Biometrika
,
87
(
1
), pp.
1
13
.
37.
Echeverria
,
D.
, and
Hemker
,
P.
,
2008
, “
Manifold Mapping: A Two-Level Optimization Technique
,”
Comput. Visualiz. Sci.
,
11
(
4
), pp.
193
206
.
38.
Bilicz
,
S.
,
Vazquez
,
E.
,
Gyimothy
,
S.
,
Pavo
,
J.
, and
Lambert
,
M.
,
2010
, “
Kriging for Eddy-Current Testing Problems
,”
IEEE. Trans. Magn.
,
46
(
8
), pp.
4582
4590
.
39.
Bilicz
,
S.
,
Lambert
,
M.
,
Gyimothy
,
S.
, and
Pavo
,
J.
,
2012
, “
Solution of Inverse Problems in Nondestructive Testing by a Kriging-Based Surrogate Model
,”
IEEE. Trans. Magn.
,
48
(
2
), pp.
495
498
.
40.
Miorelli
,
R.
,
Artusi
,
X.
,
,
B. A.
, and
Reboud
,
C.
,
2016
, “
Database Generation and Exploitation for Efficient and Intensive Simulation Studies
,”
Rev. Progress Quant. Nondestruct. Eval.
,
1706
(
1
), p.
180002:1
.
41.
Du
,
X.
,
Leifsson
,
L.
,
Meeker
,
W.
,
Gurrala
,
P.
,
Song
,
J.
, and
Roberts
,
R.
,
2019
, “
Efficient Model-Assisted Probability of Detection and Sensitivity Analysis for Ultrasonic Testing Simulations Using Stochastic Metamodeling
,”
J. Nondestrut. Eval. Diag. Prognostics Eng. Syst.
,
2
(
4
), p.
0410020
. 09.
42.
Blatman
,
G.
,
2009
, “
Adaptive Sparse Polynomial Chaos Expansion for Uncertainty Propagation and Sensitivity Analysis
,” Ph.D. thesis,
Blaise Pascal University - Clermont II. 3, 8, 9
,
Clermont, France
.
43.
Du
,
X.
, and
Leifsson
,
L.
,
2020
, “
Multifidelity Modeling by Polynomial Chaos-Based Cokriging to Enable Efficient Model-Based Reliability Analysis of NDT Systems
,”
J. Nondestruct. Eval.
,
39
(
3
), pp.
1
15
.
44.
Schmidhuber
,
J.
,
2015
, “
Deep Learning in Neural Networks: An Overview
,”
Neural Net.
,
61
, pp.
85
117
.
45.
Haykin
,
S. S.
,
2009
,
Neural Networks and Learning Machines
, 3rd ed.,
Pearson Education
,
.
46.
Goodfellow
,
I.
,
Bengio
,
Y.
, and
Courville
,
A.
,
2016
,
Deep Learning
,
The MIT Press
,
Cambridge, MA
.
47.
le Cun
,
Y.
,
1989
, “
Generalization of Network Design Strategies
,” Technical Report, University of Toronto, Toronto, Canada, June.
48.
Damianou
,
A.
, and
Lawrence
,
N.
,
2013
, “
Deep Gaussian Processes
,”
AISTATS
,
Scottsdale, AZ
,
Apr. 29–May 1
, pp.
207
215
.
49.
Salimbeni
,
H.
, and
Deisenroth
,
M.
,
2017
, “
Doubly Stochastic Variational Inference for Deep Gaussian Processes
,”
NIPS
,
Long Beach, CA
,
Dec. 4–9
, pp.
4588
4599
.
50.
Thompson
,
R. B.
, and
Gray
,
T. A.
,
1983
, “
Analytic Diffraction Corrections to Ultrasonic Scattering Measurements
.” Library of Congress Cataloging in Publication Data, Springer, 2A.
51.
Schmerr
,
L.
, and
Song
,
J.
,
2007
,
Ultrasonic Nondestructive Evaluation Systems
,
Springer Science + Business Media, LLC
,
New York
.
52.
McKay
,
M. D.
,
Beckman
,
R. J.
, and
Conover
,
W. J.
,
1979
, “
A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output From a Computer Code
,”
Technometrics
,
21
(
2
), pp.
239
245
.
53.
Metropolis
,
N.
, and
Ulam
,
S.
,
1949
, “
The Monte Carlo Method
,”
J. Am. Stat. Assoc.
,
44
(
247
), pp.
335
341
.
54.
Garg
,
V. V.
, and
Stogner
,
R. H.
,
2017
, “
Hierarchical Latin Hypercube Sampling
,”
J. Am. Stat. Assoc.
,
112
(
518
), pp.
673
682
.
55.
Chauvin
,
Y.
, and
Rumelhart
,
D. E.
,
1995
,
Backpropagation: Theory, Architectures, and Applications
,
Psychology Press
,
Hillsdale, NJ
.
56.
Kingma
,
D. P.
, and
Ba
,
J.
,
2015
, “
Adam: A Method for Stochastic Optimization
,”
ICLR
,
San Diego, CA
,
May 7–9
.
57.
,
M.
,
Agarwal
,
A.
,
Barham
,
P.
,
Brevdo
,
E.
,
Chen
,
Z.
,
Citro
,
C.
,
,
G. S.
,
Davis
,
A.
,
Dean
,
J.
,
Devin
,
M.
,
Ghemawat
,
S.
,
Goodfellow
,
I.
,
Harp
,
A.
,
Irving
,
G.
,
Isard
,
M.
,
Jia
,
Y.
,
Jozefowicz
,
R.
,
Kaiser
,
L.
,
Kudlur
,
M.
,
Levenberg
,
J.
,
Mané
,
D.
,
Monga
,
R.
,
Moore
,
S.
,
Murray
,
D.
,
Olah
,
C.
,
Schuster
,
M.
,
Shlens
,
J.
,
Steiner
,
B.
,
Sutskever
,
I.
,
Talwar
,
K.
,
Tucker
,
P.
,
Vanhoucke
,
V.
,
Vasudevan
,
V.
,
Viégas
,
F.
,
Vinyals
,
O.
,
Warden
,
P.
,
Wattenberg
,
M.
,
Wicke
,
M.
,
Yu
,
Y.
, and
Zheng
,
X.
,
2015
, “
TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems
,” Software Available From tensorflow.org.
58.
Ryu
,
J.
,
Kim
,
K.
,
Lee
,
T.
, and
Choi
,
D.
,
2002
, “
Kriging Interpolation Methods in Geostatistics and DACE Model
,”
Korean Soc. Mech. Engin. Int. J.
,
16
(
5
), pp.
619
632
.
59.
Gneiting
,
T.
,
Kleiber
,
W.
, and
Schlather
,
M.
,
2010
, “
Matern Cross-Covariance Functions for Multivariate Random Fields
,”
J. Am. Stat. Assoc.
,
105
(
491
), pp.
1167
1177
.