## Abstract

Design can be viewed as a sequential and iterative search process. Fundamental understanding and computational modeling of human sequential design decisions are essential for developing new methods in design automation and human–AI collaboration. This paper presents an approach for predicting designers’ future search behaviors in a sequential design process under an unknown objective function by combining sequence learning with game theory. While the majority of existing studies focus on analyzing sequential design decisions from the descriptive and prescriptive point of view, this study is motivated to develop a predictive framework. We use data containing designers’ actual sequential search decisions under competition collected from a black-box function optimization game developed previously. We integrate the long short-term memory networks with the Delta method to predict the next sampling point with a distribution, and combine this model with a non-cooperative game to predict whether a designer will stop searching the design space or not based on their belief of the opponent’s best design. In the function optimization game, the proposed model accurately predicts 82% of the next design variable values and 92% of the next function values in the test data with an upper and lower bound, suggesting that a long short-term memory network can effectively predict the next design decisions based on their past decisions. Further, the game-theoretic model predicts that 60.8% of the participants stop searching for designs sooner than they actually do while accurately predicting when the remaining 39.2% of the participants stop. These results suggest that a majority of the designers show a strong tendency to overestimate their opponents’ performance, leading them to spend more on searching for better designs than they would have, had they known their opponents’ actual performance.

## 1 Introduction

Design is an iterative process with multiple stages that can broadly be categorized into identifying a need, generating design concepts, detailed design, and implementation as exemplified by multiple formal processes [1,2]. During this process, designers usually do not have a complete understanding of the design or the evaluation space but rather acquire knowledge in a sequential manner through multiple design evaluations. Particularly in detailed design where the design domain is more or less bounded by the concept selected for further development, by using either simulation models or experiments, designers evaluate the candidates they “sample” from the design domain to develop an understanding of where to “search” for promising designs. This search is usually not random if any learning occurs throughout the sampling process.

In reality, such a design process is not simply an individual decision-making process but can also be influenced by how other designers make decisions. *Design under competition* is one of the most common interactive decision-making scenarios where design decisions are heavily influenced by competitors’ decisions [3]. Using car design as an example, in order to identify the design features that are most preferred by customers and determine the values of design attributes (e.g., engine size, cargo space, etc.), designers need to consider not only the information from their own company but also information from their competitors so as to produce competitive products in the market for profit. Because of this competition, designers’ extrinsic motivation could be biased, and their decisions may therefore become irrational. Hence, the research question of how designers make sequential design decisions under competition not only has practical implications in many engineering applications but also attracts scientific inquiry. The answer to this question can directly advance the understanding of human cognition in complex design scenarios and inform the development of forward-looking decision support systems for human–artificial intelligence (AI) collaboration. While human cognition has been known to suffer from several biases under uncertainty such as availability heuristic (i.e., using the information that easily comes to mind) or anchoring bias (i.e., relying on the earlier information to make future decisions) [4–6], a collaborative AI system developed considering human decision-making processes can potentially help designers (or enterprises) make rational decisions from design to business.

The study of human decision-making process can be approached from three perspectives, namely, descriptive, prescriptive, and predictive, depending on the research interests and the questions to be answered. Descriptive analysis uses data aggregation and data mining to provide insight into the past design behaviors and answers: “What has happened and how were the design decisions made?”. Prescriptive analysis uses modeling and optimization to advise on possible outcomes and answers: “How should the design decisions be made?” Finally, predictive analysis uses statistical models and forecasting techniques to understand the future and answers: “What could the future design decisions be?”.

Prior literature has studied sequential design processes from different perspectives. For example, in support of product development and project management, design structure matrices [7–9] have been used for task sequencing to identify the sequence that minimizes expected project completion time. Other studies have primarily focused on the design search process based on optimization algorithms. Multi-objective formulations have been introduced in the literature to study the design process sequentially advancing through smaller sets of alternatives using models of increasing fidelity [10,11]. Additionally, the expected value of perfect information (EVPI) [12], Bayesian optimization [13], genetic algorithms [14], and optimal learning [15] have been adopted in the literature to study optimal design sequences. These studies, however, differ from the work presented in this paper in that they investigate the optimal sequential decision process using normative models, which study how design decisions should be made.

In the present paper, we investigate *humans’* actual sequential design decisions. Chaudhari et al. [16] recently performed a study to identify models that provide the best description of a designer’s sequential decisions when multiple information sources are present and the total budget is limited. That study thus seeks the appropriate *descriptive* model for sequential design decisions. Computational models have also been used to learn and “replay” designers’ sequential decision-making using Markov chains [17–20], simulated annealing [21], Gaussian processes [22–24], and more recently deep learning-based methods [25,26]. Yet, no computational models have been developed to date for predicting designers’ sequential decisions with the consideration of competition.

In Ref. [24], the authors used a function optimization game for design research and studied designers’ information acquisition decisions under competition. That study developed normative models of design decisions under competition to answer questions more related to the descriptive and prescriptive aspects of sequential design decisions. In the present paper, we adopt the experimental settings (detailed in Sec. 4) of Ref. [24] and propose a predictive approach to model what future decisions individuals could make in a sequential design process under competition given their past decisions. The uniqueness of this approach is that it integrates, for the first time, sequence learning (using a long short-term memory (LSTM) network) with game theory (using a non-cooperative game) in studying engineering design under competition.

Specifically, we use prior data from human subject experiments with the function optimization game introduced in Ref. [27] to learn search behaviors of the participants (i.e., designers) under one-to-one competition and to predict their future decisions based on their past behaviors. The *research question* we aim to answer in this study is: “to what extent can the game theory and sequence learning-based models predict and explain designers’ sequential decisions under competition?” The answer to this question can provide insights as to how effective collaborative AI systems could be developed to pro-actively guide human design decision-making.

The rest of the paper is organized as follows. In Sec. 2, the technical background on the non-cooperative game and the LSTM model is provided. These two models are the core components in the proposed predictive framework. Section 3 presents how the research problem is formulated and the proposed research approach. In Sec. 4, we introduce the function optimization game, and describe the experimental settings based on such a game for the collection of the sequential design data. In Sec. 5, the proposed model which integrates LSTM and the non-cooperative game is introduced in detail. In Secs. 6 and 7, the results from both analysis and validation are presented and discussed. Insights from the analysis are also summarized in Sec. 7. At the end, we conclude this paper with a further discussion on how our approach can be transferred to other engineering design scenarios and the future work.

## 2 Technical Background

### 2.1 Non-Cooperative Games.

A design competition is typically modeled as a non-cooperative game. Non-cooperative games are often inherently zero-sum where what one wins the other loses. Designers’ behaviors are studied at the equilibrium of such games. Various game-theoretic models have been developed in design literature to provide insights into designers’ (or enterprises’) behaviors, their design options, and strategies. For example, market competition has been modeled with game theory, and the optimal price and design decisions under competition have been studied with long-run and short-run equilibrium solutions [3,28,29]. Further, game-theoretic models have been used to model rationality of the designers for collaborative, decentralized design scenarios [30].

In game theory, each player (or decision maker) is assumed to act with rational behaviors. That means they are self-interested agents whose goal is to maximize their own payoff. When using non-cooperative games to model a competition, the payoff (*π*_{i}) is dependent on the prize of the competition (e.g., the total revenue of a certain product if that product wins the market) and the probability of winning that prize. The probability of winning (*P*_{i}) is a function of the quality of the submissions from every competitor, and depends on the competitors’ characteristics (e.g., expertise) and inputs (e.g., effort, time investment). Hence, a general model of a design competition based on game theory has three main parts: quality function, winning probability function, and payoff function [27].

*Quality function* describes the quality of a design solution (*q*_{i}) as a function of the designers’ characteristics, such as their expertise (*K*_{i}), and the inputs, such as effort (*e*_{i}). For a real-time competition where competitors’ design decisions are known, the effort *e*_{i} is affected by the design characteristics and inputs from other designers in a rational decision-making process. Designers can adjust the amount of effort to spend on improving their design to find a design solution better than their competitors’ for maximum pay-off. In this paper, we assume that competitions are not in real time, i.e., there is no real-time information exchange, and designers make decisions based on their past experience with competitors, which is included in *K*_{i}. Thus, in this paper, *q*_{i} is assumed to be independent from the characteristics and the inputs of other designers in the competition.

*Winning probability function* defines each competitor’s probability of winning as a function of the quality of all submitted design solutions [31], and is typically modeled in an additive form,

$$P_i = \frac{f(q_i)}{\sum_{j} f(q_j)},$$

where *f*(*q*_{i}) is a non-negative increasing function and the sum runs over all competitors. The empirical results obtained from previous studies indicate that using a power function for the winning probability and an exponential function for the quality captures the designers’ winning status well in a two-player competition game [27].

*Payoff function* defines the expected value of the prize, e.g., in a winner-takes-all game, the payoff of an individual is

$$\pi_i = \Pi\, P_i - C_i,$$

where Π is the prize and *C*_{i} is the cost incurred in developing the solution. The Nash equilibrium of the game is generally used as the solution of the game. At the Nash equilibrium, a rational player *i* chooses the input (*e*_{i}) that is the best response to other players’ best responses.
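The winning probability and payoff functions described above can be illustrated numerically. The following is a minimal sketch, assuming the ratio form *P*_{i} = *f*(*q*_{i}) / Σ_{j} *f*(*q*_{j}) with a power function *f*(*q*) = *q*^{m}; the exponent `m`, the function names, and the example numbers are hypothetical and are not taken from the original study.

```python
def winning_probabilities(qualities, m=1.0):
    """Ratio-form winning probability with f(q) = q**m (m is an assumed exponent)."""
    scores = [q ** m for q in qualities]
    total = sum(scores)
    if total == 0:
        # All qualities are zero: split the chance evenly (assumption).
        return [1.0 / len(qualities)] * len(qualities)
    return [s / total for s in scores]


def expected_payoffs(qualities, costs, prize, m=1.0):
    """Winner-takes-all expected payoff: prize * P_i - C_i for each player."""
    probs = winning_probabilities(qualities, m)
    return [prize * p - c for p, c in zip(probs, costs)]


# Two players with qualities 3 and 1, a prize of 200 tokens,
# and costs of 60 and 40 tokens, respectively.
print(winning_probabilities([3.0, 1.0]))
print(expected_payoffs([3.0, 1.0], costs=[60.0, 40.0], prize=200.0))
```

A larger exponent `m` makes the contest more sensitive to quality differences, which is one reason the functional form has to be calibrated against data.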

### 2.2 Sequence Learning.

Several machine learning methods have been designed to learn from independent and identically distributed data. Sequential design decisions are a particular type of data that does not fall within that category since subsequent decisions are correlated with the previous ones. Recurrent neural networks (RNNs) in sequence learning literature have been developed for such problems [32,33]. These networks use the output of the hidden layer corresponding to the previous sequence as an input to the next sequence to retain dependencies. Drawbacks of the classical RNNs such as vanishing gradients where gradients become smaller in each layer led to the emergence of a more powerful variant of RNNs known as LSTM networks [34,35].

Long short-term memory networks consist of a repeated chain of cells each of which represents the network for a time-step. Adapted from Ref. [36], Fig. 1 depicts the overall network architecture and the contents of a cell. These networks are capable of capturing long-term relationships in sequential data by taking inputs from both the output and the internal state of the previous cell. Three *sigmoid* functions in Fig. 1 output values in [0, 1] and allow selecting what information to retain during learning. Two *tanh* functions in the figure scale the input and output data into [−1, 1] and allow the gradients to sustain for long time periods. Multiple variants of the LSTM networks exist. We refer readers to Ref. [35] for a detailed discussion on the practical use of these variants.
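The role of the three sigmoid gates and the two tanh functions can be seen in a single cell update. The snippet below is a minimal sketch of one step of a standard LSTM cell, reduced to a scalar input and scalar state for readability; the parameter layout and names are hypothetical, and the network used in this study is not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_cell_step(x, h_prev, c_prev, params):
    """One time-step of a standard LSTM cell with scalar input and state.

    params maps each gate name to a tuple (w_x, w_h, b).
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old state to keep
    i = sigmoid(pre_activation("input"))        # how much new candidate to add
    o = sigmoid(pre_activation("output"))       # how much state to expose
    g = math.tanh(pre_activation("candidate"))  # candidate state in [-1, 1]

    c = f * c_prev + i * g      # internal state passed to the next cell
    h = o * math.tanh(c)        # output passed to the next cell
    return h, c


# With all-zero parameters every gate equals 0.5 and the candidate is 0.
zero = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "output", "candidate")}
print(lstm_cell_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero))
```

The additive update of `c` is what lets information and gradients persist across many time-steps.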

In this paper, we use LSTM networks to predict future design decisions under a competition based on past decisions, assuming that past behavior can be related to future behaviors. Earlier, normative models have been used to explain designer behaviors under the same setting [24]. While these models provide interpretable outcomes, they make several assumptions regarding designer behavior such as assuming certain bounds for future design decisions. LSTM networks could extract complex behaviors without making such assumptions and use this knowledge to predict future design decisions. Also, a machine learning-based approach is more generalizable to different design problems.

Classical LSTM networks output a deterministic value for a given set of inputs, i.e., a point estimate of the expected value, while a model with uncertainty is necessary to predict future design decisions due to the natural variation in designer behaviors. We combine the classical LSTM network models with an existing method proposed for estimating prediction intervals from the literature, known as the Delta method [37], to develop a probabilistic model that predicts the designer decisions with a distribution.

Figure 2 shows an overview of the learning process in the predictive model presented in this paper. We use the sequential design data from a crowd to train an LSTM network that predicts the expected next designs to sample and the corresponding quality (i.e., function value in the game) based on past history of design decisions. Note that Fig. 2 depicts a minimization problem where smaller values of *y* in the vertical axis represent better designs. We use the LSTM network with the Delta method to estimate the uncertainty in the prediction. Finally, in a competition, designers form an opinion or a belief of how good the competitor’s design could be and determine how much to spend on improving the existing design based on that belief. We combine the future prediction and the corresponding uncertainty using game theory to estimate the designers’ belief regarding the quality of their opponent’s design and predict whether the designers will make the predicted design decision. For instance, the proposed model predicts that the designer will stop searching for new designs if the predicted design quality is better than the designer’s belief of the opponent’s design quality as depicted in Fig. 2. We present the details of our mathematical approach in Sec. 5.

## 3 Problem Formulation and Research Approach

In this study, we are particularly interested in designers’ decision-making behavior in the parametric design stage where a design problem has been already defined (i.e., the objective is set and the key design variables are known), and a designer’s task is to determine the value of a specific design variable (e.g., a dimension or a material property). The goal of the parametric design, therefore, is to search the design space to find a design that satisfies the requirements and constraints in an optimal way with respect to a given objective [38]. While there could be other views on this design process, this optimization-based view is adopted in this study. As highlighted by Papalambros and Wilde [39], “philosophically, optimization formalizes what humans (and designers) have always done. Operationally, it can be used in design, in any situation where analysis is used.” Based on this view, we focus on a design process with the following characteristics [24]:

- (C1) A designer’s goal is to find the best design quantified by certain objective values.

- (C2) Designers evaluate the performance of candidate designs, either through simulations or physical experiments.

- (C3) There is a cost associated with searching. The term “cost” is used more generally. Costs can either be monetary cost or effort (computational, personnel, etc.).

- (C4) More costs incurred in exploring the design space result in a better understanding of the design space, and therefore, designs of equal or potentially better quality.

In this paper, we focus on a design problem solved by individuals in a design competition that possesses all the characteristics we list in (C1–C4). In this competition, the expected payoff is not only determined by the quality of a designer’s own decisions but also by the decisions of other competing designers, see Eq. (2). For a contestant, further experimentation may yield better design quality, and hence a higher probability of winning (C1–C3), but also greater cost (C4). This situation is similar to the real-world competitive markets where companies need to balance the added value of building a new prototype and the corresponding cost in terms of time and budget.

We present a model to predict future design decisions in such a competition. We use the function optimization game discussed in Sec. 4.1, which emulates such a tournament, to collect data for sequential design decisions from participants. The predictive model integrates probabilistic models of sequence learning with game theory. The model predicts whether a designer will continue searching for better designs and, if so, what part of the design space this designer will explore and how much improvement the designer will achieve.

## 4 Experiment Description

### 4.1 Function Optimization Game.

Sha et al. [27] developed a function optimization game that creates a simplified scenario of design under competition yet captures the essence of the design process characteristics specified in Sec. 3. In this paper, we therefore use the data collected from this game experiment to support our investigation. Here, we briefly summarize the key features of this game and refer the reader to Ref. [27] for the details.

In the optimization game, the participants are asked to optimize a design characterized by a single variable *x* ∈ [−100, 100], and its performance is evaluated by an unknown function, *f*(*x*), as shown in Fig. 3. In this specific case, the participants are asked to minimize this unknown *f*(*x*). This resembles many real-world design problems where the functional behavior of an artifact is not completely known. Each participant can query the value of the function for a specified *x* at a cost of *c* tokens. Each participant plays the game against one other player, randomly selected during each period. At the end of each period, the participant whose design achieves a smaller function value wins the fixed prize (Π). This game embodies the sequential information acquisition decisions and enables the study of strategic decisions. The domain-independent nature of the problem reduces the variation among designers due to differences in knowledge background and thus reduces noise in the data, which is beneficial for testing the models.

### 4.2 Experimental Setup.

Based on this game setting, an experiment was carried out with 44 senior undergraduate Mechanical Engineering students at Purdue University by Sha et al. for an earlier study [27]. The present paper uses the same data to illustrate the application of the proposed model. The following points summarize the key experimental settings, and further details of the experimental data are provided in Sec. 6.1.

- The control factor is the cost. Each subject participated in two *treatments*: a low-cost treatment (*c* = 10 tokens) and a high-cost treatment (*c* = 20 tokens). In both treatments, the participants start with 200 tokens. The experiment consists of four sessions, each with two treatments ordered as follows. Session 1: low-cost treatment first and then high-cost treatment; Session 2: high-cost treatment first and then low-cost treatment; Session 3: high-cost treatment first and then low-cost treatment; and Session 4: low-cost treatment first and then high-cost treatment.

- The repetition is realized by 15 *periods*. A period refers to one full competition cycle between two players based on a randomly generated quadratic function. The coefficients *a* and *b* of the function *F*(*x*) = (*x* − *a*)^{2} + *b* are randomly drawn from a uniform distribution in each period. So, a participant plays the game 15 times in the low-cost treatment and 15 times in the high-cost treatment, with 30 different functions in total.

- The matching mechanism is random. In experiments involving partners, there may be chances for tit-for-tat strategies (learn from your partner). This could eventually affect the analysis of competitive decisions. So, in this experiment, two participants competing with each other are randomly matched, and at the beginning of each period, every pair of participants is re-matched.

- The awarding mechanism is winner-takes-all. For each period, the winning player receives the prize amount (200 tokens) minus the cost of sampling, whereas the losing player gets nothing. At the end of the experiment, the final prize is accumulated from the prize received from each period.
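The settings above can be mimicked in a few lines. The sketch below is only an illustration: the sampling ranges for *a* and *b* and the tie-breaking rule are not stated in the text and are assumptions, and all function names are hypothetical.

```python
import random


def make_period_function(rng, a_range=(-100.0, 100.0), b_range=(0.0, 10.0)):
    """Draw a random quadratic F(x) = (x - a)**2 + b (ranges are assumptions)."""
    a = rng.uniform(*a_range)
    b = rng.uniform(*b_range)
    return (lambda x: (x - a) ** 2 + b), a, b


def period_payoffs(best_values, n_samples, cost_per_sample, prize=200):
    """Winner-takes-all payoff for one period of a two-player game.

    The player with the smaller best function value wins the prize minus
    their own sampling cost; the loser gets nothing.
    """
    # Ties are not described in the text; this sketch favors player 0.
    winner = 0 if best_values[0] <= best_values[1] else 1
    payoffs = [0, 0]
    payoffs[winner] = prize - cost_per_sample * n_samples[winner]
    return winner, payoffs


# Low-cost treatment: player 0 took 7 samples, player 1 took 6.
print(period_payoffs(best_values=[0.5, 3.0], n_samples=[7, 6], cost_per_sample=10))
```

Because only the winner's own sampling cost enters the payoff, each extra query lowers the prize a player can net, which is the trade-off the stopping decision has to resolve.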

With these experimental settings, a participant needs to sequentially make two essential decisions multiple times in each period of the game: (a) which *x* to choose next and (b) whether to stop or not. The information available for these decisions is as follows: at the end of each period, the participants are informed whether they won or not, the best point achieved by the winner, and the actual optimal value of the function in that period. Please note that this feedback does not directly reveal the opponent’s searching strategy in the future periods because the function is randomly generated and the competitors are randomly matched at the beginning of a new period. However, this allows the players to form a belief (or a guess) about their opponents’ best designs in the subsequent periods.

## 5 Sequence Learning With Game Theory

We use the dataset obtained from the experiment described in Sec. 4 to train an LSTM network that gives a probabilistic output, and combine it with the game theoretic model proposed in Ref. [24]. The LSTM network and the game-theoretic model serve complementary purposes. While the LSTM network predicts a distribution for the next design sample and the corresponding function value, the game theoretic model calculates the belief that a designer has regarding the performance of the competitor and determines whether a designer will continue sampling based on the predicted performance improvement. When using LSTM networks and game theory, we make the following assumptions regarding designer behaviors:

- (A1) Predictability. Past design decisions and the corresponding outcomes are the predictors of future decisions and outcomes.

- (A2) Rational behaviors. Individuals stop sampling when the expected improvement in their payoff given by Eq. (3) is negative.

### 5.1 Prediction of Next Design Samples and Outcomes.

We use three types of historical information available to the players to predict the future behaviors: a given number of previous design samples, function values, and cost of sampling. Instead of training different models for each treatment, we use cost as an input to combine the information regarding both the number of samples a participant has taken (i.e., the number of design iterations) and treatment. Cost linearly increases with the number of samples where the treatment determines the unit (marginal) cost of each design sample. Integration of cost into our model is also an advantage over the normative approach since we provide a single unified model that describes designer behavior under different experimental conditions.

The LSTM network maps these inputs to a prediction of the next design sample and its function value, $(\tilde{x}_{p+1}, \tilde{y}_{p+1})$. Here, **x**_{p}, **y**_{p}, and **c**_{p} are vectors containing the *p* past design samples and their corresponding function and cost values, respectively, and **w** is the vector of network weights that minimizes the training error. A tilde denotes a predicted quantity. The experiment data contain sequences of varying length since each player freely determines how many samples to take from the design space. For training, we organize the entire data set into samples of length *p* + 1. For instance, if a player takes *q* samples from the design space in an experiment where *q* > *p*, that experiment gives *q* − *p* samples for training. Here, the number of steps to look back to obtain a good prediction depends on the application. While a longer sequence of past history could allow learning more complex behaviors, it also reduces the number of data points to use for training.
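The windowing described above can be sketched as follows. This is a minimal illustration, assuming that the cost input is the cumulative cost after each sample; the function name and the example numbers are hypothetical.

```python
def make_training_windows(xs, ys, costs, p=3):
    """Slice one period's sequence into (history, target) pairs of length p + 1.

    history holds p consecutive (x, y, cost) triples; target is the next (x, y).
    A sequence of q samples yields q - p pairs, and none if q <= p.
    """
    windows = []
    for start in range(len(xs) - p):
        stop = start + p
        history = list(zip(xs[start:stop], ys[start:stop], costs[start:stop]))
        target = (xs[stop], ys[stop])
        windows.append((history, target))
    return windows


# A player who took q = 7 samples at 10 tokens each gives 7 - 3 = 4 windows.
xs = [0, 10, 20, 30, 40, 50, 60]
ys = [x * x for x in xs]
costs = [10 * (k + 1) for k in range(len(xs))]  # cumulative cost (assumption)
windows = make_training_windows(xs, ys, costs, p=3)
print(len(windows))
print(windows[0])
```

Increasing `p` shortens the list of windows for every period, which is the trade-off between richer history and fewer training points noted above.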

### 5.2 Prediction of Uncertainty.

The LSTM network provides a deterministic output for a given set of inputs whereas in reality, due to several human factors that are not included in the LSTM model, future decisions from different designers vary even if their past decisions are the same. Therefore, while the deterministic output of the LSTM network could serve as a point estimate of the expected value, a probabilistic output is necessary to account for the natural variation in human behaviors. Also, the stopping criteria presented in Sec. 5.3 require a model of uncertainty in the future predictions. There are multiple approaches in the literature to estimate the distribution of the output prediction from neural network models as summarized in Ref. [40]. In the present study, we use the Delta method [37] for its computational efficiency. This method makes an assumption of normality of the overall prediction error and calculates the variance of the prediction output based on the training error and the sensitivity of the network output with respect to the prediction weights. We provide a summary of the final output and refer readers to the original references for derivations [37,40].

In the resulting expression for the prediction variance, *g*_{0} is the gradient of the output of the LSTM network with respect to the prediction weights evaluated at a test point **z**_{0}, and *J* is the Jacobian matrix of the LSTM network with respect to the prediction weights evaluated at the training samples **z**_{1}, …, **z**_{k}.
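For orientation, the form of the Delta method that commonly appears in the neural-network prediction-interval literature is sketched below. The symbols $\sigma_0^2$, $s^2$, $m$, and $\hat{y}$ are introduced here for illustration and may differ from the notation of Refs. [37,40]; this is a sketch under those assumptions rather than the exact expression used in this study.

$$
\sigma_0^2 = s^2\left(1 + g_0^{T}\left(J^{T}J\right)^{-1} g_0\right),
\qquad
s^2 = \frac{1}{k-m}\sum_{j=1}^{k}\left(y_j - \hat{y}_j\right)^2,
$$

$$
g_0 = \left.\frac{\partial \hat{y}}{\partial \mathbf{w}}\right|_{\mathbf{z}_0},
\qquad
J = \begin{bmatrix}
\left.\dfrac{\partial \hat{y}}{\partial \mathbf{w}}\right|_{\mathbf{z}_1}^{T} \\
\vdots \\
\left.\dfrac{\partial \hat{y}}{\partial \mathbf{w}}\right|_{\mathbf{z}_k}^{T}
\end{bmatrix}.
$$

Here $k$ is the number of training samples, $m$ is the number of prediction weights, and $s^2$ is the training error variance. The first term inside the parentheses accounts for the noise in the data, and the second term grows when the test point is sensitive to weights that the training data constrain poorly.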

### 5.3 Stopping Criteria.

Let (*i*) denote the information from the player *i* and (−*i*) denote the information from the corresponding opponent. At a given time, let $\tilde{y}^{*(-i)}$ be the best function value the player *i* believes that the opponent has achieved, and let $y^{*(i)}$ be the best function value that the player *i* has achieved up to that point. The stopping criterion from Ref. [24] is formulated in terms of these quantities, with *ϕ* denoting the standard cumulative normal distribution function. This criterion can be used to predict whether a player will continue sampling or not for a given $y^{*(-i)}$, or it can be used to estimate a lower and upper bound on $\tilde{y}^{*(-i)}$ using the data at the point where a player stopped sampling. This criterion combines the information learned from the crowd through the LSTM network and the information from the individual participants through their own beliefs regarding their opponents.
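To make the logic of assumption (A2) concrete, the following is an illustrative stand-in and not the criterion of Ref. [24]. It assumes that the next function value is normally distributed with mean `mu` and standard deviation `sigma`, that a player wins exactly when their best value is below the believed opponent best, and that one more sample costs `c` tokens; all function and parameter names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_payoff_gain(y_best_own, y_belief_opp, mu, sigma, prize, c):
    """Expected change in payoff from one more sample (minimization problem)."""
    if y_best_own < y_belief_opp:
        # Already believed to be winning: another sample only adds cost.
        return -c
    # Otherwise the sample pays off only if it lands below the believed
    # opponent best, which happens with this probability.
    p_beat = normal_cdf((y_belief_opp - mu) / sigma)
    return prize * p_beat - c


def should_stop(y_best_own, y_belief_opp, mu, sigma, prize, c):
    """Stop when the expected improvement in payoff is negative."""
    return expected_payoff_gain(y_best_own, y_belief_opp, mu, sigma, prize, c) < 0


# Believed to be behind, with an even chance of overtaking on the next sample.
print(should_stop(y_best_own=5.0, y_belief_opp=2.0, mu=2.0, sigma=1.0, prize=200, c=10))
```

Under this simplified rule, a player who overestimates the opponent (a belief that is too low in a minimization problem) keeps sampling longer, which is the qualitative effect discussed in the abstract.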

## 6 Results

### 6.1 Descriptive Analysis.

In this section, we present key statistics to describe the original experiment data obtained from the function optimization game described in Sec. 4. We report the median and interquartile range (IQR) instead of the mean and standard deviation for these statistics since a few outliers in the data impact the latter significantly. Since the experiment data consist of inputs from 44 participants and each participant plays the function optimization game for 15 periods in each of the two cost treatments, the data set covers 1320 periods in total. We omit the first five periods of each treatment for each participant (i.e., remove 440 periods) from our analysis to offset participants’ learning curves.

For the remaining 880 periods, the median number of samples the participants have taken is 7.0 for winning players and 6.0 for losing players, and the IQR is 4.0 for both. The medians of the absolute distance between participants’ best design sample and the true optimal design for winning and losing players are 0.48 and 2.79, with IQRs of 1.38 and 8.96, respectively. Similarly, the medians of the distance between participants’ best function value and the true optimum for winning and losing players are 0.05 and 1.56, with IQRs of 0.45 and 19.65, respectively. These results are summarized in Table 1. Note that these numbers refer to the absolute quantities without any normalization.

| Statistic | Median (Win) | Median (Loss) | IQR (Win) | IQR (Loss) |
|---|---|---|---|---|
| Number of samples | 7.0 | 6.0 | 4.0 | 4.0 |
| Absolute distance of the best x from the optimum | 0.48 | 2.79 | 1.38 | 8.96 |
| Absolute distance of the best y from the optimum | 0.05 | 1.56 | 0.45 | 19.65 |


The smaller variation in the performance (quantified by the distance between the design variable values/function values and the true optimum) of the winning players as compared to that in the losing players indicates that the players attempt to be competitive regardless of their opponent’s score. This result suggests that the players have developed a belief of a strong opponent regardless of the actual ability of their opponent.

### 6.2 Predictive Analysis.

#### 6.2.1 LSTM Network.

We use the history of the previous three design samples, function values, and the cost incurred to train an LSTM network that predicts the next design sample and the function value. We remove the data points where participants take fewer than four samples, which is the minimum number of samples needed for the LSTM model. Using the data after the fifth period in all treatments, we pre-process the data from 880 game periods to organize them into sequences of four consecutive samples, where we use the first three to predict the last. This pre-processing gives 2481 data points to use for the LSTM model. Note that the number of samples the participants take (reported in Table 1) directly affects the number of data points to use for training. Had the participants taken more samples, the data set would have been larger and richer. We use six-fold cross validation and present the results for one of the subsets in this section. All the results in this section are evaluated on the test data.

Using a normal distribution, we can estimate a lower and upper bound on the predicted quantity, i.e., the next design sample or function value, with a given confidence. Figure 4 shows the prediction intervals for the next design sample and function value (from Eq. (4)) with one standard deviation (from Eq. (5)) away from the mean. Filled dots and hollow circles represent the upper and lower bounds, respectively, and the solid line corresponds to the perfect prediction. The *x* and *y* axes in the figure represent normalized design variable and function values. In Fig. 4(a) the design interval of [−100, 100] is mapped to [0, 1], and in Fig. 4(b), 0 corresponds to the true optimum of the function that participants try to minimize and 1 corresponds to the maximum function value observed by two opposing participants. The prediction intervals shown in Fig. 4(a) accurately bound 82% of the test points from above and below, and in Fig. 4(b), the accuracy is 92%. These prediction intervals can be narrowed down at the expense of losing accuracy.
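As a minimal sketch of how such an interval and its empirical accuracy could be computed, the snippet below builds a symmetric interval of *k* standard deviations around each predicted mean and counts how many actual values fall inside. The function names and the toy numbers are hypothetical; the predicted means and standard deviations would come from Eqs. (4) and (5), which are not reproduced here.

```python
from typing import Sequence, Tuple


def normalize_design(x: float, low: float = -100.0, high: float = 100.0) -> float:
    """Map a design value from [low, high] to [0, 1], as in Fig. 4(a)."""
    return (x - low) / (high - low)


def interval(mean: float, std: float, k: float = 1.0) -> Tuple[float, float]:
    """Symmetric interval k standard deviations around the predicted mean."""
    return mean - k * std, mean + k * std


def coverage(actual: Sequence[float], means: Sequence[float],
             stds: Sequence[float], k: float = 1.0) -> float:
    """Fraction of actual values bounded from above and below by their interval."""
    hits = 0
    for y, m, s in zip(actual, means, stds):
        lower, upper = interval(m, s, k)
        hits += lower <= y <= upper
    return hits / len(actual)


# Made-up normalized values, for illustration only.
actual = [0.10, 0.30, 0.52, 0.90]
means = [0.12, 0.25, 0.50, 0.60]
stds = [0.05, 0.04, 0.05, 0.10]

print(coverage(actual, means, stds, k=1.0))  # narrower intervals
print(coverage(actual, means, stds, k=2.0))  # wider intervals
```

Comparing the two calls shows the trade-off noted above: widening the interval raises the fraction of bounded points, and narrowing it lowers that fraction.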

Next, we present the prediction results for the overall behavior of the participants throughout the search process. The plot in Fig. 5(a) shows that the difference between two consecutive function values in the search process |Δ*y*| gradually decreases with the number of iterations (i.e., the samples that participants take) in the actual data. These results mean that the participants tend to make larger changes in their design early in the process compared to later. The plot in Fig. 5(b) depicts the same results from the prediction model. A similar trend is observed for the design samples but we omit those results for brevity. Since the LSTM model needs a past history of three samples, the prediction results start from the fourth sample. The overlap between the two plots shows that the predictive model can reasonably capture the exploration and exploitation behavior as a function of the design iteration in this experiment. As opposed to the normative models from the literature such as simulated annealing [21] or Gaussian processes [24,27], the proposed model learns the exploration and exploitation from the input data without “hard-wiring” these behaviors.
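The quantity |Δ*y*| per iteration can be computed along the following lines. This is a sketch under the assumption that each period is available as a list of function values in sampling order and that the values are averaged across periods; the aggregation actually used for Fig. 5 is not stated here, and the function name and toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def mean_abs_delta_by_iteration(periods: Sequence[Sequence[float]]) -> Dict[int, float]:
    """Average |y_k - y_(k-1)| across periods, keyed by the 1-based sample index k."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ys in periods:
        for k in range(1, len(ys)):
            # Index k + 1 is the 1-based position of the later sample in the pair.
            buckets[k + 1].append(abs(ys[k] - ys[k - 1]))
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}


# Two made-up periods of normalized function values.
periods = [[1.0, 0.5, 0.25, 0.25], [0.75, 0.25, 0.125]]
print(mean_abs_delta_by_iteration(periods))
```

A decreasing sequence of values over the iteration index corresponds to the shift from exploration to exploitation discussed above.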

#### 6.2.2 Game Theory.

We also present results for the ability of the game-theoretic model in Eq. (8) to predict and explain some of the participant behaviors under competition. First, we show the results for the participants’ belief for their opponent’s best function value $\tilde{y}_{*}^{(-i)}$. We apply Eq. (8), which uses the predictions from the LSTM network, to the data points where the participants stop sampling. Using the values for $\mu_{\tilde{y}_0}$ and $\sigma_{\tilde{y}_0}$ from the LSTM network and the participants’ best score at the time $y_{*}^{(i)}$ from the dataset, we obtain an upper and lower bound for $\tilde{y}_{*}^{(-i)}$ from the two inequalities in Eq. (8). Figure 6 compares these bounds for the participants’ belief with the actual best score of the opponent. The error in the upper and lower bounds increases as the opponent’s best function value increases. Fifty-three percent of the actual opponent scores fall within the bounds as shown in Fig. 6.

We also use the game-theoretic model to predict when the participants stop sampling given their opponent’s best score. Figure 7 shows this prediction in comparison to the actual point where the participants stopped sampling. The comparison starts from the fourth sample, which is the earliest point in the sampling process our model can predict. The figure shows that the game-theoretic model generally predicts an earlier stop compared to the actual data.

## 7 Discussion

In this section, we interpret and discuss the results presented in Sec. 6 and provide some key take-aways.

The prediction of the next design sample using our LSTM network provides an accurate interval when the next design sample yields a small function value. Figure 4(b) shows that if the next function value is small (e.g., *y*_{0} < 0.1), the estimate for the upper bound is mostly above and the lower bound is mostly below the line (with an accuracy of 96%). The error in the prediction interval tends to increase if the next function value is large. This error can partially be attributed to the lack of sufficient data points with a large *y*_{0} value. Considering that smaller function values represent better designs in this study, learning from the crowd data in the LSTM network is biased toward behaviors that yield better design outcomes. Such a result is expected since the primary goal of the participants in the design contest is to improve their designs and the crowd data contain that behavior.

The present LSTM network also has the ability to capture some interpretable designer behaviors in the experiment. The results in Fig. 5 show the gradual transition from exploration to exploitation as the participants take more samples from the design process. Since the participants gain a better knowledge of the design space with more samples, it is expected for participants to search for a promising region in the design space with exploration early in the process and focus on fine-tuning their results later. Recall that each sample incurs a cost. Therefore, it is not reasonable to make large changes in the design after identifying a promising region to search. This conclusion is consistent with that obtained from the analysis by the Wiener process model (a normative model) previously studied by Panchal et al. [24].

The results from the game-theoretic model show a clear tendency to assume all opponents to be well-performing designers in this experiment. In Fig. 6, the upper and lower bounds for the participants’ belief regarding the opponent score obtained from the game-theoretic model are generally lower than the actual score. Since lower scores represent better designs, these bounds predict the opponents to perform better than what they actually achieve. The gap between the results from the model and actual data increases as the opponent score becomes worse.

There are several reasons for the difference between the game-theoretic prediction and the real opponent scores. First, the game-theoretic model assumes that the players stop sampling when the expected improvement in their payoff is negative, as discussed in Sec. 5. The actual behaviors may violate that assumption. As a result, if the participants keep sampling at the expense of unnecessary cost, the model predicts a lower opponent score than the actual one. While the overall accuracy of these bounds is 53%, the accuracy increases to 69% when we use only the opponents who achieved a relatively good score of $y_{*}^{(-i)} < 0.1$. Second, the prediction results are compared with the actual opponent score rather than the participants’ belief, which is not collected during the experiment. Some of the inaccuracy may stem from the error in the participants’ belief. Finally, a small amount of error coming from the LSTM prediction (a mean squared error of 0.005 between the predicted $\mu_{\tilde{y}_0}$ and the actual *y*_{0} in the test data) also contributes.

The results in Fig. 7 depict the same characteristics of our game-theoretic model from a different perspective. Using the actual score of the opponent, this figure shows that the prediction from our game-theoretic model is always less than or equal to the actual number of samples that participants take, i.e., there are no circles above the main diagonal as shown in Fig. 7. The size of the circles represents the frequency of occurrence. The circles on the main diagonal correspond to the cases where the prediction matches the reality. In this figure, 39.2% of the data points fall on the main diagonal and the remaining 60.8% of the points are below the diagonal. Circles become smaller as the actual number of samples participants take increases since there are relatively fewer participants who take a large number of samples in a single period. This figure shows that if the participants knew the actual opponent score, they would stop earlier under the assumptions of the game-theoretic model. The figure also supports our claim that the participants’ belief regarding their opponent’s function value is smaller (or better) than what their opponents actually achieved.
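The on-diagonal and below-diagonal shares reported for Fig. 7 amount to a simple tally of predicted against actual stopping points. The sketch below shows that tally on made-up data; the function name and the sample counts are hypothetical, and the predicted stopping points themselves would come from the game-theoretic model in Eq. (8), which is not reimplemented here.

```python
from typing import Sequence, Tuple


def stop_agreement(predicted: Sequence[int], actual: Sequence[int]) -> Tuple[float, float, float]:
    """Fractions of periods where the predicted stop is equal to, earlier than,
    or later than the actual stop (on, below, or above the main diagonal)."""
    n = len(actual)
    equal = sum(p == a for p, a in zip(predicted, actual))
    earlier = sum(p < a for p, a in zip(predicted, actual))
    later = n - equal - earlier
    return equal / n, earlier / n, later / n


# Made-up stopping points (number of samples taken before stopping).
predicted = [4, 4, 5, 6, 4]
actual = [4, 6, 5, 9, 7]

on_diag, below_diag, above_diag = stop_agreement(predicted, actual)
print(on_diag, below_diag, above_diag)
```

In the terms used above, a zero share in the third position corresponds to the absence of circles above the main diagonal.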

Note that the results we present represent a particular subset from a six-fold cross-validation approach. Each subset provides similar outcomes where the interpretations and conclusions are not affected by which subset is used. For instance, the prediction accuracy in Fig. 4(b) is 92% for the subset we present. The other subsets yield an accuracy of 93%, 88%, 89%, 88%, and 95%. Therefore, we omit the full six-fold cross-validation results for brevity.
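For reference, the six fold accuracies quoted above can be summarized with a short calculation. The snippet only restates the numbers given in the text; the variable names are illustrative.

```python
# Fold accuracies (%) for Fig. 4(b) as quoted in the text;
# the first entry is the subset presented in this paper.
fold_accuracy = [92, 93, 88, 89, 88, 95]

mean_accuracy = sum(fold_accuracy) / len(fold_accuracy)
spread = max(fold_accuracy) - min(fold_accuracy)

print(f"mean = {mean_accuracy:.1f}%, "
      f"range = {min(fold_accuracy)}-{max(fold_accuracy)}% "
      f"(spread of {spread} percentage points)")
```

The presented subset (92%) lies close to the mean across folds, which is consistent with the statement that the choice of subset does not change the conclusions.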

## 8 Conclusions

We presented a mathematical model to predict sequential design decisions from human designers based on their past decisions. The model uses LSTM networks to predict expected future design decisions, the Delta method to represent the prediction uncertainty, and game theory to model the condition to stop searching for better designs based on the designers’ belief of their opponents’ performance. We use data collected from a function optimization game developed previously in the literature to illustrate the application of the proposed method.

The results for that application indicate that a long short-term memory network can predict the next design decisions and corresponding outcomes based on their past decisions. The results further suggest that the designers show a strong tendency to overestimate their opponents’ performance, leading them to spend more on searching for a better design than they would have, had they known their opponents’ actual performance. Our result is further supported and explained by previous empirical studies in social cognition. For example, a study in psychology performed a controlled experiment and observed that people were inclined to underestimate how good their partners were but to overestimate their opponents [41].

The application we choose is abstract enough to be domain independent. The mathematical framework we present is generalizable to the design problems that possess the characteristics of (C1–C4) under the assumptions of (A1–A2).

### 8.1 Take-Aways.

This study further advances the prior research on understanding human sequential decision-making. The results complement the normative models in the existing literature [24,27]. Similar conclusions regarding exploration and exploitation behaviors of the participants and the participants’ belief for a strong opponent further validate the model. Learning these behaviors from the input data automatically without explicitly building them into the model is a key advantage of the proposed approach over the normative models. This paper presents implications of such behaviors on prediction outcomes.

The present results suggest some practical implications for the development of AI-based decision support systems. The difference between predicted participant behaviors from game theory (based on the assumption of rationality) and the actual behaviors highlights a need for influencing human beliefs regarding opponent performance. A forward-looking decision support agent might provide a more objective view of the opponent and guide humans to make better informed decisions. Additionally, certain mechanisms or interventions (e.g., fake opponents) can be developed to artificially influence participants’ behaviors in the direction that the system designer desires.

Further, the results from our application show that the model generally learns behaviors that result in design improvements from the crowd since a significant majority of data points reflect such behaviors. In other applications, if the goal is to also learn low-performing design behaviors, a separate model trained with such decisions might be necessary.

### 8.2 Limitations and Future Work.

There are multiple ways to study design competition. The present set of results represents a non-real-time competition as opposed to a real-time competition where there is continuous information sharing between opponents. It will be interesting to study how the participants’ behavior, particularly their perception of the opponents’ performance, varies when there is real-time information sharing. The results are also obtained from a one-on-one competition, whereas a team-based competition may yield different competition dynamics. Further, a team-based competition could be studied to analyze both inter-team and intra-team competition.

The present study uses LSTM networks to predict human design decisions and the Delta method to estimate the uncertainty in the network models. A comparison of alternative machine learning approaches such as deep neural networks to predict sequential design decisions and alternative uncertainty prediction methods such as Bayesian approaches is left to a future study. These methods could offer varying levels of success and computational efficiency in different design problem contexts. The approach presented in this paper is general enough for testing these methods without making any changes in the game-theoretic construct when investigating 1–1 competitions.

The implications of our results for human–AI collaboration need verification with further controlled experiments. It would be interesting to study how human factors play a role in collaboration with a rational decision support agent. In addition, the conclusions we drew from the abstract function optimization game also need empirical validation with more concrete design problems.

## Acknowledgment

The authors greatly appreciate Dr. Jitesh Panchal for sharing the data collected from the function optimization game. The authors also gratefully acknowledge the financial support from the U.S. National Science Foundation (NSF) through grant CMMI-1842588. Any opinions, findings, and conclusions or recommendations expressed in this publication, however, are those of the authors and do not necessarily reflect the views of NSF.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request. The authors attest that all data for this study are included in the paper. Data provided by a third party are listed in Acknowledgements.

## References

*The Power of Modularity*, Design Rules