Abstract
Bayesian optimization is a metamodel-based global optimization approach that can balance between exploration and exploitation. It has been widely used to solve single-objective optimization problems. In engineering design, making trade-offs between multiple conflicting objectives is common. In this work, a multi-objective Bayesian optimization approach is proposed to obtain the Pareto solutions. A novel acquisition function is proposed to determine the next sample point, which helps improve the diversity and convergence of the Pareto solutions. The proposed approach is compared with some state-of-the-art metamodel-based multi-objective optimization approaches with four numerical examples and one engineering case. The results show that the proposed approach can obtain satisfactory Pareto solutions with significantly reduced computational cost.
1 Introduction
In simulation-based design optimization, the objective functions are usually expensive to be evaluated [1]. In order to reduce the number of function evaluations during the search, metamodels or surrogates, including Kriging models [2,3], radial basis functions [4], artificial neural networks [5], and polynomial-based approximations [6], are commonly used to assist design optimization. Particularly, Kriging is a metamodeling approach that supports sequential sampling to reduce the overall number of samples and has been extensively applied for design optimization under uncertainty [7–12].
Recently, a new metamodel-assisted global optimization approach, Bayesian optimization (BO), attracted research attentions. The uniqueness of BO is that the sequential sampling strategies are designed to strike a balance between exploration and exploitation for robust global optimization under uncertainty. In addition to the surrogate model as an approximation of the objective function, an acquisition function is also constructed in the same solution space to guide the sampling procedure in BO. The new sample is taken to maximize the acquisition function for each iteration, instead of searching directly based on the objective surrogate. Commonly used acquisition functions include expected improvement (EI) [13,14], probability of improvement (PI) [15,16], and lower confidence bound (LCB) [17,18], where uncertainty associated with the predicted objective values from the surrogates is considered. BO has been studied for a wide range of applications, such as robotics [19,20], combinatorial optimization [21], machine learning [22], simulation-based design optimization [23,24], and others.
Most of the existing BO studies focused on single-objective optimization. In many real-world problems, multiple objectives need to be considered and are likely to be conflicting with each other. The purpose of multi-objective optimization is to find a set of optimum solutions in the Pareto sense so that there is no optimum solution that is superior to the other with respect to all objectives [25,26]. Heuristic algorithms, such as multi-objective particle swarm optimization and multi-objective genetic algorithm, are commonly used to obtain the Pareto solutions of multi-objective optimization problems because of their robustness in convergence [27–29]. Since these population-based algorithms require a large number of function evaluations in order to converge to Pareto solutions [26,30,31], a common strategy to improve the efficiency is using metamodels to replace the objectives [32–37]. In the metamodel-assisted multi-objective optimization approaches, usually one metamodel is constructed separately for each objective. The cost of function evaluation for the surrogates is much lower than that for the original objective functions.
Very limited work has been done on how to formulate BO for efficient multi-objective optimization. The research focus of multi-objective BO (MOBO) is how to design acquisition functions for solving multi-objective problems efficiently. In the existing work, the EI function has been extended to different forms in MOBO. A straightforward extension is to define multi-objective EI as the expected improvement based on the joint probability of improvement from individual objectives with the assumption of independence between objectives [38–41]. Knowles [38] proposed the ParEGO approach, which is an extension of the single-objective efficient global optimization or EI approach. Ponweiser et al. [40] introduced a new EI-based multi-objective optimization, which utilizes the S-metric or hypervolume contribution to decide which sample point to evaluate next. Keane [42] combined the design of experiment methods with Kriging models to enable the parallel evolution of multi-objective Pareto sets. Feliot et al. [43] proposed a Bayesian multi-objective optimization (BMOO) approach that handles multiple objectives and non-linear constraints simultaneously, in which a one-step look-ahead Bayesian strategy is used. Forrester et al. [44] defined the parental improvement based on the Euclidean distance. Bautista [45] defined it as the minimum improvement among all objectives. Pandita et al. [46] provided a reformulation of expected improvement over the dominated hypervolume to solve stochastic multi-objective optimization problems. Calandra et al. [47] proposed to exploit the metamodels to improve the estimation of the final Pareto front. However, in these approaches, the calculation of the expected improvement needs to compute multi-dimensional integrals, which usually relies on Monte Carlo sampling and is computationally expensive. To solve this problem, Zhan et al. [48] proposed the EI matrix (EIM) for multi-objective problems, where the elements in the EIM are the single-objective EIs for each sample point with respect to each objective, and an aggregated scalar value based on all elements in the EIM is used to guide the sampling.
In this paper, a new MOBO approach is proposed to solve multi-objective optimization problems with a new acquisition function. In the proposed approach, two metrics of the relative hyperarea difference (RHD) and overall spread (OS) [49,50] are modified to better measure the convergence and diversity of Pareto solutions. The new acquisition function is defined to simultaneously improve the convergence and diversity of Pareto solutions, in which the modified RHD and OS are combined, and the LCB functions are employed to consider the metamodel uncertainty.
The rest of this paper is organized as follows: in Sec. 2, the background of multi-objective optimization and BO is introduced. In Sec. 3, the proposed MOBO approach is presented. In Sec. 4, the performance and efficiency of the proposed MOBO approach are demonstrated using four numerical examples and an engineering case. The proposed approach is also compared with existing metamodel-based approaches. The advantages of the proposed approach are analyzed and summarized. Section 5 concludes the paper.
2 Background
2.1 Multi-Objective Optimization.
2.2 Kriging Model.
Kriging or Gaussian process regression model is usually used as the surrogate in BO. It provides the predictions of not only the mean value at an unobserved design location but also the associated uncertainty as variance.
2.3 Bayesian Optimization.
Bayesian optimization is a metamodel-based approach for global optimization under uncertainty with sequential sampling. The most used metamodel to approximate the unknown objective function f(x) is based on Kriging. An acquisition function is defined to make a trade-off between exploration (sampling in the most uncertain region) and exploitation (sampling in the region with the best predictions) in the sampling or searching process. Commonly used acquisition functions include EI, PI, and LCB functions. Here, the LCB function is introduced. More details about the EI and PI functions can be found in Ref. [24].
3 The Proposed MOBO Formulation
In the proposed multi-objective Bayesian optimization formulation, two metrics for the quality of Pareto set, the RHD and OS, are modified and used to define the new acquisition function of MOBO.
3.1 Modified Quality Metrics.
The RHD and OS are usually used to measure the quality of Pareto frontiers [49,50]. As illustrated in Fig. 1, RHD and OS represent the convergence and diversity of the obtained Pareto frontier, respectively. The RHD is the area of the polygon formed by the Pareto frontier and an ideal solution that is best in all objectives. The smaller the value of RHD is, the higher convergence of the Pareto frontier is. The OS is the bounding box of the frontier. The larger the value of OS is, the higher diversity of the Pareto frontier is.
The main disadvantages of the RHD and OS are (1) the extremely good and bad solutions pgood and pbad need to be defined before the calculation of the two metrics and (2) when a new Pareto solution is found between two Pareto solutions, the value of RHD for a Pareto front will increase, instead of monotonically decreasing. This contradicts to the original intention that when two Pareto frontiers are compared, the one with a smaller RHD is supposed to be better.
To overcome the above two shortcomings in calcuating RHD and OS and using them to measure the improvement, the modified hyperarea difference (MHD) and modified overall spread (MOS) are proposed here, as shown in Fig. 2. The shaded areas in Figs. 2(a) and 2(b) represent the values of the MHD and MOS, respectively. MHD and MOS are not ratios as in the original RHD and OS. Thus, the calculations do not rely on the hypothetically good and bad solutions. The areas are calculated based on the existing frontier. Furthermore, the value of MHD will decrease monotonically when a new Pareto point is found between two existing Pareto solutions. As illustrated in Fig. 3(a), the area for MHD monotonically reduces as new Pareto solutions (e.g., point e) are found and included in the Pareto set, which is different from RHD as in Fig. 3(b). Therefore, when MHD is used in the acquisition function, it encourages finding more Pareto solutions, which will provide designers more choices.

The changes in the values of MHD and RHD when a new Pareto solution are found. (a) Change in the value of MHD and (b) change in the value of RHD.
The effects of new Pareto solutions on the new metrics are further illustrated with a simple example in Fig. 4. Based on the four Pareto solutions in Fig. 4(a), MHD and MOS for the two-objective minimization problem are calculated to be 3.0 and 6.25, respectively. When a new Pareto solution outside the range of the existing ones is inserted, either at the upper left or lower right of the region as shown in Fig. 4(b), the diversity is improved with a larger MOS value. As shown in Fig. 4(c), a new solution that dominates some existing solutions will improve the convergence with a smaller MHD value. Also shown in Fig. 4(d), a new solution that does not dominate any existing solution will improve the convergence and reduce the MHD value. Therefore, a new Pareto solution can always improve either diversity (increasing MOS) or convergence (reducing MHD).
3.2 The Proposed Acquisition Function.
The basic requirement of multi-objective optimization is to obtain a sufficient number of Pareto solutions with good convergence and diversity. Since a new Pareto solution always brings an improvement of either MHD or MOS, our acquisition function is defined to simultaneously and rapidly improve both the convergence and diversity of the Pareto solutions.
3.3 The New MOBO Procedure.
The proposed MOBO approach begins with the initial sampling. The basic requirement is that the initial sample points are distributed uniformly in the whole design space to provide each region of the design space the equal opportunity to be detected. Hence, Latin hypercube sampling (LHS) [54,55] can be used to generate the initial sample set.
The MOBO procedure is summarized in Fig. 7, which includes the following steps:
Step 1: Generate an initial sample set by LHS and obtain the true responses from the original objectives.
Step 2: Construct the initial Kriging models based on the initial sample points, one model for each objective. The design and analysis of computer experiments toolbox [56] is used to construct the Kriging models.
Step 3: Sort the sample points to obtain the current Pareto solutions. Here, a fast non-dominated sorting approach is used [57]. In this study, the sorting approach is simplified in two ways to save the sorting time: (1) the simplified sorting approach only divides the sample points into dominated and non-dominated solutions, while the original approach classifies the sample points into multiple dominance levels and (2) the sample points that have been classified as dominated solutions are not sorted in the subsequent iterations.
Steps 4 and 5: The next sample point is determined by maximizing the proposed acquisition function. The new sample point is used to update the Kriging models and the acquisition function.
Step 6: If the stop criterion is not met, go to Step 3. Otherwise stop and output the Pareto solutions.
The optimization algorithm terminates when the following two conditions are met simultaneously:
When the difference between the current quality metrics and the ones in the previous iteration is less than a specific value (e.g., 1% in this study), or the number of iterations reaches a predetermined maximum level, which is set as 200 in this study.
When the optimization procedure finds at least k solutions so that a sufficient number of Pareto solutions give the decision maker enough choices. In this study, k is set as 20.
4 Examples and Results
In this section, several commonly used numerical examples (ZDT1, ZDT2, ZDT3, and FON) adapted from Refs. [32,58] and an engineering case adapted from Ref. [59] are used to demonstrate the applicability and efficiency of the proposed MOBO approach.
Since the EIM [48] and BMOO approaches [43] are the most relevant work, the proposed approach is compared with them through the test examples. For the EIM approaches, three multi-objective criteria were developed by combining the elements in the EIM into scalar functions in three different ways: (1) Euclidean distance-based EIM (EIMe), (2) maximin distance-based EIM (EIMm), and (3) hypervolume-based EIM (EIMh). For all of the approaches, the same initial sampling strategy is used and the initial sampling size is set as 30. The same stop criterion described in Sec. 3.3 is also used for all approaches.
4.1 Numerical Examples.
Each numerical example is solved 30 times with the same approach to account for the influence of randomness. The formulations of the four examples are listed in Table 1. For ZDT1, ZDT2, and ZDT3, the Kriging model is only constructed for f2(x) because f1(x) is very simple. For the FON problem, two Kriging models are constructed for the two objectives respectively.
The formulations of the numerical examples
Cases | Formulation |
---|---|
ZDT1 | |
ZDT2 | |
ZDT3 | |
FON |
Cases | Formulation |
---|---|
ZDT1 | |
ZDT2 | |
ZDT3 | |
FON |
As shown in Fig. 8, the Pareto frontiers from the EIM’s, BMOO, and the proposed MOBO approaches are in good agreement with the true Pareto frontier. This indicates that these metamodel-based approaches can obtain satisfactory Pareto solutions.
The comparisons of the quality metrics and computational efficiency for different approaches are summarized in Table 2. In Table 2, FC denotes the number of original function evaluations. The value ranges, means, and STDs of the MHD, MOS, and FC from 30 runs are listed. The box charts of the MHD, MOS, and FC obtained by different approaches for the ZDT1 problem are also shown in Fig. 9.

The box charts of the MHD, MOS, and FC obtained by different approaches. (a) Box chart of MHD, (b) box chart of MOS, and (c) box chart of FC.
Comparison of the results from different approaches for the ZDT1 problem
EIMe | EIMm | EIMh | BMOO | Proposed approach | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | |
MHD | [0.35, 0.38] | 0.37 | 0.01 | [0.34, 0.37] | 0.36 | 0.01 | [0.36, 0.37] | 0.36 | 0.00 | [0.36, 0.43] | 0.38 | 0.02 | [0.36, 0.37] | 0.36 | 0.00 |
MOS | [1.00, 1.56] | 1.13 | 0.15 | [1.01, 1.46] | 1.11 | 0.11 | [1.00, 1.57] | 1.11 | 0.16 | [0.95, 2.17] | 1.25 | 0.34 | [0.99, 1.01] | 1.01 | 0.00 |
FC | [53, 131] | 89.93 | 18.63 | [60, 150] | 91.20 | 19.54 | [67, 207] | 110.00 | 33.12 | [61, 175] | 105.27 | 28.45 | [50, 80] | 52.73 | 6.60 |
EIMe | EIMm | EIMh | BMOO | Proposed approach | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | |
MHD | [0.35, 0.38] | 0.37 | 0.01 | [0.34, 0.37] | 0.36 | 0.01 | [0.36, 0.37] | 0.36 | 0.00 | [0.36, 0.43] | 0.38 | 0.02 | [0.36, 0.37] | 0.36 | 0.00 |
MOS | [1.00, 1.56] | 1.13 | 0.15 | [1.01, 1.46] | 1.11 | 0.11 | [1.00, 1.57] | 1.11 | 0.16 | [0.95, 2.17] | 1.25 | 0.34 | [0.99, 1.01] | 1.01 | 0.00 |
FC | [53, 131] | 89.93 | 18.63 | [60, 150] | 91.20 | 19.54 | [67, 207] | 110.00 | 33.12 | [61, 175] | 105.27 | 28.45 | [50, 80] | 52.73 | 6.60 |
From Table 2 and Fig. 9, it can be seen that the means of quality metrics of Pareto solutions obtained by all four approaches are close to the ideal quality metrics. The proposed approach however shows the best stability with the minimum standard deviations. In terms of computational cost, the number of function evaluations in the proposed approach is significantly lower than those of other approaches. The proposed approach also has the smallest standard deviation of FC. The results show that the proposed approach is more efficient and stable than the other three approaches. To further compare the efficiency of different approaches to find a sufficient number of Pareto solutions, we counted the number of sample points (including initial sample points) required for different approaches to find 10, 15, and 20 Pareto solutions. The results are shown in Table 3. The proposed approach always has the lowest average number of samples. Obtaining the enough number of Pareto solutions is important for multi-objective optimization, because the designer can have more options to meet the design requirements from a larger number of Pareto solutions.
The average number of sample points required for different approaches to find 10, 15, and 20 Pareto solutions for the ZDT1 problem
The number of obtained pareto solutions | EIMe | EIMm | EIMh | BMOO | Proposed approach |
---|---|---|---|---|---|
10 | 46.13 | 44.07 | 49.97 | 48.44 | 40.60 |
15 | 72.07 | 76.30 | 81.10 | 78.81 | 46.80 |
20 | 89.77 | 91.17 | 109.87 | 104.93 | 52.73 |
The number of obtained pareto solutions | EIMe | EIMm | EIMh | BMOO | Proposed approach |
---|---|---|---|---|---|
10 | 46.13 | 44.07 | 49.97 | 48.44 | 40.60 |
15 | 72.07 | 76.30 | 81.10 | 78.81 | 46.80 |
20 | 89.77 | 91.17 | 109.87 | 104.93 | 52.73 |
The comparison results for the quality of Pareto solutions and computational efforts for all examples are summarized in Table 4. It is seen that the EIMe, EIMm, EIMh, and BMOO approaches, and the proposed one can obtain satisfactory optimal solutions for the ZDT2 problem with n = 3, as well as ZDT3 and FON problems. In terms of computational efficiency, the proposed approach has the lowest computational cost. In addition, for all of the three metrics, the proposed approach has the minimum standard deviations, which indicates that the proposed approach has the best stability for these problems. For the three high-dimensional examples (ZDT2 problems with n = 6, 8, and 10), searching for Pareto solutions is more difficult. The “/” symbol in Table 4 indicates that it is difficult for the optimization method to reach the termination conditions when the maximum number of iterations 200 is reached. More iterations are required to find at least 20 Pareto solutions with converged quality metrics. For ZDT2 problems with n = 6 and 8, the proposed approach shows the best efficiency and stability. When the maximum number of iterations is reached, it is difficult for the EIMh and BMOO approaches to meet the termination condition for the ZDT2 problems with n = 6 and 8. The EIMm approach cannot meet the termination condition for the ZDT2 problem with n = 8. For the ZDT2 problem with n = 10, all five approaches failed to meet the termination conditions. The reason is that the number of samples required to construct a Kriging model with good approximation grows exponentially as the dimension of the problem increases [24]. As a result, the prediction accuracy of the Kriging model will decrease as the dimension increases when the number of samples does not grow quick enough.
Quality and efficiency metrics of different approaches for ZDT2, ZDT3, and FON problems
EIMe | EIMm | EIMh | BMOO | Proposed approach | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
30 runs | Mean | STD | 30 runs | Mean | STD | 30 runs | Mean | STD | 30 runs | Mean | STD | 30 runs | Mean | STD | ||
ZDT2 (n = 3) | MHD | [0.69, 0.70] | 0.70 | 0.00 | [0.69, 0.70] | 0.70 | 0.00 | [0.69, 0.70] | 0.70 | 0.00 | [0.66, 0.73] | 0.71 | 0.02 | [0.69, 0.69] | 0.69 | 0.00 |
MOS | [0.70, 1.45] | 1.06 | 0.10 | [1.00, 1.33] | 1.05 | 0.06 | [0.99, 1.53] | 1.06 | 0.12 | [0.88, 1.93] | 1.02 | 0.18 | [1.00, 1.00] | 1.00 | 0.00 | |
FC | [51, 229] | 69.6 | 35.41 | [51, 69] | 56.03 | 4.53 | [63, 230] | 198.43 | 38.5 | [65, 115] | 79.33 | 10.84 | [50, 52] | 50.03 | 0.48 | |
ZDT3 | MHD | [0.43, 0.84] | 0.75 | 0.07 | [0.71, 0.84] | 0.78 | 0.03 | [0.69, 0.86] | 0.77 | 0.03 | [0.52, 0.93] | 0.77 | 0.06 | [0.69, 0.91] | 0.76 | 0.03 |
MOS | [1.11, 3.09] | 1.81 | 0.41 | [1.46, 4.12] | 1.82 | 0.54 | [1.51, 3.88] | 2.10 | 0.52 | [1.14, 4.15] | 2.08 | 0.55 | [1.39, 1.74] | 1.54 | 0.07 | |
FC | [106, 229] | 183.13 | 35.31 | [115, 229] | 181.87 | 29.12 | [127, 229] | 195.1 | 31.71 | [91, 215] | 168.43 | 28.69 | [84, 179] | 150.5 | 28.13 | |
FON | MHD | [0.46, 0.70] | 0.66 | 0.05 | [0.51, 0.70] | 0.67 | 0.04 | [0.53, 0.70] | 0.67 | 0.04 | [0.57, 0.70] | 0.67 | 0.03 | [0.66, 0.70] | 0.69 | 0.01 |
MOS | [0.62, 0.96] | 0.92 | 0.07 | [0.76, 0.96] | 0.93 | 0.04 | [0.76, 0.96] | 0.93 | 0.05 | [0.76, 0.96] | 0.94 | 0.04 | [0.93, 0.96] | 0.96 | 0.01 | |
FC | [73, 179] | 115.6 | 34.99 | [68, 230] | 126.13 | 53.52 | [80, 230] | 158.1 | 54.56 | [80, 230] | 158.1 | 54.56 | [63, 103] | 79 | 9.56 | |
ZDT2 (n = 6) | MHD | [0.69, 0.71] | 0.70 | 0.01 | [0.65, 0.72] | 0.70 | 0.01 | / | / | / | / | / | / | [0.68, 0.69] | 0.69 | 0.00 |
MOS | [1.00, 1.87] | 1.18 | 0.20 | [0.97, 1.98] | 1.14 | 0.21 | / | / | / | / | / | / | [0.99, 1.05] | 1.01 | 0.02 | |
FC | [53, 93] | 66.12 | 9.50 | [50, 73] | 57.97 | 4.79 | / | / | / | / | / | / | [50, 56] | 51.66 | 1.49 | |
ZDT2 (n = 8) | MHD | [0.67, 0.72] | 0.71 | 0.01 | / | / | / | / | / | / | / | / | / | [0.68, 0.70] | 0.69 | 0.01 |
MOS | [1.00, 2.09] | 1.29 | 0.33 | / | / | / | / | / | / | / | / | / | [0.99, 1.05] | 1.01 | 0.02 | |
FC | [53, 118] | 77.19 | 18.59 | / | / | / | / | / | / | / | / | / | [50, 60] | 54.8 | 2.23 | |
ZDT2 (n = 10) | / | / | / | / | / | / | / | / | / | / | / | / | / | / | / | |
/ | / | / | / | / | / | / | / | / | / | / | / | / | / | / | ||
/ | / | / | / | / | / | / | / | / | / | / | / | / | / | / |
EIMe | EIMm | EIMh | BMOO | Proposed approach | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
30 runs | Mean | STD | 30 runs | Mean | STD | 30 runs | Mean | STD | 30 runs | Mean | STD | 30 runs | Mean | STD | ||
ZDT2 (n = 3) | MHD | [0.69, 0.70] | 0.70 | 0.00 | [0.69, 0.70] | 0.70 | 0.00 | [0.69, 0.70] | 0.70 | 0.00 | [0.66, 0.73] | 0.71 | 0.02 | [0.69, 0.69] | 0.69 | 0.00 |
MOS | [0.70, 1.45] | 1.06 | 0.10 | [1.00, 1.33] | 1.05 | 0.06 | [0.99, 1.53] | 1.06 | 0.12 | [0.88, 1.93] | 1.02 | 0.18 | [1.00, 1.00] | 1.00 | 0.00 | |
FC | [51, 229] | 69.6 | 35.41 | [51, 69] | 56.03 | 4.53 | [63, 230] | 198.43 | 38.5 | [65, 115] | 79.33 | 10.84 | [50, 52] | 50.03 | 0.48 | |
ZDT3 | MHD | [0.43, 0.84] | 0.75 | 0.07 | [0.71, 0.84] | 0.78 | 0.03 | [0.69, 0.86] | 0.77 | 0.03 | [0.52, 0.93] | 0.77 | 0.06 | [0.69, 0.91] | 0.76 | 0.03 |
MOS | [1.11, 3.09] | 1.81 | 0.41 | [1.46, 4.12] | 1.82 | 0.54 | [1.51, 3.88] | 2.10 | 0.52 | [1.14, 4.15] | 2.08 | 0.55 | [1.39, 1.74] | 1.54 | 0.07 | |
FC | [106, 229] | 183.13 | 35.31 | [115, 229] | 181.87 | 29.12 | [127, 229] | 195.1 | 31.71 | [91, 215] | 168.43 | 28.69 | [84, 179] | 150.5 | 28.13 | |
FON | MHD | [0.46, 0.70] | 0.66 | 0.05 | [0.51, 0.70] | 0.67 | 0.04 | [0.53, 0.70] | 0.67 | 0.04 | [0.57, 0.70] | 0.67 | 0.03 | [0.66, 0.70] | 0.69 | 0.01 |
MOS | [0.62, 0.96] | 0.92 | 0.07 | [0.76, 0.96] | 0.93 | 0.04 | [0.76, 0.96] | 0.93 | 0.05 | [0.76, 0.96] | 0.94 | 0.04 | [0.93, 0.96] | 0.96 | 0.01 | |
FC | [73, 179] | 115.6 | 34.99 | [68, 230] | 126.13 | 53.52 | [80, 230] | 158.1 | 54.56 | [80, 230] | 158.1 | 54.56 | [63, 103] | 79 | 9.56 | |
ZDT2 (n = 6) | MHD | [0.69, 0.71] | 0.70 | 0.01 | [0.65, 0.72] | 0.70 | 0.01 | / | / | / | / | / | / | [0.68, 0.69] | 0.69 | 0.00 |
MOS | [1.00, 1.87] | 1.18 | 0.20 | [0.97, 1.98] | 1.14 | 0.21 | / | / | / | / | / | / | [0.99, 1.05] | 1.01 | 0.02 | |
FC | [53, 93] | 66.12 | 9.50 | [50, 73] | 57.97 | 4.79 | / | / | / | / | / | / | [50, 56] | 51.66 | 1.49 | |
ZDT2 (n = 8) | MHD | [0.67, 0.72] | 0.71 | 0.01 | / | / | / | / | / | / | / | / | / | [0.68, 0.70] | 0.69 | 0.01 |
MOS | [1.00, 2.09] | 1.29 | 0.33 | / | / | / | / | / | / | / | / | / | [0.99, 1.05] | 1.01 | 0.02 | |
FC | [53, 118] | 77.19 | 18.59 | / | / | / | / | / | / | / | / | / | [50, 60] | 54.8 | 2.23 | |
ZDT2 (n = 10) | / | / | / | / | / | / | / | / | / | / | / | / | / | / | / | |
/ | / | / | / | / | / | / | / | / | / | / | / | / | / | / | ||
/ | / | / | / | / | / | / | / | / | / | / | / | / | / | / |
4.2 An Engineering Example.
In this section, the proposed approach is applied to the design optimization of a torque arm. As shown in Fig. 10, the torque arm is fixed to the hole at the left end. Two forces, P1 = 8.0 kN and P2 = 4.0 kN, are placed at the center of the right end. Young’s modulus and Poisson’s ratio of the material are 200 GPa and 0.3, respectively.
Each of the five approaches is run 10 times for this problem. The Pareto frontiers obtained by these different approaches are plotted in Fig. 11. As shown in Fig. 11, the Pareto frontiers obtained by the different approaches are in good consistency. The diversity of the proposed approach is better than the other four approaches. The Pareto solutions obtained by the EIMe and EIMm approaches cluster in some regions and are not as diverse or distributed as the ones obtained by the EIMh, BMOO, and the proposed approach. Obviously, a more uniformly distributed solution set is more beneficial for the designer. The design variable values of five Pareto solutions, labeled with numbers in Fig. 11, and the corresponding responses are listed in Table 5.
The five Pareto solutions and the corresponding responses of the engineering case
No. | Design variables | Volume | Maximum displacement |
---|---|---|---|
1 | (3.75, 27.25, 90.03, 23.32, 12.05, 8.03) | 444.25 | 1.59 |
2 | (4.18, 25.00, 90.00, 29.29, 12.00, 8.00) | 470.97 | 1.11 |
3 | (4.50, 31.78, 90.00, 30.00, 12.00, 8.00) | 523.64 | 0.83 |
4 | (4.50, 35.00, 90.00, 29.99, 22.00, 13.00) | 650.46 | 0.64 |
5 | (4.50, 35.00, 120.00, 30.00, 22.00, 13.00) | 799.36 | 0.634 |
No. | Design variables | Volume | Maximum displacement |
---|---|---|---|
1 | (3.75, 27.25, 90.03, 23.32, 12.05, 8.03) | 444.25 | 1.59 |
2 | (4.18, 25.00, 90.00, 29.29, 12.00, 8.00) | 470.97 | 1.11 |
3 | (4.50, 31.78, 90.00, 30.00, 12.00, 8.00) | 523.64 | 0.83 |
4 | (4.50, 35.00, 90.00, 29.99, 22.00, 13.00) | 650.46 | 0.64 |
5 | (4.50, 35.00, 120.00, 30.00, 22.00, 13.00) | 799.36 | 0.634 |
The detailed comparisons of the performances are summarized in Table 6. It can be seen in Table 6 that the proposed approach has better efficiency and diversity than the other approaches. For the two quality metrics and one efficiency metric, the proposed approach has the smallest standard deviations, which shows the best stability.
Comparison results of different approaches for the design of the torque arm
EIMe | EIMm | EIMh | BMOO | Proposed approach | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | |
MHD | [48.17, 63.85] | 55.22 | 4.76 | [50.13, 64.25] | 57.81 | 4.5 | [34.96, 46.95] | 47.82 | 3.67 | [43.47, 55.77] | 47.82 | 3.67 | [44.39, 52.26] | 47.64 | 2.64 |
MOS | [96.98, 294.83] | 170.84 | 58.7 | [99.83, 188.77] | 155.81 | 25.52 | [111.10, 235.32] | 185.88 | 39.08 | [110.92, 310.45] | 270.44 | 19.21 | [314.30, 356.84] | 330.74 | 15.48 |
FC | [94, 167] | 122.2 | 20.34 | [97, 139] | 114.5 | 15.58 | [56, 71] | 64.1 | 4.72 | [62, 89] | 68.13 | 6.54 | [60, 69] | 63.3 | 2.41 |
EIMe | EIMm | EIMh | BMOO | Proposed approach | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | Range | Mean | STD | |
MHD | [48.17, 63.85] | 55.22 | 4.76 | [50.13, 64.25] | 57.81 | 4.5 | [34.96, 46.95] | 47.82 | 3.67 | [43.47, 55.77] | 47.82 | 3.67 | [44.39, 52.26] | 47.64 | 2.64 |
MOS | [96.98, 294.83] | 170.84 | 58.7 | [99.83, 188.77] | 155.81 | 25.52 | [111.10, 235.32] | 185.88 | 39.08 | [110.92, 310.45] | 270.44 | 19.21 | [314.30, 356.84] | 330.74 | 15.48 |
FC | [94, 167] | 122.2 | 20.34 | [97, 139] | 114.5 | 15.58 | [56, 71] | 64.1 | 4.72 | [62, 89] | 68.13 | 6.54 | [60, 69] | 63.3 | 2.41 |
5 Conclusion
In this paper, a new acquisition function is proposed for multi-objective Bayesian optimization, where the improvement of the hyperarea and overall spread are used to guide the sequential sampling to search the Pareto solutions. Two modified metrics of hyperarea and overall spread of the Pareto frontier are proposed and used in the new acquisition function, which encourages searching for a large number of Pareto solutions and provides a balance between convergence and diversity of the Pareto frontier. The LCB functions are used to incorporate uncertainty in surrogate modeling and find the exploration–exploitation balance, while keeping the computational efficiency.
The proposed approach is demonstrated with four numerical examples and an engineering case. Its performance is compared with other four expected improvement matrix approaches. The results show that the proposed approach is computationally more efficient with fewer numbers of function evaluations to obtain the same number of Pareto solutions. The new approach is also stable and robust with the minimum level of variations.
While computational efficiency is the consideration of using the LCB functions, the main computational cost still remains at constructing the Kriging models. Scalability has been a major issue for Kriging as the dimension of searching space increases. The computational efficiency of the current scheme can be further improved. As the number of sample points increases, the computational time for sorting the non-dominated solutions and calculating MHD and MOS becomes longer. More efficient data structures and algorithms can be developed. In addition, a constant exploitation–exploration coefficient is used in our LCB functions. A dynamically adaptive coefficient value can have a better balance between the exploitation and exploration. In future work, we will also examine the efficiency and accuracy of the proposed approach for optimization with more than two objectives.
Acknowledgment
This research has been supported in part by the National Natural Science Foundation of China (Grant Nos. 51775203, 51805179, and 51721092) and the Fundamental Research Funds for the Central Universities at HUST (Grant No. 2016YXMS272). The support of China Scholarship Council is also appreciated.