Novel machine learning driven design strategy for high strength Zn Alloys optimization with multiple constraints

Alloy performance dataset and data distribution

In this work, three datasets were constructed from laboratory test data, relevant literature^{32,33,34,35,36,37,38,39,40,41,42}, and the MatWeb database. The datasets share the same types of compositional features but differ in the performance metric they record: UTS, EL, or hardness. Missing values in the datasets were filled with zeros. Detailed data can be found in Tables S1–S3 of the Supplementary Materials. Three predictive models were trained to predict the mechanical properties from alloy composition under constant forming conditions. Consequently, the compositional features in the datasets were used as inputs, while the UTS, EL, and hardness properties were used as outputs.

Figure 2a–c present the distribution of samples by the number of elements in the alloys for the three mechanical properties. The samples in the UTS, EL, and hardness datasets primarily consisted of ternary to six-component alloys. This compositional diversity gives the ML models a broader information space, which supports their generalization ability and robustness and provides an effective basis for subsequent application work.

**Fig. 2: Data distribution in the UTS, EL, and Hardness datasets.**

The distribution of element types and contents in the datasets is shown in Fig. 2d–f. The UTS, EL, and hardness datasets involved 10 alloying elements. In all three datasets, the element with the widest content range was Al, ranging from 0 to 43 wt.%, followed by Cu, with a range of 0–13 wt.%, and Mg, with a range of 0–3 wt.%. The content of the other elements generally ranged between 0 and 1.2 wt.%. This aligns with the extensive research on doping Zn alloys with Al, Cu, and Mg^{43,44,45}. Furthermore, to enhance the reliability and accuracy of ML model applications, the composition search space was defined based on the element content ranges in the dataset, thereby reducing the risk of model performance degradation due to extrapolation.
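One way to enforce this restriction in practice is a bounds check before a candidate composition is passed to a model. The sketch below is illustrative only: it uses the Al, Cu, and Mg ranges quoted above, while the function name and dictionary layout are assumptions rather than part of the original work.

```python
# Element content ranges (wt.%) quoted in the text for the datasets.
DATASET_RANGES = {"Al": (0.0, 43.0), "Cu": (0.0, 13.0), "Mg": (0.0, 3.0)}


def within_dataset_ranges(composition, ranges=DATASET_RANGES):
    """Return True if every element listed in `ranges` lies inside its range.

    Elements absent from `composition` are treated as 0 wt.%.
    Elements not listed in `ranges` are not checked.
    """
    for element, (low, high) in ranges.items():
        value = composition.get(element, 0.0)
        if not (low <= value <= high):
            return False
    return True


# A composition inside the sampled ranges, and one that would require
# extrapolation in Al content.
inside = within_dataset_ranges({"Al": 4.0, "Cu": 6.0, "Mg": 0.05})
outside = within_dataset_ranges({"Al": 50.0, "Cu": 6.0})
```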

The distribution of the three mechanical properties in the dataset is plotted in Fig. 2g–i. The UTS, EL, and hardness values ranged within 25–457 MPa, 0–20.1%, and 53–200 HV, respectively. The proportion of samples with a UTS between 0 and 300 MPa was 72.8%, that with an EL between 0 and 5% was 76.5%, and that with a hardness between 40 and 115 HV was 62.7%. The majority of Zn alloy samples therefore exhibited relatively low performance and failed to demonstrate well-balanced mechanical properties, indicating significant room for improvement and optimization. Introducing ML models to guide Zn alloy design based on data-driven insights is essential for accelerating the development of high-performance Zn alloys.

Feature selection

Prior to feature selection, this work first divided the dataset into a training set and a test set in an 8:2 ratio using random sampling. All feature analysis and selection steps were performed only on the training set, with the test set not involved in this process. This was intended to preserve the independence of the test set during the evaluation and analysis phase, thereby effectively reducing the risk of data leakage.
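The splitting step can be sketched as follows. This is a minimal illustration on placeholder data, assuming a plain random shuffle; the helper name `split_indices`, the seed, and the synthetic arrays are hypothetical and not taken from the original work.

```python
import numpy as np


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle range(n_samples) and return (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]


# Placeholder data: 50 samples with 4 compositional features.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(50, 4))
y = X @ np.array([2.0, 1.0, 0.0, 0.5])

train_idx, test_idx = split_indices(len(X), test_fraction=0.2, seed=42)
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

# Any feature analysis or selection would use X_train and y_train only;
# X_test and y_test stay untouched until the final evaluation.
```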

Here, the correlation coefficients between features and target properties were calculated and visualized as proportions (Figure S1). The Pearson correlation coefficient analysis indicated that Cu and Mg have a strong linear correlation with the UTS and hardness of Zn alloys, while the RE elements La and Ce exhibited a strong linear correlation with EL (Figure S1a). Similarly, the Spearman correlation coefficient indicated that Cu is strongly correlated with both UTS and hardness (Figure S1b). Nevertheless, most compositional features exhibited different correlation strengths with the target properties under the two coefficients, and some, e.g., Mg, even showed opposite correlation trends. This implies that relying on a single correlation coefficient may not capture the true relationships between variables, so both coefficients were used for feature selection. To reduce the dimensionality of the input features, enhance model training efficiency, and mitigate the risk of overfitting, the Ca and B features, which had a correlation with the target of less than 0.25, were eliminated.
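The difference between the two coefficients can be illustrated on synthetic data. The sketch below uses `scipy.stats.pearsonr` and `scipy.stats.spearmanr`; the rule of flagging a feature only when both absolute coefficients fall below 0.25 is an assumption made for illustration, and the feature names and data are placeholders.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr


def weakly_correlated(X, y, names, threshold=0.25):
    """Names of features whose |Pearson| and |Spearman| are both below threshold."""
    weak = []
    for j, name in enumerate(names):
        r, _ = pearsonr(X[:, j], y)
        rho, _ = spearmanr(X[:, j], y)
        if abs(r) < threshold and abs(rho) < threshold:
            weak.append(name)
    return weak


# feat_a: monotonic but nonlinear relation to y (Spearman = 1, Pearson < 1).
# feat_b: an alternating pattern that is almost unrelated to y.
feat_a = np.arange(8, dtype=float)
y = feat_a ** 3
feat_b = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])
X = np.column_stack([feat_a, feat_b])

r_a, _ = pearsonr(feat_a, y)
rho_a, _ = spearmanr(feat_a, y)
weak = weakly_correlated(X, y, ["feat_a", "feat_b"])
```

Because Spearman works on ranks, it reports a perfect monotonic relation for `feat_a` even though the linear (Pearson) correlation is lower, which is the kind of disagreement discussed above.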

To conduct a more comprehensive feature selection, the input features were ranked using two common random forest (RF)-based feature importance metrics (Fig. 3). The feature ranking based on the mean decrease in impurity (MDI) indicated that Al, Cu, and Mg have the most significant impact on UTS and Hardness, while La and Mn have a considerable effect on EL (Fig. 3a–c). Similarly, the ranking based on permutation importance (PI) revealed that Cu has the strongest effect on UTS, while La and Mg are the features that most significantly impact EL and Hardness, respectively (Fig. 3d–f). To reduce model complexity, the Ca, Ti, and B features, which consistently ranked lowest across the six evaluations via the two RF-based metrics, were flagged as candidates for removal. In conjunction with the correlation analysis results, it was decided to exclude the Ca and B features from the final model.
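Both rankings can be obtained from a fitted random forest in scikit-learn. The following is a minimal sketch on synthetic data, not the authors' pipeline: `feature_importances_` gives the MDI values and `sklearn.inspection.permutation_importance` gives the PI values; the data, seeds, and hyperparameters are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Synthetic training data: feature 0 dominates, feature 1 is weak,
# feature 2 is irrelevant.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Mean decrease in impurity: computed from the tree splits, sums to 1.
mdi = model.feature_importances_

# Permutation importance: drop in score when one feature column is shuffled.
pi = permutation_importance(
    model, X, y, n_repeats=10, random_state=0
).importances_mean

# Feature indices ordered from most to least important under each metric.
mdi_rank = np.argsort(mdi)[::-1]
pi_rank = np.argsort(pi)[::-1]
```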

**Fig. 3: RF-based feature importance ranking.**

Model selection and hyperparameter optimization

In this work, a performance evaluation was conducted on four standard statistical models: three linear regression models (unregularized linear regression, regularized Lasso, and Ridge) and a response surface model. Multiple random splits of the dataset were used for training, and the prediction performance of the models was evaluated based on RMSE and the coefficient of determination (R²) for both the training and testing sets. The evaluation results are shown in Tables S4, S5, and S6. For predicting UTS and EL, the R² values of the linear regression models are all below 0.5, with some negative R² values in the test set. This indicates that the linear regression model is not suitable for predicting the mechanical properties of zinc alloys. Although the linear regression model achieved an R² of around 0.75 for predicting hardness, it still does not meet the requirements for practical application. As for the response surface model, it showed large negative values and high standard deviations across all test set R² values, indicating poor performance in predicting the alloy properties.
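For reference, the two metrics in their standard form (assumed here to match the definitions used in the evaluation) are

$$R^{2}=1-\frac{\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}},\qquad \mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}}$$

where $y_{i}$ is the measured value, $\hat{y}_{i}$ is the predicted value, and $\bar{y}$ is the mean of the measured values in the evaluated set. $R^{2}$ becomes negative whenever the residual sum of squares exceeds the total sum of squares, i.e., when a model predicts worse than a constant equal to $\bar{y}$, which is how negative test-set values can arise.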

To improve predictive performance over these baseline models, three ensemble algorithms were employed: Extreme Gradient Boosting (XGBoost)⁴⁶, RF⁴⁷, and Adaptive Boosting (AdaBoost)⁴⁸. Each algorithm was used to build prediction models for the three mechanical properties. The models were evaluated by comparing the average R² over 100 random dataset splits. Moreover, the performance variations of the models under different training/testing set split ratios were also investigated.

Figure 4a–c present the prediction accuracy of the three algorithms for each mechanical property under different split ratios. The models constructed by the same algorithm exhibited similar R² trends across different target performance metrics. The R² values on the test sets of the three models improved with increasing training set proportion, reaching relatively high values when the training set ratio was 0.8. As regards the training set R², the models built using the RF and XGBoost algorithms exhibited a consistent upward trend, similar to the test set, while the model built with the AdaBoost algorithm presented a notable decline. This can be attributed to the strong dependence of the model on the dataset, where increasing the training data results in greater model complexity and a higher likelihood of overfitting. Based on the performance evaluation of models for different target metrics, the model with the highest average R² at a training set ratio of 0.8 was selected for each target. In particular, XGBoost was chosen for predicting UTS and Hardness, while RF was selected for predicting EL.
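A comparison of this kind can be sketched with scikit-learn's `ShuffleSplit` and `cross_val_score`. The example below rests on several assumptions: the data are synthetic, only 5 random splits are used to keep it fast (the text reports 100), and `GradientBoostingRegressor` is swapped in as a stand-in for XGBoost so that the block depends on scikit-learn alone.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Synthetic placeholder data with a nonlinear interaction term.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(150, 4))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] * X[:, 2] + rng.normal(0.0, 0.1, size=150)

models = {
    "RF": RandomForestRegressor(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostRegressor(n_estimators=50, random_state=0),
    # Stand-in for XGBoost (assumption, see lead-in).
    "GradientBoosting": GradientBoostingRegressor(random_state=0),
}


def mean_test_r2(model, X, y, train_fraction, n_splits=5, seed=0):
    """Average test-set R² over repeated random train/test splits."""
    cv = ShuffleSplit(n_splits=n_splits, train_size=train_fraction,
                      random_state=seed)
    return cross_val_score(model, X, y, cv=cv, scoring="r2").mean()


# Mean test R² for every (model, training fraction) pair.
results = {
    (name, frac): mean_test_r2(model, X, y, frac)
    for name, model in models.items()
    for frac in (0.6, 0.7, 0.8)
}

# Model with the highest mean test R² at a training fraction of 0.8.
best_at_08 = max(models, key=lambda name: results[(name, 0.8)])
```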

**Fig. 4: Performance evaluation and optimization of predictive models.**

To tune the hyperparameters of the prediction models for different target performances, Bayesian optimization⁴⁹ was applied. To avoid overfitting or underfitting while improving model accuracy, the hyperparameter search space was constrained. The specific hyperparameters and their respective ranges for XGBoost and RF are listed in Table 1.

Table 1 Hyperparameters and their corresponding ranges for the optimization of the predictive models

Figure 4d–f present the optimal hyperparameter combinations and model accuracy after optimization. In these plots, red points represent the training set samples, while green points represent the test set samples. In addition, the dashed lines represent the ideal fit line with a slope of 1; the closer the predicted points to this line, the higher the accuracy of the model. The UTS prediction model achieved an R² of 0.93 on the training set and 0.92 on the test set, with a root mean square error (RMSE) of 22.29 MPa (Fig. 4d). The EL prediction model reached an R² of 0.97 on the training set and 0.95 on the test set, with an RMSE of 0.95% (Fig. 4e). The Hardness prediction model achieved an R² of 0.96 on the training set and 0.93 on the test set, with an RMSE of 6.91 HV (Fig. 4f). All three models demonstrated high predictive accuracy on both the training and test sets, with R² differences of no more than 0.03, indicating strong generalization ability. No significant overfitting or underfitting was observed, providing a stable and reliable basis for the further application of these prediction models.

Model interpretability analysis

The interpretability analysis of ML-based alloy performance prediction models not only helps understand the impact of different alloy compositions on performance but also reveals the potential interactions and relationships between compositional features. This analysis is fundamentally based on the training and prediction sample points, where both the quality and quantity of these points significantly affect its effectiveness.

Inspired by the strategies employed by Chen et al. ⁵⁰ and Dong et al. ⁵¹ to improve the accuracy of predictive models via virtual sample generation, this study utilized PSO to generate virtual samples before conducting interpretability analysis of the models. This approach expanded the original dataset, enriching the original feature space and providing a more robust data basis for the interpretability analysis. Figure S2a–c illustrate the distribution of the original dataset across features for UTS, EL, and Hardness, respectively. It is evident that the distribution of the original dataset in the feature space was uneven, limiting the extraction of relational information among model features. In contrast, Figure S2d–f depict the distribution of the augmented datasets across features, exhibiting a more uniform distribution with richer sample information. This enhanced distribution helps better elucidate the relationships among model features.

Furthermore, the SHAP values for each model were calculated based on the augmented dataset, facilitating a more intuitive analysis of how input features affect target performance. Figure 5a visualizes the relative importance of all compositional features for the predictive models of UTS, EL, and Hardness, along with the overall importance ranking for the combined mechanical properties. Features located further outward in the ring indicate greater importance for the overall mechanical properties of the Zn alloy, while the relative importance of a compositional feature reflects its effect on the corresponding target performance in the predictive model. By observing the relative importance of the features in each model, it can be deduced that Cu and Al were the dominant features affecting UTS performance. As regards EL, Ce and La had a significant impact, with Mg and Cu exhibiting nearly equal importance. In the Hardness prediction model, Mg and Cu were the most dominant features. From the overall importance ranking, it becomes evident that Cu, Mg, and Al were the three most critical elements affecting the mechanical properties of Zn alloys.

Figure 5b–d visualize the SHAP values for all samples in the augmented datasets of the three mechanical properties. The horizontal axis represents the SHAP value, with its magnitude indicating the extent to which a feature affects target performance. The sign of the SHAP value indicates the direction of the feature effect, with positive values indicating a positive contribution to the performance prediction. The color of each sample point reflects the magnitude of a feature for that specific sample. In the UTS model, higher values of Al, Cu, and Si are associated with higher predicted target values, suggesting that these three elements contribute positively to enhancing the strength of Zn alloys. In the EL model, higher values of Ce, La, and Mn are associated with higher predicted elongation, suggesting that the addition of RE elements and Mn contributes positively to the toughness of Zn alloys. In the Hardness model, Mg, Cu, and Si are positively correlated with the predicted hardness, highlighting their significant role in improving the hardness of Zn alloys. Although Al, Ti, Ce, and La also contribute to hardness improvement, their effects are less pronounced.

Furthermore, combining the SHAP value analyses from both the UTS and EL models reveals that a high Mg content negatively affects both target properties. However, a positive correlation with both target properties is observed as the Mg content decreases, suggesting that trace amounts of Mg are beneficial for developing high-strength, tough Zn alloys.
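SHAP values are built on Shapley values from cooperative game theory. The sketch below computes exact Shapley values for a small toy function by enumerating all feature subsets, with "missing" features replaced by a baseline value. It illustrates the underlying idea only: the toy model, the zero baseline, and the brute-force enumeration are assumptions, and this is not the estimator used by the SHAP library or necessarily the procedure used in this work.

```python
from itertools import combinations
from math import factorial


def exact_shapley(f, x, baseline):
    """Exact Shapley values of f at x, relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their actual value, others the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical model: one additive term and one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


phi = exact_shapley(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
# phi is [2.0, 0.5, 0.5]: the interaction term is shared equally between
# features 1 and 2, and the values sum to f(x) - f(baseline) = 3.0.
```

A positive value means the feature pushes the prediction above the baseline, which matches the sign convention described for Fig. 5b–d.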

**Fig. 5: SHAP analysis of the prediction model.**

Composition design system development and application

In alloy design, it is essential not only to enhance a single target property but also to comprehensively optimize multiple properties. In certain specific application scenarios, performance must meet requirements within a specified range. For instance, when Zn alloys are used as biodegradable materials in the medical field, the implant material must possess human bone-like mechanical properties along with a moderate degradation rate in the body. To enhance the application of ML in alloy design, this work introduces ZACDS, which is based on the previously developed predictive models for three Zn alloy properties and the BOA, addressing compositional design tasks under various target property constraints. The flowchart of ZACDS is illustrated in Figure S3. The core concept involves using the BOA to iteratively search within a defined space, minimizing the error between predicted and target performances to achieve optimal composition design. The specific method is described below.

First, it is crucial to establish the constraints. After defining the target performance metrics for UTS, EL, and Hardness, the ranges of all compositional features should be determined based on additional criteria, e.g., cost, thereby constructing the search space for the composition design scheme. Notably, to reduce the potential uncertainty and prediction errors in the model during extrapolation, the feature ranges should be aligned with the fluctuation intervals of the content of each element in the dataset, enhancing the reliability of the output results. The objective function for the BOA is defined as follows:

$${Score}=W_{UTS}\left|1-R_{UTS}\right|+W_{EL}\left|1-R_{EL}\right|+W_{HB}\left|1-R_{HB}\right| \tag{1}$$

where $Score$ is the objective value in BOA, $W_{UTS}$, $W_{EL}$, and $W_{HB}$ are the importance coefficients for UTS, EL, and Hardness, respectively, with values of either 1 or 0, and $R_{UTS}$, $R_{EL}$, and $R_{HB}$ are the ratios of predicted to target values for UTS, EL, and Hardness, respectively.
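Equation (1) translates directly into a few lines of code. In the sketch below, the function name and the dictionary keys are illustrative choices; the numbers are the target and predicted values reported for the ZACM alloy in the application example of this section.

```python
def design_score(predicted, target, weights):
    """Eq. (1): weighted sum of |1 - predicted/target| over the properties."""
    return sum(
        w * abs(1.0 - predicted[name] / target[name])
        for name, w in weights.items()
    )


# Target and predicted properties (UTS in MPa, EL in %, hardness in HV).
target = {"UTS": 300.0, "EL": 3.5, "HB": 130.0}
predicted = {"UTS": 298.52, "EL": 3.44, "HB": 128.64}

# All three properties constrained (weights of 1).
score_all = design_score(predicted, target, {"UTS": 1, "EL": 1, "HB": 1})

# Single-target design: only UTS contributes.
score_uts = design_score(predicted, target, {"UTS": 1, "EL": 0, "HB": 0})
```

With these inputs the three-property score is about 0.033 and the UTS-only score is about 0.005. Because each term is a ratio, properties measured in MPa, %, and HV contribute on the same dimensionless scale.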

Due to the significant differences in the magnitudes of the three performance metrics, simply summing their prediction errors makes it difficult to obtain valid results via error minimization. To address this issue, the ratio of predicted to target values was utilized. Furthermore, an importance coefficient was assigned to each performance metric, reflecting its significance. For instance, when a single target performance is specified for UTS, its coefficient is set to 1, while the coefficients for the other two performances are set to 0. This allows the composition design to focus on a single target. In scenarios requiring multi-objective design tasks with two or three specified performances, the corresponding coefficients are set to 1, while those for the other performances are set to 0. This approach effectively addresses all potential specific performance design tasks. Finally, iterative Bayesian optimization is used to minimize the objective value score, achieving minimal error in target performance. Once the maximum number of iterations is reached, the minimum error value is compared with the allowable deviation; if the condition is met, the design solution is provided; otherwise, “no solution” is output. In cases where a design solution is not obtained during runtime, the number of iterations is increased or the allowable deviation is relaxed. In addition, adjusting the search space can also improve solution efficiency.
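The search procedure can be sketched as a small Bayesian optimization loop followed by an acceptance check. Everything below is an illustrative assumption rather than the ZACDS implementation: the surrogate is a scikit-learn Gaussian process, the acquisition function is expected improvement evaluated on random candidates, the acceptance test compares the best score against a single threshold, and `toy_objective` is a hypothetical stand-in for the score computed from the trained property models.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def bayes_minimize(objective, bounds, n_init=6, n_iter=20,
                   n_candidates=400, seed=0):
    """Minimize `objective` inside box `bounds` with a GP surrogate and EI."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = len(low)

    def to_real(u):
        # Map a point from the unit cube to the real search space.
        return low + u * (high - low)

    U = rng.uniform(size=(n_init, dim))
    scores = np.array([objective(to_real(u)) for u in U])

    for _ in range(n_iter):
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                      normalize_y=True, random_state=seed)
        gp.fit(U, scores)
        candidates = rng.uniform(size=(n_candidates, dim))
        mu, sigma = gp.predict(candidates, return_std=True)
        sigma = np.maximum(sigma, 1e-12)
        # Expected improvement over the best score observed so far.
        gain = scores.min() - mu
        ei = gain * norm.cdf(gain / sigma) + sigma * norm.pdf(gain / sigma)
        u_next = candidates[np.argmax(ei)]
        U = np.vstack([U, u_next])
        scores = np.append(scores, objective(to_real(u_next)))

    best = int(np.argmin(scores))
    return to_real(U[best]), float(scores[best])


def design(objective, bounds, allowed_score, **kwargs):
    """Return (composition, score), or None to signal 'no solution'."""
    x_best, score_best = bayes_minimize(objective, bounds, **kwargs)
    return (x_best, score_best) if score_best <= allowed_score else None


def toy_objective(x):
    # Hypothetical placeholder: relative distance from an arbitrary point.
    reference = np.array([4.0, 5.0, 0.03])
    return float(np.sum(np.abs(1.0 - np.asarray(x) / reference)))


# Search ranges (wt.%) for three elements; the values are placeholders.
bounds = [(3.0, 8.0), (0.0, 6.0), (0.0, 0.05)]
solution = design(toy_objective, bounds, allowed_score=0.2)
```

If `solution` is `None`, the remedies described above apply: raise `n_iter`, relax `allowed_score`, or adjust `bounds`.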

Considering the alloy design cost and the interpretability analysis results, Al, Cu, and Mg were selected as the search elements. Referring to the distributions in the dataset, the search ranges were set as 3–8 wt.% for the Al content, 0–6 wt.% for the Cu content, and 0–0.05 wt.% for the Mg content.

The mechanical properties of the dataset samples that contain Al, Cu, and Mg additions, together with the performance of the commercial Zn alloys specified in GB/T 1175-2018, are plotted in Fig. 6. The visualization of these sample points provides a design space with optimized overall mechanical properties. The target UTS was set to 300 MPa, with a permissible deviation of 20 MPa; the target EL was set to 3.5%, with an allowable deviation of 0.5%; and the target hardness was set to 130 HV, with a permissible deviation of 10 HV. After 50 iterations, the desired composition was successfully identified and designated as ZACM alloy, with a predicted Al content of 3.96 wt.%, Cu content of 5.83 wt.%, and Mg content of 0.05 wt.%. The predicted UTS, EL, and hardness were 298.52 MPa, 3.44%, and 128.64 HV, respectively, represented by the yellow star in Fig. 6. To verify the accuracy of the prediction, the ZACM alloy was prepared, and its composition was analyzed using inductively coupled plasma optical emission spectroscopy (ICP-OES). The actual Al, Cu, and Mg contents were found to be 4.01 wt.%, 5.92 wt.%, and 0.02 wt.%, respectively. The UTS, EL, and hardness values of the prepared ZACM alloy were 298 ± 14 MPa, 3.2 ± 0.2%, and 133 ± 2 HV, with the test results shown in Figure S4 and represented by the red star in Fig. 6. The errors between the experimental and predicted UTS and hardness values were both within about 3%, while that for EL was higher (7.5%). All errors were within the acceptable range, and the measured properties fell within the initially specified ranges.
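The comparison between prediction and experiment can be reproduced from the values quoted above. The sketch below assumes that the relative error is normalized by the measured mean, which reproduces the 7.5% quoted for EL. It works from the rounded values in the text, so results may differ slightly from figures derived from unrounded measurements.

```python
# (target, permissible deviation) for each property.
targets = {"UTS": (300.0, 20.0), "EL": (3.5, 0.5), "Hardness": (130.0, 10.0)}
predicted = {"UTS": 298.52, "EL": 3.44, "Hardness": 128.64}
measured = {"UTS": 298.0, "EL": 3.2, "Hardness": 133.0}  # reported means


def relative_error_percent(pred, meas):
    """Absolute prediction error as a percentage of the measured value."""
    return abs(pred - meas) / meas * 100.0


# Relative error of each prediction with respect to the measurement.
errors = {
    name: relative_error_percent(predicted[name], measured[name])
    for name in measured
}

# Check that each measured mean lies within target +/- permissible deviation.
in_spec = {
    name: abs(measured[name] - target) <= tolerance
    for name, (target, tolerance) in targets.items()
}
```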

**Fig. 6: ZACDS application and composition optimization.**

Microstructure analysis

To further evaluate the performance of the newly designed alloys, this work includes a comparison with the commercial zinc alloy ACuZinc5, known for its high strength and wide application. According to the ASTM 240-13 standard, the Al content of ACuZinc5 is 2.8–3.3 wt%, the Cu content is 5.2–6 wt%, and the Mg content is 0.035–0.05 wt%. The alloy was cast using the same melting process as the newly designed zinc alloy ZACM. The composition was analyzed using ICP-OES, and the results showed that the Al content was 3.3 wt%, the Cu content was 5.5 wt%, and the Mg content was 0.04 wt%, all of which conform to the standard. The tensile strength, elongation, and hardness values of ACuZinc5 were 253 MPa, 0.8%, and 128 HV, respectively. Compared to ACuZinc5, the newly designed ZACM zinc alloy exhibited a 17.8% improvement in tensile strength, along with increased elongation and hardness.
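As a quick check of the quoted improvement, using the measured mean values of ZACM reported earlier (298 MPa, 3.2%, and 133 HV):

$$\frac{298-253}{253}\times 100\%\approx 17.8\%,\qquad \frac{3.2}{0.8}=4,\qquad \frac{133-128}{128}\times 100\%\approx 3.9\%$$

That is, the tensile strength is about 17.8% higher, the elongation is four times that of ACuZinc5, and the hardness is about 3.9% higher.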

To further investigate the reasons for the performance enhancement of the new alloy, microstructural characterization was conducted. Figure S5 presents the optical microscopy results at different magnifications. Both ZACM and ACuZinc5 exhibit the precipitation of a second phase. The precipitates in ZACM appear shorter and smaller, while those in ACuZinc5 are more elongated. In addition to the precipitate phase, some regions with finer microstructures (such as the small area indicated by the red box) can be observed in Figure S5d. A similar region also appears to be visible in Figure S5h, although it is very small in area.

To clarify the composition of the precipitate phase and better observe the regions with fine microstructures, further observations were conducted using a scanning electron microscope. Figure S6 presents the SEM and EDS characterization results of the two alloys. Points 2 and 4 in the figure correspond to the precipitate phases observed under the optical microscope, and their composition analysis results are shown in Table S7. The Zn and Cu contents at these two points are highly consistent with the CuZn4 phase reported in the literature^{52,53}. Furthermore, these studies have shown that CuZn4 exhibits high hardness, which contributes to the improvement of the alloy hardness. Additionally, the overall elemental analysis and the characterization results at points 3 and 6 show that the regions with fine microstructures mainly consist of Zn and Al. This microstructure is highly consistent with the Zn-Al eutectic region reported by Priya et al.⁵⁴. Their work suggests that, with a higher zinc content in the alloy, each eutectic colony can be considered a grain containing a zinc-rich layer. Compared to ACuZinc5, the ZACM alloy exhibits a more distinct and abundant Zn-Al eutectic region. The presence of the Zn-Al eutectic region refines the Zn phase in the ZACM matrix and impedes the growth of CuZn4, making it shorter and smaller. This change leads to finer grains in ZACM, which, according to the Hall-Petch⁵⁵ grain refinement strengthening effect, significantly enhances the strength of the material.
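For reference, the Hall-Petch relation in its standard form is

$$\sigma_{y}=\sigma_{0}+k_{y}\,d^{-1/2}$$

where $\sigma_{y}$ is the yield strength, $\sigma_{0}$ is the friction stress of the lattice, $k_{y}$ is a material-dependent strengthening coefficient, and $d$ is the average grain size. Since $\sigma_{y}$ grows as $d$ decreases, a finer grain structure corresponds to higher strength. The symbols follow common textbook usage and are not taken from the original work, which reports no values for $\sigma_{0}$ or $k_{y}$.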
