17 May 2021 |
Research article |
Materials & Manufacturing
Bayesian Neural Network Predictive Model for Superconductor Materials
Much recent research has focused on using empirical machine learning (ML) approaches to extract useful insights into the structure-property relationships of superconductor materials. However, such an assessment cannot rest solely on a black-box machine learning model, which is not fully interpretable: it can be hard to understand why the model gives an appropriate response to a set of input data when analyzing superconductivity characteristics such as the critical temperature. This study describes and examines an alternative approach for predicting the superconducting transition temperature Tc from the SuperCon database of Japan’s National Institute for Materials Science. The authors propose a generative machine-learning framework, called the Variational Bayesian Neural Network, that uses a superconductor’s chemical elements and formula to predict Tc.
Keywords: critical transition temperature, machine learning, Bayesian neural network, variational inference, stochastic optimization algorithm, high-temperature superconducting (HTS).
Variational Inference for Bayesian Neural Network
The Bayesian-based generative model prediction is advantageous for two key reasons: (i) uncertainty is intrinsically described, which is useful for analysis and prediction; (ii) overfitting is avoided through the natural penalization of overly complicated models. This paper therefore describes the development of a novel machine learning technique for predicting the critical temperature of a superconductor using variational Bayesian neural networks (VBNN). As shown in Fig. 1, obtaining the predictive model is an example of a Bayesian inference framework. The difference between the model’s expected prediction and the peak of the likelihood function (i.e. reality) is taken as the prediction error. The variance of the prediction is the key element controlling the uncertainty of the model, and the variance of the likelihood function is known as the noise in the model.
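For readers who want the framework in symbols, the standard Bayesian formulation can be written as follows. The notation is an assumption made here for illustration and is not taken from the paper: w denotes the model weights, D the training data, and (x*, y*) a new input and its predicted output.

```latex
p(w \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid w)\, p(w)}{p(\mathcal{D})},
\qquad
p(y^{*} \mid x^{*}, \mathcal{D}) = \int p(y^{*} \mid x^{*}, w)\, p(w \mid \mathcal{D})\, dw
```

The mean of the predictive distribution on the right gives the predicted value, and its variance quantifies the uncertainty of that prediction.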
Technically, a Bayesian linear regression model is used to approximate the generating function. A cost function is computed, and gradient descent optimization is used to find the weight matrix that minimizes it. A probabilistic model built on this matrix and the evidence is then used to predict the output for a new input. The problem is to find the best hypothesis over the entire range of hypotheses. This is done with a likelihood function, which scores how probable the observed data are under each hypothesis. The Kullback-Leibler (KL) divergence is used to obtain an approximate posterior inference. A quantity called the Evidence Lower Bound (ELBO), defined as the difference between the expected log-likelihood and the KL divergence derived from the inference model, is then optimized. To evaluate the predictive model, two error metrics are considered: the root mean square error (RMSE) and the R-squared (R2) value.
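In its usual textbook form, the ELBO for an approximate posterior q(w) over the weights reads as follows. The symbols are assumptions chosen for illustration (p(w) is the prior, D the training data) and may differ from the paper's notation.

```latex
\mathrm{ELBO}(q) = \mathbb{E}_{q(w)}\big[\log p(\mathcal{D} \mid w)\big]
                 - \mathrm{KL}\big(q(w) \,\|\, p(w)\big),
\qquad
\log p(\mathcal{D}) = \mathrm{ELBO}(q) + \mathrm{KL}\big(q(w) \,\|\, p(w \mid \mathcal{D})\big)
```

Because the log-evidence on the left of the second identity does not depend on q, maximizing the ELBO is equivalent to minimizing the KL divergence between q(w) and the true posterior.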
Critical Temperature Prediction Model
We skip the mathematical underpinnings of the VBNN model and the preliminary background on inference models. Building on the explanation in the previous section, Fig. 2 presents a conceptual Tc prediction method that applies the VBNN model and evaluates its predictions. Each xi is a superconducting material described by its full formula and chemical elements, and each yi is the target Tc that the predictive model tries to predict. The VBNN therefore learns and approximates the distribution of the noise λ and of the weight wi for each superconductor material.
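To make the input xi more concrete, here is a minimal sketch of one way a chemical formula could be turned into a numeric feature vector of element counts. This is an illustrative assumption, not the featurization used in the paper: the function names `element_counts` and `feature_vector` are hypothetical, and the parser only handles flat formulas without parentheses.

```python
import re
from collections import Counter
from typing import Dict, List

# An element symbol (capital letter, optional lowercase letter) followed by
# an optional integer or decimal amount, e.g. "Ba2" or "La1.85".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def element_counts(formula: str) -> Dict[str, float]:
    """Parse a flat formula such as 'YBa2Cu3O7' into element amounts.

    Assumption: no parentheses or nested groups appear in the formula.
    """
    counts: Counter = Counter()
    for symbol, amount in _TOKEN.findall(formula):
        # A missing amount means one atom of that element.
        counts[symbol] += float(amount) if amount else 1.0
    return dict(counts)


def feature_vector(formula: str, elements: List[str]) -> List[float]:
    """Return the amounts of `elements` in `formula`, in the given order."""
    counts = element_counts(formula)
    return [counts.get(element, 0.0) for element in elements]


# Hypothetical usage with a well-known cuprate formula.
print(element_counts("YBa2Cu3O7"))
print(feature_vector("MgB2", ["B", "Mg", "O"]))
```

A fixed element ordering gives every material a vector of the same length, which is what a regression model needs as input.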
As a consequence, this work improves the interpretability of the structure-property relationship of superconductors. To that end, the authors approximated the distribution in the latent parameter space using variational inference and evaluated the model’s predictive performance. They used a stochastic optimization algorithm called the Monte Carlo sampler. One advantage of such generative neural networks is that the models can be compared directly with the training data, without a validation set. Another is that overfitting (common in ML applications) can be avoided because the model inherently penalizes the parameters. These advantages address limitations of other machine learning models, such as random forests and SVMs, in terms of choice of parameters/kernels, memory requirements, and computational resources.
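The idea of variational inference driven by Monte Carlo sampling can be illustrated on a deliberately tiny problem. The sketch below is not the authors' VBNN. It assumes a toy model y = w·x + noise with a single weight, a standard normal prior on w, a known noise level, and hypothetical data. A Gaussian q(w) is fitted by gradient descent on the negative ELBO, with the likelihood gradient estimated from reparameterized samples, and the result is compared with the closed-form posterior that exists for this simple case.

```python
import math
import random


def fit_variational_posterior(xs, ys, noise_std=0.5, steps=3000,
                              lr=0.002, n_samples=10, seed=0):
    """Fit q(w) = N(mu, sigma^2) for the toy model y = w * x + noise.

    Assumptions (illustrative only): one weight, a N(0, 1) prior on w,
    and known Gaussian noise with standard deviation `noise_std`.
    """
    rng = random.Random(seed)
    mu, rho = 0.0, -1.0  # sigma = exp(rho) keeps sigma positive
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    for _step in range(steps):
        sigma = math.exp(rho)
        grad_mu, grad_rho = 0.0, 0.0
        for _sample in range(n_samples):
            eps = rng.gauss(0.0, 1.0)
            w = mu + sigma * eps  # reparameterized sample from q
            # Gradient of the negative log-likelihood with respect to w.
            dnll_dw = (w * sxx - sxy) / noise_std ** 2
            grad_mu += dnll_dw
            grad_rho += dnll_dw * eps * sigma
        # Average the Monte Carlo estimates, then add the exact gradient of
        # KL(q || prior) = 0.5 * (sigma^2 + mu^2 - 1) - rho.
        grad_mu = grad_mu / n_samples + mu
        grad_rho = grad_rho / n_samples + sigma ** 2 - 1.0
        mu -= lr * grad_mu  # gradient descent on the negative ELBO
        rho -= lr * grad_rho
    return mu, math.exp(rho)


def exact_posterior(xs, ys, noise_std=0.5):
    """Closed-form Gaussian posterior for the same conjugate toy model."""
    precision = 1.0 + sum(x * x for x in xs) / noise_std ** 2
    mean = sum(x * y for x, y in zip(xs, ys)) / noise_std ** 2 / precision
    return mean, 1.0 / math.sqrt(precision)


# Hypothetical one-dimensional data, roughly following y = 2x.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
vi_mean, vi_std = fit_variational_posterior(xs, ys)
exact_mean, exact_std = exact_posterior(xs, ys)
print("variational:", vi_mean, vi_std)
print("exact:      ", exact_mean, exact_std)
```

The KL term acts as the built-in penalty on the parameters mentioned above: it pulls q(w) toward the prior, while the likelihood term pulls it toward the data.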
Comparison Between Performance of VBNN and Other Techniques
The presented Bayesian regression approach can be applied directly to predict the critical temperature of a superconductor, as shown in Table I. Our R2 score shows strong overall concordance with previous predictions (R2 = 0.94). In addition, a significant improvement was obtained in the RMSE, at 3.83 K. These results illustrate how VBNN performs relative to other techniques.
Table I. Numerical result comparison
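For reference, the two metrics used in the comparison, RMSE and R2, can be computed as in the short sketch below. The function names and the sample values are hypothetical and serve only to show the arithmetic. They are not data from the study.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error, in the same unit as the target (here K)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted Tc values in kelvin.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(rmse(measured, predicted))       # sqrt(18 / 4), about 2.12 K
print(r_squared(measured, predicted))  # 1 - 18 / 500 = 0.964
```

RMSE reports the typical size of the error in kelvin, while R2 reports the fraction of the variance in the measured Tc that the predictions explain.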
Our results are encouraging. However, the experiments should be replicated to confirm that they are reproducible. First, an important direction for future studies is to use the pre-trained VBNN predictive model and validate its performance on different superconductor datasets. One possibility is to adapt the “transfer learning” paradigm to take advantage of the VBNN’s optimized hyper-parameters. Second, future work should focus on exploring feasible compounds as new superconductors. It would be beneficial to get initial feedback on the accuracy and efficiency of alternative compounds before conducting costly, arduous experiments in practice.
Materials data science, specifically in superconductor exploration, is in the early stages of ML adoption. There is a growing number of single-use applications, but more intelligible models are yet to be seen. In this work, we developed a new probabilistic approach using a variational Bayesian neural network to estimate the Tc value of high-temperature superconductors.
Our results are in general agreement with existing studies of Tc predictive models. These preliminary results demonstrate the feasibility of using a generative neural network, which provides helpful evidence for understanding the underlying superconductivity physics. This finding is promising and should be investigated with other advanced predictive models, which may eventually lead to the discovery of new superconductors.
For more information on this research, please read the following research paper:
Le, T. D., Noumeir, R., Quach, H. L., Kim, J. H., Kim, J. H., & Kim, H. M. (2020). “Critical temperature prediction for a superconductor: A variational Bayesian neural network approach”. IEEE Transactions on Applied Superconductivity, 30(4), 1-5.
Thanh Dung Le
Thanh Dung Le is a PhD student in Electrical Engineering at ÉTS. His research interests are Bayesian inference and regularization for model uncertainty.
Program : Electrical Engineering
Rita Noumeir is a professor in the Electrical Engineering Department at ÉTS. Her research includes applying artificial intelligence methods to create decision support systems as well as video and image processing.
Program : Electrical Engineering