Abstract: In recent years, data-driven machine learning models have begun to offer alternative approaches and outperform pure physics-driven models in many tasks.

This article is shared from the Huawei Cloud community post "How to Introduce Knowledge into Machine Learning Models to Improve Generalization", author: PG13.

Physics-based models are at the heart of today’s technology and science. In recent years, data-driven machine learning models have begun to offer alternative approaches and to outperform purely physics-driven models in many tasks. However, training data-driven models requires large amounts of data, their decision-making can be difficult to interpret, and their generalization performance remains a challenge. By combining data and physics, you can get the best of both worlds.

When a machine learning algorithm learns, it is actually searching for a solution in a hypothesis space defined by your chosen algorithm, architecture, and configuration. Even for simple algorithms, the hypothesis space can be quite large, and the data is our only guide for finding a solution in this vast space. But what if we could use our knowledge of the world (physics, for example) alongside the data to guide that search?

How can physics guide machine learning algorithms?

There are two main ways to use physics to guide a machine learning model: (1) use physical theory to compute additional features (feature engineering) and feed them into the model together with the measured values during training; (2) add a physical-inconsistency penalty term to the loss function to penalize predictions that are inconsistent with physics.

The first approach, feature engineering, is widely used in machine learning. The second is much like adding a regularization term to penalize overfitting: a physical-inconsistency penalty is added to the loss function, so that when tuning the parameters, the optimization algorithm must also minimize physically inconsistent predictions.

In their paper [1], Karpatne et al. combine these two approaches with neural networks and demonstrate an algorithm they call physics-guided neural networks (PGNN). PGNN offers two main advantages:

  • Achieving generalization is a fundamental challenge in machine learning. Since most physical models do not rely on data, they can perform well on data they have never seen before, even data drawn from a different distribution.

  • Machine learning models are sometimes called black-box models because it is not always clear how a model reaches a particular decision. There is a great deal of ongoing work on **explainable AI (XAI)** to improve model interpretability. PGNN can provide a basis for XAI because it can produce physically consistent and explainable results.

Application example: lake temperature modeling

In the paper [1], lake temperature modeling is used as an example to demonstrate the effectiveness of PGNN. Water temperature governs the growth, survival, and reproduction of the species living in a lake, so accurate temperature observations and predictions are critical to understanding the changes taking place in such ecosystems. The task in the paper is to build a model that predicts the water temperature of a lake at a given depth and time.

Now, let’s see how they apply (1) feature engineering and (2) loss-function modification to this problem. For feature engineering, they use a physics-based model called the General Lake Model (GLM) to generate new features that are fed into the neural network; GLM captures the processes that govern the dynamics of a lake’s temperature (heating by the sun, evaporation, and so on). How, then, do we define the physical-inconsistency term? Denser water is known to sink, and the physical relationship between water’s temperature and its density is well established. Our model should therefore respect the fact that the deeper the point, the higher the predicted density. If, for a pair of points, the model predicts a higher density for the point closer to the surface of the lake, that prediction is physically inconsistent.
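The temperature-density relationship mentioned above can be written down explicitly. As a sketch, here is one standard empirical formula for the density of fresh water as a function of temperature, widely used in lake modeling (the paper may use a slightly different variant; the function name is illustrative):

```python
def water_density(t_celsius: float) -> float:
    """Empirical density of fresh water (kg/m^3) as a function of
    temperature in Celsius. Density peaks near 3.98 C, so in a stably
    stratified lake the deeper water should be the denser water."""
    t = t_celsius
    return 1000.0 * (
        1.0 - (t + 288.9414) * (t - 3.9863) ** 2
        / (508929.2 * (t + 68.12963))
    )
```

Converting predicted temperatures to densities with a function like this is what lets the model's temperature outputs be checked against the depth-density relation.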

After the above analysis, this idea can be incorporated into our loss function. Let ρA be the predicted density at the shallower point and ρB the predicted density at the deeper one. If ρA > ρB, the prediction violates physical consistency and should be penalized; otherwise it should not. This can be done simply by adding the value of max(ρA − ρB, 0) to the loss function: when ρA > ρB (a physical inconsistency), this term is positive and increases the loss; otherwise it is zero and leaves the loss unchanged.

At this point, we need to make two changes to this term: (1) we must account for the physical inconsistencies of all pairs of points, not just one pair, so the max(ρA − ρB, 0) values of all point pairs are averaged; (2) we need to control the weight of the physical-inconsistency penalty, which can be done by multiplying the averaged term by a hyperparameter λ (analogous to a regularization parameter). The resulting loss has the form:

Loss = MSE + λ · (1/N) Σᵢ max(ρA,ᵢ − ρB,ᵢ, 0)
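The modified loss described above can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation; the function and argument names (`physics_guided_loss`, `rho_shallow`, `rho_deep`, `lam`) are assumptions made for this example:

```python
import numpy as np

def physics_guided_loss(y_pred, y_true, rho_shallow, rho_deep, lam=0.1):
    """Total loss = MSE + lam * mean pairwise physical inconsistency.

    rho_shallow[i] / rho_deep[i] are the densities implied by the model's
    temperature predictions at the shallower and deeper point of pair i;
    lam is the physics hyperparameter weighting the penalty."""
    mse = np.mean((y_pred - y_true) ** 2)
    # Penalize only the pairs where the shallower point is predicted denser.
    inconsistency = np.mean(np.maximum(rho_shallow - rho_deep, 0.0))
    return mse + lam * inconsistency
```

Because `max(x, 0)` (a ReLU) is zero for physically consistent pairs, the gradient of the penalty only pushes on predictions that actually violate the depth-density relation.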

The paper compares the results of four models:

  • PHY: the General Lake Model (GLM), a purely physics-based model

  • NN: a plain neural network

  • PGNN0: a neural network with feature engineering; the GLM model's output is fed to the network as an additional input feature.

  • PGNN: a neural network with both feature engineering and the modified loss function.
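The feature-engineering step that distinguishes PGNN0 (and PGNN) from the plain NN amounts to appending the physics model's prediction as one more input column. A minimal sketch, where the function name and the shapes of `X` and `glm_pred` are assumptions for illustration:

```python
import numpy as np

def add_glm_feature(X: np.ndarray, glm_pred: np.ndarray) -> np.ndarray:
    """PGNN0-style feature engineering: append the physics-based GLM
    temperature prediction as an extra input column, so the neural
    network sees both the raw drivers and the physics model's estimate."""
    return np.hstack([X, glm_pred.reshape(-1, 1)])
```

The augmented matrix is then used as the neural network's training input in place of `X`.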

and two evaluation metrics:

RMSE: the root mean square error of the predictions.

Physical inconsistency score: the percentage of the model’s predictions that violate physical consistency.
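Both metrics are straightforward to compute. As a sketch (function names are illustrative; the inconsistency score is expressed here as a fraction of shallow/deep pairs that violate the depth-density relation):

```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean square error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def physical_inconsistency_score(rho_shallow: np.ndarray,
                                 rho_deep: np.ndarray) -> float:
    """Fraction of shallow/deep pairs whose predicted densities violate
    the depth-density relation (shallower point predicted denser)."""
    return float(np.mean(rho_shallow > rho_deep))
```

Note that the two metrics can move independently: a model can have a low RMSE while still producing many physically inconsistent pairs, which is exactly the trade-off the comparison below highlights.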

Comparing NN with PHY shows that NN provides more accurate predictions, but at the cost of physically inconsistent results. Comparing PGNN0 with PGNN shows that modifying the loss function eliminates the physical inconsistency, while the improvement in prediction accuracy comes partly from the feature engineering and partly from the modified loss function.

Overall, these preliminary results suggest that PGNN is very promising for producing results that are both more accurate and physically consistent. Moreover, encoding physics knowledge in the loss function improves the generalization performance of the machine learning model. This deceptively simple idea has the potential to fundamentally improve the way we approach machine learning in scientific research.

References

[1] Physics-guided Neural Networks (PGNN): An Application in Lake Temperature Modeling.

[2] Theory-guided Data Science: A New Paradigm for Scientific Discovery from Data.
