Brief introduction
By reading this article, you will learn the following:
- What is transfer learning
- Why transfer learning
- Advantages of transfer learning
- Classification of transfer learning methods
- Future prospects of transfer learning
What is transfer learning?
Generally speaking, a Chinese idiom describes it best: drawing inferences about other cases from one instance. Transfer learning leverages knowledge derived from similar tasks, as well as valuable lessons learned in the past, to facilitate the learning of new problems. The key is to find the common ground between existing knowledge and new knowledge.
For example, after Xiaobao learns to ride a bicycle, he can quickly pick up electric bikes and motorcycles, but driving a car still has to be learned systematically from scratch. Likewise, if you learn programming and start with C, a basic knowledge of C lets you quickly pick up Python, Java, and other programming languages, but C will not help you learn Japanese.
In transfer learning, researchers usually divide data into source data and target data. Source data is data that is not directly related to the task to be solved; it is usually available in large quantities. Target data is directly related to the task, and its amount is generally small. In the example above, the bicycle corresponds to source data, while the motorcycle and the electric bike correspond to target data.
What transfer learning does is make full use of the source data to help the model improve its performance on the target data.
For example, Xiaobao works on V-I trajectory recognition for NILM (non-intrusive load monitoring) electricity meters. The related public datasets contain at most tens of thousands of samples, and no pre-trained model for NILM has been proposed, but there are many image-recognition models, such as AlexNet, VGG-16, GoogLeNet, and ResNet-50. These models were carefully trained on millions of images. If we transfer them to NILM, the accuracy of V-I trajectory recognition can be greatly improved.
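To make this concrete, here is a minimal PyTorch sketch of reusing an ImageNet-pretrained ResNet-50 for V-I trajectory images. It is only an illustration of the idea; `NUM_APPLIANCES` is a hypothetical placeholder for the number of appliance classes in your NILM dataset.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_APPLIANCES = 11  # hypothetical: depends on the NILM dataset at hand

# Backbone weights were learned on millions of ImageNet images.
model = models.resnet50(pretrained=True)
# Replace the 1000-class ImageNet head with a NILM-specific head.
model.fc = nn.Linear(model.fc.in_features, NUM_APPLIANCES)

# V-I trajectories rendered as 3-channel images can now be classified.
dummy = torch.randn(1, 3, 224, 224)
logits = model(dummy)  # shape: (1, NUM_APPLIANCES)
```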
In transfer learning, if the knowledge correlation between the source domain and the target domain is low, the transfer will perform poorly; this is called **negative transfer**. For example, transferring a model trained on text data to an image task tends to work poorly. But text-to-image transfer is not hopeless: we can connect two seemingly unrelated fields through one or more intermediate fields. This is called **transitive transfer learning**, which is itself one of the hot topics among researchers.
For example, to transfer between text and images, the paper Transitive Transfer Learning (Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '15) uses annotated images as the intermediate field.
Why transfer learning?
In the fields of AI (artificial intelligence) and ML (machine learning), the incentive to use transfer learning is stronger than ever, as it can address two major limitations: the large amount of training data required and the cost of training.
As for why transfer learning should be carried out, Professor Jindong Wang and colleagues summarized it in four aspects:
- Contradiction between big data and scarce annotation: in the era of big data, massive data is generated all the time, but most of it lacks annotations. Training and updating machine-learning models depend on annotated data, and only a small fraction of data is currently annotated.
- Contradiction between big data and weak computing: massive data requires huge storage and computing power, and strong computing power is very expensive. In addition, training on massive data takes a great deal of time.
- Contradiction between the universal model and personalized demands: machine learning aims to build a model as general as possible so that it serves different users, devices, environments, and demands, which requires high generalization ability. In practice, however, a one-size-fits-all model cannot satisfy personalized, differentiated needs, which leads to a contradiction between the model and personalized needs.
- Requirements of specific applications: in reality there are often specific applications, such as the cold-start problem of recommender systems, that require us to reuse existing models or knowledge as much as possible.
To sum up, when we use artificial intelligence to solve problems, the biggest obstacles are the large amounts of data and parameters required for model training. On the one hand, we usually cannot obtain the (annotated) data required to build a model; on the other hand, training the model takes a lot of time. Transfer learning, by contrast, can significantly improve the learning performance of traditional AI techniques by leveraging valuable knowledge and previous experience from similar tasks.
Advantages of transfer learning
In conclusion, compared with previous machine learning and deep learning, transfer learning has the following advantages:
- Improving the quality and quantity of training data: transfer learning selects and transfers knowledge from similar fields that have large amounts of high-quality data.
- Speeding up the learning process: learning can be significantly accelerated with the benefit of shared and/or past knowledge from other similar fields.
- Reducing computation: in transfer learning, most of the training is done in other source domains before the trained model is transferred to the target domain, greatly reducing the computation required in the target domain.
- Reducing communication overhead: there is no need to send large amounts of raw data; only the knowledge needs to be transferred.
- Protecting data privacy: instead of learning from the raw data of other domains, users can learn from models trained by others (represented by their weights), thus protecting data privacy.
Classification of transfer learning methods
Sample-based transfer
Sample-based transfer selects, from the source-domain dataset, the samples with high similarity to the target-domain data according to a similarity-matching principle, and transfers these samples to the target domain to help the learning of the target-domain model, thereby alleviating the problem of insufficient or missing labeled samples in the target domain.
In general, sample weights are trained according to the similarity between the source domain and the target domain. Source-domain samples with high similarity are considered strongly correlated with the target-domain data, which benefits learning in the target domain, so their weights are increased; otherwise, the weights are reduced.
The traditional method is sample weighting: a discriminative method distinguishes source data from target data, while kernel mean matching and function-estimation methods estimate the weights. However, the density ratio between the source and target domains is difficult to compute (measured by distances such as MMD and KL divergence).
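To make the distance measure concrete, here is a minimal NumPy sketch of the (biased) Gaussian-kernel MMD estimate between two sample sets. It illustrates the metric itself, not any particular weighting algorithm from the literature.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel values between rows of a and rows of b.
    d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd(xs, xt, sigma=1.0):
    # MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)]
    return (gaussian_kernel(xs, xs, sigma).mean()
            + gaussian_kernel(xt, xt, sigma).mean()
            - 2 * gaussian_kernel(xs, xt, sigma).mean())

xs = np.random.randn(100, 5)         # source samples
xt = np.random.randn(100, 5) + 0.5   # shifted target samples
print(mmd(xs, xt))                   # larger value = larger domain gap
```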
Transfer based on model parameters
Model-based transfer learning shares common knowledge between the source task and the target task at the model level, including model parameters, model prior knowledge, and model architecture. It can be divided into two types: knowledge transfer based on shared model components, and regularization-based knowledge transfer. The former uses the model components or hyperparameters of the source domain to determine the target-domain model; the latter prevents overfitting by limiting model flexibility.
In layman's terms, the model is pre-trained with a large amount of source-domain data, the resulting weight parameters are then transferred, and finally the fully connected layer is retrained with a small amount of target data.
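A minimal sketch of this freeze-and-retrain procedure in PyTorch, continuing the hypothetical ResNet-50 example above (the random batch stands in for a small labeled target-domain dataset):

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(pretrained=True)
for p in model.parameters():
    p.requires_grad = False                     # freeze source knowledge
model.fc = nn.Linear(model.fc.in_features, 11)  # new trainable head; 11 is hypothetical

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)            # stand-in target batch
labels = torch.randint(0, 11, (8,))
loss = criterion(model(images), labels)
loss.backward()                                 # gradients reach only model.fc
optimizer.step()
```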
Feature-based transfer
The core of the feature-transfer method is to find representative features shared by the source domain and the target domain that weaken the differences between the two domains, so that knowledge can be transferred and absorbed across domains.
Depending on whether the original features are retained, the feature-transfer method can be further divided into feature-extraction transfer learning and feature-mapping transfer learning. Its advantage is that the similarity between models can be exploited; its disadvantage is that the model parameters are not easy to converge.
Feature-extraction transfer method
Definition: reuse a local network pre-trained in the source domain, turning it into a part of the deep network in the target domain.
A CNN model is usually used as the feature extractor, and the network is then fine-tuned with a small amount of data. CNN models trained with different fine-tuning strategies perform differently, so the fine-tuning strategy is the focus of this kind of method. At present the main strategy is multi-scale feature transfer, which fine-tunes different network layers to learn the characteristics of the target data.
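One common way to realize layer-wise fine-tuning (a hedged sketch, not the exact recipe of any particular paper) is to give different learning rates to different depths, so that early, generic layers change slowly while the task-specific head adapts quickly:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 11)  # hypothetical class count

# Early layers encode generic edges/textures: tiny learning rate.
# Later layers and the new head are more task-specific: larger rates.
optimizer = torch.optim.SGD([
    {"params": model.layer1.parameters(), "lr": 1e-5},
    {"params": model.layer2.parameters(), "lr": 1e-5},
    {"params": model.layer3.parameters(), "lr": 1e-4},
    {"params": model.layer4.parameters(), "lr": 1e-4},
    {"params": model.fc.parameters(),     "lr": 1e-2},
], momentum=0.9)
```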
Feature-mapping transfer method
Definition: map instances from the source and target domains into a new data space in which instances from both domains have similar data distributions, suitable for training a joint deep neural network.
The feature-mapping transfer method expands the training set and enhances the effect of transfer learning by adjusting the marginal or conditional distribution of the source-domain data.
Compared with the feature-extraction method, feature-mapping transfer learning is more complicated: first, a common feature representation of the source domain and the target domain is found, and then the data are mapped from the original feature space into the new feature space.
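As one deliberately simple stand-in for such a mapping, the following sketch performs CORAL-style correlation alignment: it whitens the source features and re-colors them with the target covariance, so that both domains share second-order statistics. This illustrates the general idea of feature mapping, not any specific method discussed above.

```python
import numpy as np

def coral(xs, xt, eps=1e-5):
    # Regularized covariances of source and target features.
    cs = np.cov(xs, rowvar=False) + eps * np.eye(xs.shape[1])
    ct = np.cov(xt, rowvar=False) + eps * np.eye(xt.shape[1])

    def sqrtm(m, inverse=False):
        # Matrix square root via eigendecomposition (m is SPD).
        w, v = np.linalg.eigh(m)
        w = 1.0 / np.sqrt(w) if inverse else np.sqrt(w)
        return v @ np.diag(w) @ v.T

    # Whiten source features, then re-color with target statistics.
    return xs @ sqrtm(cs, inverse=True) @ sqrtm(ct)

xs = np.random.randn(200, 16) * 3.0   # source features, different scale
xt = np.random.randn(300, 16)         # target features
xs_aligned = coral(xs, xt)            # source now matches target covariance
```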
Feature-based transfer learning is broadly applicable and can be used whether or not the source-domain and target-domain data are labeled. However, when the data are labeled, the measure of domain invariance is not easy to compute; when the data are unlabeled, it is difficult to learn cross-domain common features.
Future prospects of transfer learning
Transfer learning combined with generative adversarial networks
Generative Adversarial Networks (GANs) are a kind of deep learning model and one of the most promising approaches to unsupervised learning on complex distributions. A GAN produces fairly good outputs through game-style learning between (at least) two modules in its framework: the generative model and the discriminative model.
GANs have become a widely used data-augmentation method in recent years. They can generate synthetic samples similar to real samples, so that the training set can be expanded and model performance improved.
Combining GANs with transfer learning yields the domain-adversarial neural network (DANN; its structure is shown in the figure below). DANN directly optimizes the loss on the source domain, optimizes the HΔH-distance between the source domain and the target domain by adversarial means, and thereby minimizes an upper bound on the target-domain loss. However, DANN is relatively difficult to train, and it is hard to extend from a single source domain to multiple source domains, which is a problem still to be solved.
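A minimal sketch of DANN's core trick, the gradient reversal layer (GRL): in the forward pass it is the identity, while in the backward pass it flips (and scales) the gradient sign, so the feature extractor is trained to fool the domain discriminator and thus produce domain-invariant features.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)                      # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None      # reversed, scaled gradient

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Usage: features -> gradient reversal -> domain classifier.
feats = torch.randn(4, 128, requires_grad=True)
domain_clf = nn.Linear(128, 2)                   # source vs. target
domain_logits = domain_clf(grad_reverse(feats))
```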
Transfer learning combined with attention mechanisms
The attention mechanism originates from the study of human vision. In cognitive science, because of bottlenecks in information processing, humans selectively focus on a portion of all available information while ignoring the rest. The attention mechanism has two main aspects: deciding which parts of the input to focus on, and allocating limited information-processing resources to the important parts. (Baidu Encyclopedia)
In the paper Transferable Attention for Domain Adaptation, the TADA method is proposed (the figure below shows the TADA structure): an attention mechanism selects transferable images and key regions within images to improve model performance. Two transfer processes combined with the attention mechanism are proposed: transferable local attention and transferable global attention.
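To make attention-weighted features concrete, here is a tiny generic sketch of soft attention over a set of region features; it illustrates the mechanism only and is not the TADA architecture itself.

```python
import torch
import torch.nn as nn

class RegionAttention(nn.Module):
    # Score each region, softmax into weights, pool a weighted summary.
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # one attention score per region

    def forward(self, regions):          # regions: (batch, K, dim)
        weights = torch.softmax(self.score(regions), dim=1)  # (batch, K, 1)
        return (weights * regions).sum(dim=1)                # (batch, dim)

att = RegionAttention(dim=256)
regions = torch.randn(2, 49, 256)  # e.g. a 7x7 conv feature map as 49 regions
pooled = att(regions)              # attention-weighted image representation
```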
The attention mechanism can improve model accuracy to a certain extent, but it also consumes considerable computing resources. Various lightweight attention mechanisms have therefore been proposed in recent years, although they sacrifice some model accuracy. How to combine lightweight attention mechanisms with transfer learning more effectively while preserving accuracy remains to be studied.
Transfer learning combined with federated learning
Federated learning is a machine-learning setting in which many clients work together to train a model under the coordination of a central server, while the training data remain decentralized and distributed. The long-term goal of federated learning is to analyze and learn from data held by multiple owners without exposing the data (purpose: to break down data silos).
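A minimal sketch of the server-side coordination step in the spirit of FedAvg: clients train locally and upload only their weights, and the server averages them; no raw data ever leaves a client. The function and variable names here are illustrative, not from any specific framework.

```python
import torch

def federated_average(client_states, client_sizes):
    # Average client weights, weighted by each client's local dataset size.
    total = sum(client_sizes)
    return {key: sum(state[key] * (n / total)
                     for state, n in zip(client_states, client_sizes))
            for key in client_states[0]}

# Illustrative: three clients holding identically shaped linear models.
clients = [torch.nn.Linear(10, 2) for _ in range(3)]
states = [c.state_dict() for c in clients]
global_state = federated_average(states, client_sizes=[100, 300, 50])
```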
At the 2020 Global Artificial Intelligence and Robotics Summit (CCF-GAIR 2020), Professor Yang Qiang introduced the key technologies and application cases of federated learning, and further introduced the newly launched combined research on federated learning and transfer learning, as well as key research directions for the future.
Professor Yang Qiang said, "We cannot build AI without people. Protecting people's privacy is a particularly important point in the current development of AI, and it is a requirement from governments, individuals, enterprises, and society. In addition, AI must protect models against malicious and non-malicious attacks. Data privacy has become an obstacle that AI development has to overcome."
Federated transfer learning (FTL) applies homomorphic encryption and polynomial approximation in place of differential privacy, providing a safer and more reliable approach for specific industries. At the same time, thanks to the characteristics of transfer learning, FTL participants can have their own feature spaces without all participants being forced to hold or use data with the same features, which makes FTL suitable for more application scenarios.
Measurement of similarity between domains
Because the performance of transfer learning depends largely on the similarity between domains, the study of measurement methods is one of the important future fields of transfer learning. The accuracy of the measurement and the convenience of its computation will affect the development of transfer learning.
Multi-source domain knowledge transfer
Knowledge from a single source domain is limited. If we can combine and transfer knowledge learned from multiple domains, we increase the chance of finding source domains better matched to the target domain, thereby improving the efficiency and effect of transfer learning, making it more secure and stable, and effectively avoiding negative transfer.
Conclusion
As a young research field, transfer learning achieves excellent performance by using existing knowledge to help training in the target field. Transfer learning can also be combined with many other methods, such as federated learning and attention mechanisms, with excellent results. Transfer learning still faces problems and challenges: negative transfer, which reduces accuracy rather than improving the model; and inter-domain measurement, which is too complicated and still lacks a good method.
But as the noted Stanford University professor Andrew Ng said at the 2016 NIPS conference, "Transfer learning will be the next driver of machine learning commercial success after supervised learning." As research deepens, transfer learning will become another shining star in the field of artificial intelligence.
References
- Jindong Wang et al. Transfer Learning Tutorial. 2018.
- Domain-Adversarial Training of Neural Networks (DANN)
- Transferable Attention for Domain Adaptation
- A Review of Transfer Learning Research
- Federated Learning (FL)
- Federated Learning or Transfer Learning? No, We Need Federated Transfer Learning
Afterword
I am Battlefield Xiaobao, a fast-growing front-end developer, and I hope to make progress together with you.
If you like Xiaobao, you can follow me on Juejin (Nuggets), and you can also follow my WeChat official account: Xiaobao Learns Front End.
All the way to the future!