Transfer learning is a very powerful idea in machine learning: we take the knowledge a neural network learned from one task and apply it to another task.
“Transfer learning will be the next driver of machine learning success,”
says Andrew Ng in his famous NIPS 2016 tutorial.
Understanding transfer learning, briefly:
Say a model has learned to classify dog breeds, and we then apply it to the job of reading x-ray scans. This technique is called transfer learning.
The traditional machine learning approach is to train separate networks for task A and task B, but with transfer learning we can transfer the knowledge learned from task A to task B.
How to use transfer learning?
I will explain how we do transfer learning using the same example: transferring knowledge from a task of classifying dog breeds to a task of reading x-ray scans.
First, there is a neural network trained on an (X, Y) dataset, where X is a set of images and Y labels some object in each image.
If we want to transfer or adapt this network to a different task such as x-ray diagnosis, we take the output layer and delete it, along with its weights. We then randomly initialize weights for a new last layer and have it output the x-ray diagnoses.
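A minimal sketch of this layer swap, using NumPy arrays as stand-ins for the network's weight matrices (all layer sizes here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the pretrained network's weights (hypothetical sizes):
# a hidden layer learned on the dog-breed task, plus its 120-breed output layer.
W_hidden = rng.normal(size=(256, 64))
W_out_dogs = rng.normal(size=(64, 120))

# Transfer: keep the hidden layer, delete the old output layer and its
# weights, and randomly initialize a new output layer for the x-ray task.
n_diagnoses = 2
W_out_xray = rng.normal(scale=0.01, size=(64, n_diagnoses))
```

The hidden layer's weights are kept exactly as learned; only the new output layer starts from scratch.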
Training on new data:
For transfer learning, we now swap in a new (X, Y) dataset, where X is x-ray images and Y is the diagnosis to predict, initialize the weights for the last layer, and train the network on this new dataset.
Different options for doing this:
We have a couple of options for how to train it. The rule of thumb: if we have little data, retrain only the last layer; if there is enough data, we can retrain all the layers.
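This rule of thumb can be sketched with a toy linear network and a squared-error loss (all sizes and values below are hypothetical, not the real model):

```python
import numpy as np

# Toy stand-ins for the transferred weights (hypothetical values).
W1 = np.array([[0.2, -0.1,  0.4],   # "pretrained" hidden layer
               [0.3,  0.5, -0.2],
               [0.1,  0.2,  0.3],
               [0.4, -0.3,  0.1]])
W2 = np.array([[0.01, 0.02],        # freshly initialized output layer
               [0.03, 0.01],
               [0.02, 0.02]])

def train_step(x, y, W1, W2, lr=0.1, retrain_all=False):
    """One gradient step on a squared-error loss.

    retrain_all=False: small dataset -> update only the new last layer.
    retrain_all=True:  enough data   -> update every layer.
    """
    h = x @ W1                       # hidden activations (linear, for brevity)
    err = h @ W2 - y                 # prediction error
    grad_W2 = h.T @ err
    if retrain_all:
        W1 = W1 - lr * (x.T @ (err @ W2.T))
    return W1, W2 - lr * grad_W2

x = np.ones((1, 4))
y = np.array([[1.0, 0.0]])
W1_small, W2_small = train_step(x, y, W1, W2)                  # last layer only
W1_full, W2_full = train_step(x, y, W1, W2, retrain_all=True)  # all layers
```

With `retrain_all=False`, the hidden layer stays frozen and only the new output layer moves; with `retrain_all=True`, gradients flow into every layer.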
What really happened in this example?
You have taken knowledge learned from dog-breed classification and transferred it to the x-ray scan dataset. Many low-level features, like curves and edges, are the same; the network learned about the structure and nature of images in general. So learning from a very large image dataset can help your learning algorithm do better at x-ray diagnosis.
Another example is speech recognition.
When does transfer learning make sense?
- When you have a lot more data for task A than for task B.
- Tasks A and B have the same input X (like images or audio).
- Low-level features from A could be helpful for B.
Some pre-trained models:
- Oxford VGG model
- Google Inception model
- Microsoft ResNet
- Google word2vec
- Stanford GloVe model
Transfer learning is a very powerful method: take a pre-trained model and apply it to a new task to solve it efficiently.
Click here to do a small exercise on transfer learning
If you have any doubts or feedback, let me know in the comments or ask in the forum. Don’t forget to subscribe to get updates on every post, and share it if you find it worthwhile.