Transfer learning takes the knowledge gained while solving one problem and applies it to a different but related problem.

For example, knowledge gained while learning to recognize cars can, to some extent, be used to recognize trucks.

Pre-Training

When we train a network from scratch on a large dataset (for example, ImageNet), we train all the parameters of the neural network until the model is learned. This may take many hours on your GPU.
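As a concrete illustration, here is a minimal sketch (assuming PyTorch and torchvision are available; ResNet-18 is just one example architecture) that loads a network whose parameters were already pre-trained on ImageNet, so we do not have to repeat that expensive training ourselves.

```python
# Minimal sketch: load an ImageNet pre-trained network instead of
# training it from scratch (PyTorch / torchvision assumed).
import torchvision.models as models

# Downloads the ImageNet pre-trained weights the first time it is called
# (the `weights=` argument is the API used by recent torchvision versions).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
```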

Fine Tuning

We can use a new dataset to fine-tune the pre-trained CNN. Consider the case where the new dataset is similar to the original dataset used for pre-training. Since the new dataset is similar, the same pre-trained weights can be used for extracting features from the new dataset.

If the new dataset is very small, it is better to train only the final layers of the network to avoid overfitting, keeping all the other layers fixed. So remove the final layers of the pre-trained network, add new layers, and retrain only the new layers, as in the sketch below.
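A minimal sketch of this case, assuming PyTorch/torchvision and an ImageNet pre-trained ResNet-18 (num_classes is a placeholder for the number of classes in the new dataset): all pre-trained layers are frozen, the final layer is replaced, and only the new layer is trained.

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze every pre-trained parameter so it is not updated during training.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a new one for the new classes
# (num_classes is a placeholder for the new dataset's label count).
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new layer's parameters are given to the optimizer.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
```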

If the new dataset is very large, retrain the whole network, initializing the weights from the pre-trained model.
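In that case the setup differs only in what is trainable: every parameter stays trainable and the optimizer updates the whole network, typically with a smaller learning rate so the pre-trained weights are not destroyed early in training (again a sketch, with the same assumptions as above).

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# New classification head for the new dataset's classes (placeholder count).
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# All parameters remain trainable; a small learning rate preserves the
# pre-trained weights while the whole network is retrained.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```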

How to fine-tune if the new dataset is very different from the original dataset?

The earlier layers of a ConvNet contain more generic features (e.g. edge detectors or color blob detectors), while the later layers become progressively more specific to the details of the classes contained in the original dataset.

The earlier layers can still help to extract features from the new data. So if you have only a small amount of data, it is better to freeze the earlier layers and retrain the remaining layers, as sketched below.
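A sketch of this case (same assumptions as before; exactly which layers to freeze is a design choice, and the split used here is only illustrative): the stem and the first two residual stages of the ResNet are frozen, while the later stages and the new classifier are retrained.

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the generic early layers; leave layer3, layer4 and the head trainable.
for name, param in model.named_parameters():
    if name.startswith(("conv1", "bn1", "layer1", "layer2")):
        param.requires_grad = False

# New classification head for the new dataset's classes (placeholder count).
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Optimize only the parameters that are still trainable.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)
```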

If you have a large amount of data, you can retrain the whole network with weights initialized from the pre-trained network, just as in the large-dataset case above.