What is Transfer Learning

Transfer learning or inductive transfer is a research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize cars could apply when trying to recognize trucks. This area of research bears some relation to the long history of psychological literature on transfer of learning, although formal ties between the two fields are limited. (Source: Wikipedia)

Pre-trained models are a blessing. They can make your work a lot easier and save you a lot of time.
There are many ways to use pre-trained models, the choice of which generally depends on the size of the data set and the extent of computational resources available. These include:

Fine tuning

In this scenario, the final classifier layer of a network is swapped out and replaced with a softmax layer of the right size for the current data set, while the learned parameters of all other layers are kept. This new structure is then trained further on the new task.
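As a rough illustration of the fine-tuning scenario, here is a minimal PyTorch sketch. The small `pretrained` network, the class counts, and the random data are hypothetical stand-ins; in practice you would load a real pre-trained model with published weights instead.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a pre-trained network: feature layers followed
# by a classifier for the ORIGINAL task (here 1000 classes).
pretrained = nn.Sequential(
    nn.Linear(32, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 1000),  # original classifier layer
)

NUM_NEW_CLASSES = 5  # assumed number of classes in the new data set

# Swap out the final classifier; all other layers keep their parameters.
pretrained[-1] = nn.Linear(64, NUM_NEW_CLASSES)
model = pretrained

# Fine-tuning: every parameter (old and new) is updated on the new task.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()  # applies the softmax internally

# One training step on random placeholder data.
inputs = torch.randn(8, 32)
labels = torch.randint(0, NUM_NEW_CLASSES, (8,))
optimizer.zero_grad()
loss = loss_fn(model(inputs), labels)
loss.backward()
optimizer.step()
```

Note that the softmax is folded into `nn.CrossEntropyLoss`, so the new head outputs raw scores (logits) of the right size.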

Freezing

The fine-tuning approach requires relatively large computational power and larger amounts of data. For smaller data sets, it is common to “freeze” the first few layers of the network, meaning the parameters of the pre-trained network are not modified in these layers. The other layers are trained on the new task as before.
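Freezing could be sketched in PyTorch roughly as follows. Again, the small network is a hypothetical stand-in for a real pre-trained model, and the choice of which layer to freeze is only an assumption for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a pre-trained network with a new 5-class head.
model = nn.Sequential(
    nn.Linear(32, 64),  # early layer (to be frozen)
    nn.ReLU(),
    nn.Linear(64, 64),  # later layer (stays trainable)
    nn.ReLU(),
    nn.Linear(64, 5),   # new classifier layer
)

# Freeze the first layer: its parameters receive no gradient updates.
for param in model[0].parameters():
    param.requires_grad = False

# Hand only the still-trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-2)

frozen_before = model[0].weight.clone()

# One training step on random placeholder data.
inputs = torch.randn(8, 32)
labels = torch.randint(0, 5, (8,))
loss = nn.CrossEntropyLoss()(model(inputs), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Check whether the frozen layer changed during the update.
unchanged = torch.equal(frozen_before, model[0].weight)
```

Because the frozen parameters need no gradients, the backward pass does less work, which is part of why this approach is cheaper than full fine-tuning.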

Feature extraction

This method is the loosest usage of pre-trained networks. Images are fed forward through the network, and the output of a specific layer (often the layer just before the final classifier output) is used as a representation. Absolutely no training is performed with respect to the new task. This image-to-vector mechanism produces an output that may be used in virtually any downstream task.
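A minimal PyTorch sketch of this image-to-vector idea is shown here. The tiny network and the 16x16 random "images" are hypothetical placeholders for a real pre-trained model and real data.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pre-trained image classifier.
pretrained = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 16 * 16, 128),
    nn.ReLU(),
    nn.Linear(128, 64),   # layer just before the classifier
    nn.ReLU(),
    nn.Linear(64, 1000),  # final classifier output
)

# Keep everything except the final classifier layer.
feature_extractor = nn.Sequential(*list(pretrained.children())[:-1])
feature_extractor.eval()  # inference mode; no training takes place

images = torch.randn(4, 3, 16, 16)  # placeholder batch of 4 images

with torch.no_grad():  # no gradients are needed
    features = feature_extractor(images)

print(features.shape)  # one 64-dimensional vector per image
```

The resulting vectors can then be passed to any downstream model, for example a simple classifier or a nearest-neighbor search.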
Below is a list of links to the pre-trained models available for each framework.

TensorFlow

TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library and is also used for machine learning applications such as neural networks. It is used for both research and production at Google. TensorFlow was developed by the Google Brain team for internal Google use. It was released under the Apache 2.0 open source license on November 9, 2015. The TensorFlow official models are a collection of example models that use TensorFlow’s high-level APIs. They are intended to be well-maintained, tested, and kept up to date with the latest TensorFlow API. They should also be reasonably optimized for fast performance while still being easy to read.

Some available pre-trained models:

https://github.com/tensorflow/models/tree/master/official

PyTorch

PyTorch is an open-source machine learning library for Python, based on Torch. It is an optimized tensor library for deep learning on GPUs and CPUs and is used for applications such as natural language processing. It is primarily developed by Facebook’s artificial-intelligence research group, and Uber’s “Pyro” software for probabilistic programming is built on it.

Some available pre-trained models:

http://pytorch.org/docs/master/torchvision/models.html#id1

Keras

Keras is an open source neural network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, Theano, or MXNet. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible. It was developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System), and its primary author and maintainer is François Chollet, a Google engineer. Keras Applications are deep learning models that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.

Some available pre-trained models:

https://keras.io/applications/
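The following sketch shows one way a Keras Applications model could be used for feature extraction. It assumes Keras is used through TensorFlow (`tf.keras`), picks MobileNetV2 and a 96x96 input size purely for illustration, and sets `weights=None` so that no weights are downloaded.

```python
import numpy as np
import tensorflow as tf

# weights=None builds the architecture with random weights (no download);
# weights="imagenet" would fetch the pre-trained ImageNet weights instead.
# include_top=False drops the classifier, and pooling="avg" turns the last
# feature map into one vector per image.
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3),
    include_top=False,
    weights=None,
    pooling="avg",
)

images = np.zeros((2, 96, 96, 3), dtype="float32")  # placeholder batch
features = base.predict(images, verbose=0)
print(features.shape)  # one feature vector per image
```

With real pre-trained weights, inputs should first be passed through the matching `preprocess_input` function of the chosen model.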

CNTK

Microsoft Cognitive Toolkit, previously known as CNTK and sometimes styled as The Microsoft Cognitive Toolkit, is a deep learning framework developed by Microsoft Research. Microsoft Cognitive Toolkit describes neural networks as a series of computational steps via a directed graph.

Some available pre-trained models:

https://www.microsoft.com/en-us/cognitive-toolkit/features/model-gallery/?filter=Recipe

Last but not least:

Why PyTorch is becoming increasingly popular

There are many frameworks that can be used for deep learning, such as Keras, TensorFlow, Theano, Torch, and Deeplearning4j. Out of all these, my favorite is Keras on top of TensorFlow. Keras works great for many mature architectures such as CNNs, feed-forward neural networks, and LSTMs for time series, but it becomes a bit tricky when you try to implement new, complex architectures. Because Keras is built around high-level, modular building blocks, it offers less flexibility for models that do not fit those blocks. PyTorch, a newer entrant, provides tools to build deep learning models in an object-oriented fashion, which gives a lot of flexibility. Many of the more difficult architectures have recently been implemented in PyTorch. If you want to implement your own layers and do prototyping and research, go with PyTorch.
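To give a feel for the object-oriented style mentioned above, here is a small sketch of a custom layer written as an `nn.Module` subclass. The `ScaledResidual` layer is a made-up example for illustration, not a layer from PyTorch or from any particular paper.

```python
import torch
import torch.nn as nn


class ScaledResidual(nn.Module):
    """Hypothetical custom layer: y = x + scale * relu(W x + b)."""

    def __init__(self, dim):
        super().__init__()
        self.inner = nn.Linear(dim, dim)
        # Learned scalar that starts at zero, so the layer begins as identity.
        self.scale = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # Ordinary Python code and tensor operations can be used here.
        return x + self.scale * torch.relu(self.inner(x))


layer = ScaledResidual(dim=16)
x = torch.randn(4, 16)
y = layer(x)
```

Parameters registered in `__init__` are picked up automatically by `layer.parameters()`, so the custom layer can be trained like any built-in one.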

fast.ai has switched from Keras + TensorFlow to PyTorch. Here is why:

http://www.fast.ai/2017/09/08/introducing-pytorch-for-fastai/

We will see what the future will bring in terms of Transfer Learning.

Your AISOMA Team


Transfer Learning – Some available Pre-trained models for TensorFlow, PyTorch, Keras and CNTK
