In this article we will build a neural network in TensorFlow to classify handwritten digits. TensorFlow is an open-source library developed by the Google Brain team, and neural networks are the core algorithm behind deep learning.
If you are not familiar with neural networks and TensorFlow, I suggest you have a quick read of these articles and come back here: Neural Network Representation, Neural Network Learning, Intro to Tensorflow – 1.
For any machine learning problem, the most important part is the data. We need an informative dataset that is suitable for our problem.
We’re using the MNIST database of handwritten digits, which has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
I’ve pre-processed the dataset and the code is available here.
Now let’s load the data and start building our first neural network.
I’ll import some necessary libraries:
Creating the Model:
The first step in creating a model is to define the number of layers in the network and the number of units in each layer. We’re creating a 2-layer network (by the usual convention, the input layer is not counted), and it contains:
- Input layer
- Hidden layer
- Output layer
To define the number of units in the input layer, we need to check the shape of X (the input).
Note that we have 50,000 images and each image was originally 28×28 pixels, which we reshaped into a vector of 784 values.
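To make the reshaping step concrete, here is a minimal NumPy sketch. It uses a handful of random arrays as a stand-in for the real images, and the names `images` and `X_demo` are made up for this illustration; they are not the variables from the pre-processing code.

```python
import numpy as np

# Stand-in for a small batch of 28x28 grayscale images (random values, not MNIST).
rng = np.random.default_rng(0)
images = rng.random((5, 28, 28))

# Flatten each image into a single row of 28 * 28 = 784 values.
X_demo = images.reshape(images.shape[0], -1)

n_examples, n_features = X_demo.shape
print(X_demo.shape)  # (5, 784)
```

The second dimension of the flattened array is what fixes the number of input units: one unit per pixel.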
So the number of units in each layer is:
As you may remember, in TensorFlow we first create the computational graph and then run it. We need to define the inputs to the graph, which are not fixed but supplied at run time. For this, TensorFlow provides a very useful function called placeholder, so we’ll use it to create placeholders for our inputs.
Note: shape=(None, 784) means that we can feed any number of images at a time, each flattened into a 1-D array of 784 values.
Initialize Parameters:
Before we can train a neural network, we need to randomly initialize its parameters. We’ll initialize the weights with xavier_initializer and the bias units with zeros.
xavier_initializer returns an initializer that performs “Xavier” initialization for the weights. This function implements the weight initialization from Xavier Glorot and Yoshua Bengio’s 2010 paper.
One important thing to note here is the shape of the weights and biases.
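As a rough illustration of what Xavier (Glorot) uniform initialization does and how the shapes line up, here is a NumPy sketch. It is not the TensorFlow initializer itself. The helper `xavier_uniform` is written just for this example, and the hidden layer size of 64 is a hypothetical value picked for the sketch, not necessarily the one used in this article.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    """Glorot/Xavier uniform init: sample from U(-limit, limit),
    where limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

n_input = 784   # one unit per pixel
n_hidden = 64   # hypothetical size, chosen only for this sketch
n_output = 10   # one unit per digit class (0-9)

rng = np.random.default_rng(0)
W1 = xavier_uniform(n_input, n_hidden, rng)   # shape (784, 64)
b1 = np.zeros(n_hidden)                       # shape (64,)
W2 = xavier_uniform(n_hidden, n_output, rng)  # shape (64, 10)
b2 = np.zeros(n_output)                       # shape (10,)
```

With inputs of shape (batch, 784), each weight matrix has shape (units in previous layer, units in this layer), so that multiplying the input by W1 gives one row of hidden activations per image.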
Forward Propagation:
The next step is to define the operations for forward propagation in our neural network.
- tf.matmul(): computes the matrix product of its two arguments.
- tf.nn.relu(): applies the ReLU function to its input.
- tf.nn.sigmoid(): computes the sigmoid function of its input.
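The three operations above have direct NumPy counterparts, which makes it easy to see what a forward pass computes. The sketch below is an assumption about the layout (ReLU on the hidden layer, raw scores on the output layer), with a hypothetical hidden size of 64 and random weights; it is not the exact graph from this article.

```python
import numpy as np

def relu(z):
    # Element-wise max(z, 0), the same function tf.nn.relu applies.
    return np.maximum(z, 0.0)

def sigmoid(z):
    # Element-wise 1 / (1 + exp(-z)), the same function tf.nn.sigmoid applies.
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Two-layer forward pass; returns unnormalized class scores (logits)."""
    hidden = relu(X @ W1 + b1)   # X @ W1 plays the role of tf.matmul(X, W1)
    logits = hidden @ W2 + b2    # left unnormalized for a softmax loss
    return logits

rng = np.random.default_rng(0)
X_batch = rng.random((4, 784))                 # 4 fake flattened images
W1 = rng.normal(scale=0.05, size=(784, 64))    # hypothetical hidden size 64
b1 = np.zeros(64)
W2 = rng.normal(scale=0.05, size=(64, 10))
b2 = np.zeros(10)

logits = forward(X_batch, W1, b1, W2, b2)
print(logits.shape)  # (4, 10)
```

The output has one row per image and one column per digit class.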
Cost Function:
The next step is to compute the cost. But first we need to define our cost function; we’ll be using the softmax cross-entropy loss.
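For intuition, softmax cross-entropy can be written out in a few lines of NumPy. This is a sketch of the math, not the TensorFlow op; the function name `softmax_cross_entropy` and the toy inputs are made up for the example.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot_labels):
    """Mean cross-entropy between softmax(logits) and one-hot labels."""
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick out the log-probability of the true class and average over the batch.
    return -(one_hot_labels * log_probs).sum(axis=1).mean()

# Two examples with all-zero logits: the softmax is uniform over 10 classes,
# so the loss equals log(10) whatever the true labels are.
logits = np.zeros((2, 10))
labels = np.eye(10)[[3, 7]]
loss = softmax_cross_entropy(logits, labels)
print(np.isclose(loss, np.log(10)))  # True
```

A network that puts almost all of its probability on the correct class drives this value toward zero, which is what training tries to achieve.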
Back Propagation:
Back propagation is said to be one of the more complex parts of neural networks, and this is where deep learning frameworks come to the rescue. For backprop we need to calculate gradients/derivatives, and frameworks like TensorFlow do this for us; we just need to select the optimizer we want by calling its function.
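To give a feel for what the optimizer does behind the scenes, here is a hand-written sketch of gradient descent on a single softmax layer in NumPy. Plain gradient descent, the learning rate of 0.1, and the toy data are all assumptions made for this illustration; the optimizer used in this article may be a different one.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def loss_and_grads(X, Y, W, b):
    """Softmax cross-entropy loss and its gradients for one linear layer."""
    probs = softmax(X @ W + b)
    loss = -np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1))
    # Gradient of the mean loss with respect to the logits.
    d_logits = (probs - Y) / X.shape[0]
    return loss, X.T @ d_logits, d_logits.sum(axis=0)

rng = np.random.default_rng(0)
X_toy = rng.random((30, 20))                     # 30 toy examples, 20 features
Y_toy = np.eye(3)[rng.integers(0, 3, size=30)]   # random one-hot labels, 3 classes
W = np.zeros((20, 3))
b = np.zeros(3)

learning_rate = 0.1   # assumed value for this sketch
losses = []
for _ in range(50):
    loss, dW, db = loss_and_grads(X_toy, Y_toy, W, b)
    losses.append(loss)
    # The update an optimizer applies: step against the gradient.
    W -= learning_rate * dW
    b -= learning_rate * db

print(losses[-1] < losses[0])  # True
```

In TensorFlow none of the gradient code has to be written by hand: the framework derives it from the graph, and the optimizer applies the update.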
We’ve now completely defined our computation graph. To train the neural network, we need to initialize all the variables by calling global_variables_initializer().
Quick note: we need to convert y into a one-hot vector before running the model (this could also have been done with OneHotEncoder in the sklearn library).
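One simple way to do the one-hot conversion is to index into an identity matrix with NumPy. This is just one possible approach, and the helper name `one_hot` and the sample labels are made up for the example.

```python
import numpy as np

def one_hot(labels, n_classes=10):
    """Turn integer class labels into one-hot rows."""
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(n_classes)[labels]

y_demo = np.array([3, 0, 9])
y_one_hot = one_hot(y_demo)
print(y_one_hot.shape)  # (3, 10)
print(y_one_hot[0])     # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```

Each row has a single 1 in the column of its digit, which matches the 10 output units of the network.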
Running The Model
Now that we’ve successfully created our graph, we need to run the model on our dataset. We feed the data in batches of 500 images to speed up the training process. As you may remember, we have to create a TensorFlow session to compute anything. We can supply values for the placeholders using the feed_dict parameter. The following code runs a session to optimize the cost we defined earlier on our training data.
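The batching part of the training loop can be sketched on its own, independent of TensorFlow. The generator below is a hypothetical helper (`iterate_minibatches` is not a TensorFlow function), and it runs on small dummy arrays rather than the real data.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size):
    """Yield consecutive (X_batch, y_batch) slices of at most batch_size rows."""
    for start in range(0, X.shape[0], batch_size):
        end = start + batch_size
        yield X[start:end], y[start:end]

# Dummy data: 2,000 flattened images with one-hot labels.
X_demo = np.zeros((2000, 784))
y_demo = np.zeros((2000, 10))

batches = list(iterate_minibatches(X_demo, y_demo, batch_size=500))
print(len(batches))         # 4
print(batches[0][0].shape)  # (500, 784)
```

In the session loop, each such pair would be passed through feed_dict to the input and label placeholders. With 50,000 training images and a batch size of 500, one pass over the data takes 100 steps.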
It’s always good practice to keep track of your train and test loss.
Our neural network achieves over 96% accuracy on both the train and test data, which is awesome for our first neural network with TensorFlow.
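For reference, accuracy is simply the fraction of images whose highest-scoring class matches the true label. A small NumPy sketch with made-up scores for a 3-class toy problem (the function name `accuracy` is chosen for this example):

```python
import numpy as np

def accuracy(logits, one_hot_labels):
    """Fraction of rows where the highest-scoring class is the true class."""
    predicted = logits.argmax(axis=1)
    actual = one_hot_labels.argmax(axis=1)
    return float((predicted == actual).mean())

# Made-up scores for 4 examples; the third one is misclassified.
logits = np.array([
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.1],
    [0.9, 0.8, 0.1],
    [0.1, 0.2, 3.0],
])
labels = np.eye(3)[[0, 1, 1, 2]]

acc = accuracy(logits, labels)
print(acc)  # 0.75
```

The same computation applied to the network outputs on the train and test sets gives the two accuracy figures.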
If you have any thoughts or suggestions, you can share them in the comments or contact me directly through the contact form. Please support us by sharing this article and giving your feedback.
Also, if you want me to write about a specific topic related to AI, tell me in the comments section and I’ll look into it for my upcoming articles.