Have you ever wondered how technology is changing, and how much of that change is driven by artificial intelligence? Nowadays there are machines that outperform humans at tasks like image recognition. Much of this is possible because of a technique called neural networks. In this blog I will explain the basic concept of neurons and neural networks, and the math behind them.
A neural network is an old idea that fell out of favor for a while, but today it is the state-of-the-art technique for many different machine learning problems. The original motivation behind neural networks was to simulate the neurons of the brain: scientists were trying to create an algorithm that could mimic how our brain works. Neural networks were used widely in the 80s and 90s, but their popularity diminished in the late 90s due to their computational cost and the lack of hardware resources. There has been a recent resurgence, however, since the computational cost is now affordable thanks to GPUs.
What is a NEURON?
A neuron is simply a computational unit in the brain. Neurons communicate with one another through little pulses of electricity, also called spikes. A neuron receives its input pulses through its dendrites, does some computation, and sends its output via the axon to other neurons. This is how our senses and muscles work.
ARTIFICIAL NEURAL NETWORKS (ANN):
Artificial neural networks are computational systems inspired by our brain. An ANN is basically a collection of connected units or nodes called artificial neurons. A connection between artificial neurons can transmit a signal from one to another. The artificial neuron that receives the signal can process it and then signal the artificial neurons connected to it. The signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is calculated by a non-linear function of its inputs. Artificial neurons and connections typically have a weight (Θ) that adjusts as learning proceeds.
An ANN can have many layers, depending on its structure.
Now let’s understand the Maths behind this revolutionary technique.
In our model, the dendrites are like the input features x1⋯xn, and the output is the result of our hypothesis function. In this model the x0 input node is sometimes called the “bias unit”; it is always equal to 1. In neural networks, we use the same logistic (sigmoid) function as in classification, 1/(1+e^(−θᵀx)), though here we call it the sigmoid activation function. In this setting, our “theta” parameters are often called “weights”.
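To make the activation function concrete, here is a minimal NumPy sketch of the sigmoid (the function name and the test values are just for illustration):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))                         # 0.5 -- the decision boundary
print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # values near 0, exactly 0.5, near 1
```

Note that the same function works element-wise on a whole vector of inputs, which is exactly what a layer of neurons needs.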
We can have intermediate layers of nodes between the input and output layers called the “hidden layers.”
In this example, we label these intermediate or “hidden” layer nodes a0(2)⋯an(2) and call them “activation units.”
If there is only one hidden layer, the values of the activation nodes will be:

a1(2) = g(Θ10(1)x0 + Θ11(1)x1 + Θ12(1)x2 + Θ13(1)x3)
a2(2) = g(Θ20(1)x0 + Θ21(1)x1 + Θ22(1)x2 + Θ23(1)x3)
a3(2) = g(Θ30(1)x0 + Θ31(1)x1 + Θ32(1)x2 + Θ33(1)x3)
hΘ(x) = g(Θ10(2)a0(2) + Θ11(2)a1(2) + Θ12(2)a2(2) + Θ13(2)a3(2))

where g is the sigmoid activation function.
We compute our activation nodes using a 3×4 matrix of parameters Θ(1). We apply each row of the parameters to our inputs to obtain the value of one activation node. Our hypothesis output is the logistic function applied to the sum of the values of our activation nodes, which have been multiplied by yet another parameter matrix Θ(2) containing the weights for our second layer of nodes.
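As a sketch, forward propagation through this one-hidden-layer network takes only a few lines of NumPy. The weight values and the input vector below are made-up illustrations, not learned parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
Theta1 = rng.standard_normal((3, 4))  # Θ(1): 3 hidden units x (3 inputs + bias)
Theta2 = rng.standard_normal((1, 4))  # Θ(2): 1 output unit x (3 hidden units + bias)

x = np.array([1.0, 0.5, -1.2, 0.7])   # x0 = 1 is the bias unit
a2 = sigmoid(Theta1 @ x)              # hidden-layer activation units
a2 = np.concatenate(([1.0], a2))      # prepend the bias unit a0(2) = 1
h = sigmoid(Theta2 @ a2)              # hypothesis output h(x), a value in (0, 1)
print(h)
```

Each row of Θ(1) produces one activation unit, and the bias unit is prepended before the second matrix multiplication, mirroring the equations above.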
Each layer gets its own matrix of weights, Θ(j).
The dimensions of these weight matrices are determined as follows:
If the network has sj units in layer j and s(j+1) units in layer j+1, then Θ(j) will be of dimension s(j+1) × (sj + 1); the extra +1 in the column count accounts for the bias unit.
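As a quick check of this dimension rule, here it is applied to the example network above (3 inputs, 3 hidden units, 1 output):

```python
# Dimension rule: Θ(j) has shape (s_{j+1}, s_j + 1); the +1 is the bias column.
layer_sizes = [3, 3, 1]  # s1 = 3 inputs, s2 = 3 hidden units, s3 = 1 output
theta_shapes = [
    (layer_sizes[j + 1], layer_sizes[j] + 1)
    for j in range(len(layer_sizes) - 1)
]
print(theta_shapes)  # [(3, 4), (1, 4)] -- matching the 3x4 matrix Θ(1) above
```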
The output unit applies the activation function to this weighted sum and produces the output. For a classification problem (binary in this case), if h(x) >= 0.5 the predicted output is 1, and if h(x) < 0.5 the predicted output is 0.
In the next blog I’ll explain how a neural network learns and updates its weights, and then next week I’ll post a real-world problem solved using an ANN, so please follow the blog to learn more. Since this is my first blog, do comment what you think about it; I’d love to hear your feedback. Share it if you like it.