Neural Nets
1. Initialize The Network
First we define the network architecture: how many layers we want and how many neurons each layer has. The list layer_dims holds these sizes, with layer_dims[0] being the input dimension:
import numpy as np

def initialize_parameters(layer_dims):
    # layer_dims[0] is the input size; layer_dims[i] is the number of
    # neurons in layer i.
    L = len(layer_dims)
    parameters = {}
    # np.random.seed(1)  # uncomment for reproducible runs
    for i in range(1, L):
        # Scale by 1/sqrt(fan_in) so activations neither explode nor vanish.
        W = np.random.randn(layer_dims[i], layer_dims[i-1]) / np.sqrt(layer_dims[i-1])
        b = np.zeros((layer_dims[i], 1))
        parameters["W" + str(i)] = W
        parameters["b" + str(i)] = b
    return parameters
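For example, a network with 3 inputs, one hidden layer of 5 neurons, and 1 output (the sizes here are just an illustration):

params = initialize_parameters([3, 5, 1])
print(params["W1"].shape)  # (5, 3)
print(params["b1"].shape)  # (5, 1)
print(params["W2"].shape)  # (1, 5)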
Learnings
1. Thanks to automatic gradient calculation engines
While implementing this network I ran into a strange behaviour: the loss was going up instead of down over time. The code ran fine and the forward pass was correct, yet after a week of debugging I still had not found the bug. At last I gave the code to GPT, and it spotted the bug in the backward pass of the ReLU activation function.
For ReLU the local gradient is computed as follows: if the input to the ReLU is > 0, the gradient is 1, otherwise 0. My mistake was that instead of applying this test to the ReLU's input, I applied it to the gradient coming from the next layer.
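Here is a minimal sketch of that bug (the names relu_backward, dA and Z are mine, not from my original code):

import numpy as np

def relu_backward(dA, Z):
    # Correct: zero out gradients where the pre-activation input Z was <= 0.
    return dA * (Z > 0)

def relu_backward_buggy(dA, Z):
    # My bug: the mask comes from the incoming gradient dA instead of Z,
    # so the wrong units get zeroed and training diverges.
    return dA * (dA > 0)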
Even small, simple mistakes can make a neural network behave the wrong way. That is why forward propagation is easy while backpropagation is hard and time-consuming. But nowadays we rarely write the backward pass by hand, thanks to the autograd engines of various libraries.
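For instance, with PyTorch (just one example of such a library) the ReLU gradient above comes for free:

import torch

z = torch.tensor([-1.0, 2.0, 3.0], requires_grad=True)
a = torch.relu(z)
a.sum().backward()   # autograd derives the backward pass for us
print(z.grad)        # tensor([0., 1., 1.]) -- 1 where z > 0, else 0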