Autograd Engine and Backpropagation

Introduction

Backpropagation is the single most important technique behind training every neural network you have seen so far, including recent models like ChatGPT. When I first learned it during my master's and understood what it does and how it does it, I was amazed. Before I understood backprop, a neural network was a black box to me; I could not tell what it was actually doing. The idea is simple: you have some equations, you feed an input through them, and you get an output. That output is then fed into another equation called the loss function, and you want to either increase or decrease that loss. Backpropagation tells you how to nudge each parameter to do exactly that.

Algorithm for Neural Networks

Info

Here we want to map an input \(x\) to an output \(y\) using some function.

\(x\) is a single training sample

\(x \in \R^{nx}\), where \(nx\) is the feature dimension

\(y\) is the output

\(m\) is the total number of training examples

\(X \in \R^{(nx, m)}\) is a matrix containing all training examples, where \(nx\) is the feature dimension and \(m\) is the number of training examples. Each example is one column, so the first training example has dimension \(X^{(0)} \in \R^{(nx, 1)}\).

\(Y \in \R^{(1,m)}\) is also a matrix, but since we expect only one output per training example it is a row vector: 1 row (one output per sample) and \(m\) columns (one per training sample).
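To make these shape conventions concrete, here is a small NumPy sketch with illustrative sizes (the values of \(nx\) and \(m\) below are just examples, not anything from a real dataset):

```python
import numpy as np

# Illustrative sizes: nx = 3 features, m = 4 training examples.
nx, m = 3, 4

X = np.random.randn(nx, m)  # each column is one training example
Y = np.random.randn(1, m)   # one output per example, so a row vector

# Slicing with 0:1 keeps the column axis, giving the (nx, 1) shape
# described above for a single training example.
first_example = X[:, 0:1]
```

Note that `X[:, 0:1]` (a slice) keeps the shape `(nx, 1)`, while `X[:, 0]` would collapse it to a 1-D array of shape `(nx,)`.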

$$Z = WX + b$$
$$A = g(Z)$$

\(g(\cdot)\) can be any activation function

  • relu: \(\max(0, x)\)
  • sigmoid: \(\frac{1}{1 + e^{-x}}\)
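The forward pass \(Z = WX + b\), \(A = g(Z)\) can be sketched directly in NumPy. The layer size `h` and the random values below are purely illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

nx, m, h = 3, 4, 5           # illustrative: 3 features, 4 examples, 5 units
X = np.random.randn(nx, m)
W = np.random.randn(h, nx)   # one row of weights per output unit
b = np.zeros((h, 1))         # broadcasts across the m columns

Z = W @ X + b                # shape (h, m)
A = sigmoid(Z)               # elementwise activation, still (h, m)
```

Because `b` has shape `(h, 1)`, NumPy broadcasting adds the same bias vector to every one of the \(m\) columns, which is exactly what the vectorized equation intends.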

Operations that we need to implement

  • add
  • subtract
  • multiplication
  • divide
  • power
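The operations above can be implemented on a scalar `Value` class that records each operation so gradients can flow back through it. This is a minimal micrograd-style sketch, not a full engine; subtract and divide are derived from the other ops:

```python
class Value:
    """A scalar that remembers the ops applied to it, so calling
    backward() can propagate gradients to every input."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad          # d(a+b)/da = 1
            other.grad += out.grad         # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = _backward
        return out

    def __pow__(self, k):                  # k is a plain number
        out = Value(self.data ** k, (self,))
        def _backward():
            self.grad += k * self.data ** (k - 1) * out.grad
        out._backward = _backward
        return out

    # subtract and divide reuse the ops above
    def __neg__(self):             return self * -1
    def __sub__(self, other):      return self + (-other)
    def __truediv__(self, other):  return self * other ** -1

    def backward(self):
        # Topological order guarantees a node's grad is complete
        # before it is pushed to that node's children.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()
```

For example, with `a = Value(2.0)` and `b = Value(3.0)`, calling `(a * b + a).backward()` gives `a.grad == 4.0` (since \(\partial(ab+a)/\partial a = b + 1\)) and `b.grad == 2.0`. Note that gradients accumulate with `+=` so a value used twice in the graph (like `a` here) collects contributions from both paths.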

Activation functions

  • relu
  • sigmoid
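For the backward pass, each activation needs its derivative alongside the forward function. A scalar sketch (the function names here are just illustrative; in the engine these would become methods on `Value`):

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # gradient passes through only where the input was positive
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # the derivative has the convenient closed form s * (1 - s)
    s = sigmoid(x)
    return s * (1.0 - s)
```

The `s * (1 - s)` form is why sigmoid is cheap to differentiate: the backward pass can reuse the activation already computed in the forward pass.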

Loss function

  • binary cross entropy
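Binary cross entropy for a single example, \(L = -(y \log a + (1-y)\log(1-a))\), can be sketched as follows; the `eps` clamp is a common numerical guard (an implementation detail I am assuming, not something from the text) so the logs stay finite when the prediction hits exactly 0 or 1:

```python
import math

def binary_cross_entropy(y, a, eps=1e-12):
    """Loss for one example: label y in {0, 1}, prediction a in (0, 1)."""
    a = min(max(a, eps), 1.0 - eps)  # clamp a away from 0 and 1
    return -(y * math.log(a) + (1.0 - y) * math.log(1.0 - a))
```

The loss approaches 0 as the prediction approaches the true label, and grows without bound as the prediction moves toward the wrong label, which is what makes it a good training signal for sigmoid outputs.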