Different Types of Gradient Descent
The three main types of gradient descent differ in how many training examples are used for each parameter update.
Batch Gradient Descent
Batch gradient descent processes all of the training examples in every iteration, so each parameter update uses the gradient computed over the entire training set.
When the number of training examples is large, each iteration becomes computationally very expensive, so batch gradient descent is usually not preferred in that setting.
Instead, stochastic gradient descent or mini-batch gradient descent is typically used.
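As a rough illustration of the full-batch update, here is a minimal sketch in plain Python. It assumes a one-feature linear model y ≈ w*x + b trained with mean squared error. The function name, learning rate, epoch count, and toy data are all hypothetical choices made for this example, not something prescribed by the text above.

```python
def batch_gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w*x + b by minimizing mean squared error with full-batch updates."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Accumulate the gradient over ALL training examples.
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y
            grad_w += 2.0 * error * x
            grad_b += 2.0 * error
        # Exactly one parameter update per full pass over the data.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Toy data (assumed for illustration), generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
batch_w, batch_b = batch_gradient_descent(xs, ys)
```

Note that the inner loop touches every example before a single update is made, which is where the cost comes from when the training set is large.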
Stochastic Gradient Descent
Stochastic gradient descent processes a single training example per iteration.
Hence, the parameters are updated after every iteration, that is, after each individual example has been processed.
Each update is therefore much cheaper than a Batch Gradient Descent update. However, when the number of training examples is large, processing only one example at a time means the number of iterations is also large, which can add overhead for the system.
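The per-example update can be sketched as follows, again in plain Python and again assuming a one-feature linear model y ≈ w*x + b with squared error. The function name, the shuffling of examples each epoch, the seed, and the hyperparameters are assumptions made for this illustration.

```python
import random


def stochastic_gradient_descent(xs, ys, lr=0.01, epochs=2000, seed=0):
    """Fit y ~ w*x + b, updating the parameters after every single example."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of the squared error for this one example only,
            # followed immediately by a parameter update.
            w -= lr * 2.0 * error * xs[i]
            b -= lr * 2.0 * error
    return w, b


# Toy data (assumed for illustration), generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
sgd_w, sgd_b = stochastic_gradient_descent(xs, ys)
```

With five examples and 2000 passes, this sketch performs 10,000 small updates, compared with 2000 updates for a full-batch loop over the same number of passes.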
Mini-Batch Gradient Descent
Mini-batch gradient descent is a compromise between Batch and Stochastic Gradient Descent.
The training set is divided into multiple groups called batches.
Each batch contains a number of training samples.
One batch at a time is passed through the network, which computes the loss for every sample in the batch and uses the gradient of their average loss to update the parameters of the neural network.
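The batching procedure described above can be sketched in plain Python. To keep the example short, it assumes a one-feature linear model y ≈ w*x + b in place of a neural network. The function name, batch size, seed, and hyperparameters are hypothetical choices for this illustration.

```python
import random


def mini_batch_gradient_descent(xs, ys, batch_size=2, lr=0.02, epochs=2000, seed=0):
    """Fit y ~ w*x + b, updating once per mini-batch using the averaged gradient."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        # Split the shuffled training set into consecutive batches;
        # the last batch may be smaller than batch_size.
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2.0 * error * xs[i]
                grad_b += 2.0 * error
            # Average the per-sample gradients, then update once per batch.
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


# Toy data (assumed for illustration), generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
mb_w, mb_b = mini_batch_gradient_descent(xs, ys)
```

In this sketch, setting batch_size to 1 gives one update per example, as in stochastic gradient descent, while setting it to len(xs) gives one update per pass, as in batch gradient descent.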