Difference Between Batch, Mini-Batch, and Stochastic Gradient Descent

Gradient Descent is one of the key algorithms used in Machine Learning. While training a machine learning model, we need an algorithm to minimize the value of the loss function.

Gradient Descent is an optimization algorithm that is used to minimize the loss.

There are mainly three types of Gradient Descent algorithm:

1. Batch Gradient Descent

Batch Gradient Descent uses the entire dataset to update the model weights. It calculates the loss for each data point in the training dataset, but updates the weights only after all training data points have been evaluated. For example, if we have one thousand data points, the model's weights are updated once per pass over all one thousand points.
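The idea above can be sketched in Python for simple linear regression with a single weight. The function name, learning rate, and toy data are illustrative assumptions, not from the article; the key point is that the gradient is computed over the entire dataset before each weight update.

```python
import numpy as np

def batch_gradient_descent(x, y, lr=0.01, epochs=100):
    """Fit y ≈ w * x with one weight update per full pass over the data."""
    w = 0.0
    n = len(x)
    for _ in range(epochs):
        # Gradient of mean squared error over the ENTIRE dataset
        grad = (2.0 / n) * np.sum((w * x - y) * x)
        w -= lr * grad  # single update after evaluating all n points
    return w

# Toy data with the true relationship y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x
w = batch_gradient_descent(x, y)
```

With one thousand data points, this loop would still perform only one weight update per epoch, which makes each step accurate but expensive for large datasets.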

2. Mini-Batch Gradient Descent

Mini-Batch Gradient Descent splits the training dataset into small batches. Each batch is used to calculate the model loss and update the model coefficients. For example, if we have one thousand data points, we can create 10 batches of 100 data points each, giving 10 weight updates per pass over the data.
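A minimal sketch of the mini-batch variant, under the same illustrative setup as above (single weight, squared-error loss, made-up learning rate and batch size). The dataset is shuffled each epoch and the weight is updated once per batch rather than once per epoch.

```python
import numpy as np

def minibatch_gradient_descent(x, y, lr=0.01, epochs=100, batch_size=2):
    """Fit y ≈ w * x with one weight update per mini-batch."""
    w = 0.0
    n = len(x)
    for _ in range(epochs):
        idx = np.random.permutation(n)  # shuffle indices each epoch
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            xb, yb = x[b], y[b]
            # Gradient of mean squared error over this batch only
            grad = (2.0 / len(b)) * np.sum((w * xb - yb) * xb)
            w -= lr * grad  # one update per batch
    return w

np.random.seed(0)  # fixed seed so the toy run is reproducible
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x
w = minibatch_gradient_descent(x, y)
```

Mini-batch gradient descent is a compromise: each update is cheaper than a full-batch update but less noisy than a single-point update, which is why it is the default in most deep learning frameworks.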

3. Stochastic Gradient Descent

Stochastic Gradient Descent calculates the loss and updates the model for each individual data point in the training dataset. Because it uses a single data point at a time, the number of data points does not change how the update works: the weights are updated once per point.
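The stochastic variant can be sketched the same way, again with illustrative names and hyperparameters. The only structural change from the batch version is that the gradient is computed from one point at a time, so the inner loop performs one update per data point.

```python
import numpy as np

def stochastic_gradient_descent(x, y, lr=0.01, epochs=100):
    """Fit y ≈ w * x with one weight update per individual data point."""
    w = 0.0
    n = len(x)
    for _ in range(epochs):
        for i in np.random.permutation(n):  # visit points in random order
            # Gradient of squared error for a SINGLE data point
            grad = 2.0 * (w * x[i] - y[i]) * x[i]
            w -= lr * grad  # update immediately after each point
    return w

np.random.seed(0)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x
w = stochastic_gradient_descent(x, y)
```

The frequent, noisy updates let SGD start making progress before seeing the whole dataset, at the cost of a more erratic path toward the minimum.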

If you want to understand the fundamentals of the Gradient Descent algorithm, you may like to watch the video below.

It covers an implementation of Gradient Descent from scratch in Python.