Machine learning algorithms often involve optimizing a model to fit the given data. One such optimization technique is gradient descent, a fundamental algorithm used to minimize a cost function. In this tutorial, we’ll explore the implementation of batch gradient descent for a simple linear regression problem using Python and NumPy.
Introduction
Linear regression is a common approach in machine learning for modeling the relationship between a dependent variable and one or more independent variables. The goal is to find the best-fit line that minimizes the difference between predicted and actual values. Gradient descent is an iterative optimization algorithm used to minimize a cost function, which measures the difference between predicted and actual values.
Prerequisites
Before we dive into the implementation, make sure you have the following installed:
- Python
- NumPy
You can install NumPy using the following command:
pip install numpy
Implementing Batch Gradient Descent
Let’s start by implementing the batch gradient descent algorithm. Open your favorite Python environment and follow along.
import numpy as np def gradient_descent(X, y, learning_rate, num_iterations): # Initialize model parameters (weights and bias) with random values np.random.seed(0) num_features = X.shape[1] weights = np.random.rand(num_features, 1) bias = np.random.rand(1) # History to store cost values over iterations cost_history = [] # Batch Gradient Descent for iteration in range(num_iterations): predictions = np.dot(X, weights) + bias mse_cost = np.mean((predictions - y)**2) cost_history.append(mse_cost) gradient_weights = (2 / len(y)) * np.dot(X.T, (predictions - y)) gradient_bias = (2 / len(y)) * np.sum(predictions - y) weights -= learning_rate * gradient_weights bias -= learning_rate * gradient_bias # Print the current cost every 100 iterations if iteration % 100 == 0: print(f"Iteration {iteration}, Cost: {mse_cost}") return weights, bias, cost_history # Example usage: # Assume X is a numpy array of input features and y is a numpy array of target values # X = ... # y = ... # Set learning_rate and num_iterations according to your requirements learning_rate = 0.01 num_iterations = 1000 # Call the gradient_descent function optimized_weights, optimized_bias, cost_history = gradient_descent(X, y, learning_rate, num_iterations) # Print the final model parameters print("Optimized Weights:", optimized_weights) print("Optimized Bias:", optimized_bias)
Understanding the Code
- Initialization: We initialize the model parameters (weights and bias) with random values.
- Forward Pass: We compute the predictions using the current model parameters.
- Cost Computation: We calculate the mean squared error (MSE) as the cost function.
- Backward Pass: We compute the gradients of the cost with respect to the model parameters.
- Parameter Update: We update the model parameters using the gradients and the learning rate.
- Monitoring Progress: We print the current cost every 100 iterations to monitor the optimization progress.
Conclusion
In this tutorial, we implemented batch gradient descent for a simple linear regression problem. This is a basic example, and in real-world scenarios, you might use more advanced techniques and machine learning libraries. Experiment with different learning rates and numbers of iterations to observe their effects on convergence.
Remember, understanding the fundamentals of gradient descent is crucial for delving into more complex machine learning algorithms. Feel free to adapt and extend this code for more sophisticated regression problems and datasets.
Happy coding!