Understanding Ridge Regression: Managing Overfitting in ML

For Linear Regression in Machine learning with two variables we have to find 2 coefficient. In case of overfitting these 2 coefficients can be very high.

Y=mX+c

So to handle this, Idea is that we want to reduce the value of coefficient(m). By doing this biasness can increase but variance decreases. Which is called bias variance tradeoff. In linear regression cost function

Mathematical representation of the loss function for Ridge Regression, showing the sum of squared differences between actual and predicted values.

We add a term Lambda*m^2 and then updated cost function is

Mathematical expression showing the cost function L for ridge regression, including summation, squared errors, and regularization term with lambda and m squared.

Now we will try to minimize the above cost function.

Mathematical representation of the cost function for linear regression, showing the summation of squared differences between predicted and actual values, along with a regularization term.

Dervating the above cost function with respect to ‘b’ and ‘m’

After derivating the above equation and doing the calculation we can find the value of m as followed

Mathematical formula for calculating the slope (m) in linear regression, depicting sums of products and a regularization term lambda (λ) for Ridge Regression.

Below is equation for lnear regression

Mathematical formula for calculating the slope (m) in linear regression, displayed on a black background with white text.

As we can see that only difference in value of m is the term λ.

λ is a hyperparameter or alpha. larger the value of λ lesser the value of m.

Now let’s implement the code for Ridge Regression.

import pandas
import numpy 
from sklearn.datasets import make_regression
import matplotlib.pyplot as plt

Create X and y

X,y=make_regression(n_samples=100,n_features=1,n_imformative=1,n_targets=1,noise=20,random_state=13

plt.scatter(X,y)

Scatter plot displaying the relationship between two variables in a linear regression model, showing data points spread across the graph.

Now importing the linear regression and fittign the simple linear regression model.

from sklearn.linear_model import LinearRegression
lr=LinearRegression()
lr.fit(X,y)
print(lr.coef_)
print(lr.intercept_)

[27.82809103]
-2.29474455867698

Now importing the ridge regresson class from sklearn

from sklearn.linear_model import Ridge
rr=Ridge(alpha=10)
rr.fit(X,y)
print(rr.coef_)
print(rr.intercept_)

[24.9546267]
-2.1269130035

Now fitting Ridge Regression with different value of Alpha

from sklearn.linear_model import Ridge
rr1=Ridge(alpha=100)
rr1.fit(X,y)
print(rr1.coef_)
print(rr1.intercept_)

[12.9344]
-1.424844

Plotting all 3 fitted lines in above models to show you how is the line formed and how is the slope of all 3 fitted models

plt.plot(X,y,'b.')
plt.plot(X,lr.predict(X),color='red',label='alpha=0')
plt.plot(X,rr.predict(X),color='green',label='alpha=10')
plt.plot(X,rr1.predict(X),color='orange',label='alpha=100')

A scatter plot showing data points in blue and three fitted lines representing linear regression with different alpha values: red for alpha=0, green for alpha=10, and orange for alpha=100.

Implement Ridge Regression from scratch

Let’s create a class called myRidge

class myRidge:
    def __init(self,alpha=0.1):
        self.alpha=alpha
        self.m=None
        self.b=None
    def fit(self,X,y):
        num=0
        den=0
        for i in range(X.shape[0]):
            num=num+(y[i]-y.mean())*(X[i]-X.mean())
            den=den+(y[i]-y.mean())*(X[i]-X.mean())
        self.m=num/(den+self.alpha())
        self.b=y.mean()-(self.m*X.mean())
        print(self.m,self.b)
            
            
    def predict(self,X_test):
        pass
rg=myRidge(alpha=100)
rg.fit(X,y)

Implementing Ridge Regression from Scratch

By

Implement Ridge Regression from scratch

Like this:

Related

By

Related Post

Why RAG Chatbots Struggle in Production

Measuring ROI for a GenAI Initiative in Healthcare

Unique Strings with Odd and Even Swapping Allowed

Leave a ReplyCancel reply

You missed

Why RAG Chatbots Struggle in Production

Measuring ROI for a GenAI Initiative in Healthcare

Unique Strings with Odd and Even Swapping Allowed

Applying SOLID Principles and Dependency Injection in Python

By

Implement Ridge Regression from scratch

Share this:

Like this:

Related

By

Related Post

Leave a ReplyCancel reply

You missed

Discover more from Geeky Codes