How to calculate slope and intercept of regression line in easy steps

Introduction

In this tutorial, readers will learn how to calculate the regression line, which is also called the line of best fit. The best part of this article is that the slope and intercept are calculated in code. I will also write out the equations for both the slope and the intercept and implement them. This should make the article more interesting for readers. I also urge readers who have not yet joined the AI Sangam YouTube channel to do so from the link below.

AI Sangam youtube channel

By the end, readers should be able to calculate the slope and intercept easily. I also encourage passionate readers to work through the code by hand so that they understand it better. Readers who are interested in what can be achieved with deep learning can also read the article below.

Real time face recognition using Facenet | AI Sangam

Table of Contents

  1. Where do we apply regression model
  2. How to calculate slope and intercept of regression line
  3. Final words

Where do we apply regression model

To start this section, let us discuss what machine learning is and what its types are. I start here because regression is a part of machine learning. Machine learning is a technique in which machines learn patterns from data instead of following explicitly programmed rules. It has three types:

  1. Supervised 
  2. Unsupervised
  3. Reinforcement learning

Supervised Machine Learning: In supervised learning, every data point comes with a target. It is further divided into two types, classification and regression. Classification deals with categorical targets, whereas regression deals with continuous targets. Linear regression belongs to the regression type and is the focus of this article.

Unsupervised Machine Learning: In this type of learning there are no targets, and data points are grouped by similarity, typically through clustering. The best-known example is K-means clustering, where a new data point is assigned to the cluster whose centre is nearest to it.

Reinforcement learning: This is the third type, where the system learns from experience by interacting with a live environment and using the feedback it receives. A commonly cited example is driving a car without a driver.

From the above discussion it is clear that regression is applied when a target is given and that target is continuous. Regression can be linear, polynomial, ridge or lasso. You can watch the AI Sangam video on lasso regression from the link below.

Understanding Ridge, Lasso and Elastic Net

This tutorial focuses on linear regression with a single input column and a single target column, which is often called simple (or univariate) linear regression. To end this section, let us write the equation of a straight line, because the regression line has the same form, where m is the slope and c is the intercept.

y = mx + c
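As a quick illustration of this equation, the sketch below evaluates a straight line at a few inputs. The function name predict and the values m = 2 and c = 1 are chosen only for illustration; they are not taken from any data set.

```python
def predict(x, m, c):
    """Return the y value on the line y = m*x + c for a given x."""
    return m * x + c


# Illustrative (assumed) slope and intercept
m, c = 2, 1

for x in [0, 1, 2]:
    print(x, predict(x, m, c))
```

Changing m tilts the line, while changing c moves it up or down without changing its tilt.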

How to calculate slope and intercept of regression line

Let us see the formula for calculating m (slope) and c (intercept).

m = (n(Σxy) - (Σx)(Σy)) / (n(Σx²) - (Σx)²)

Where 

n is number of observations

x = input variable

y = target variable
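To make the slope formula concrete, here is the arithmetic done by hand for a small data set, x = [1, 2, 3] and y = [4, 5, 6] (the same values used in the code in this article). The working is written in LaTeX notation.

```latex
\begin{aligned}
n &= 3, \quad \sum x = 6, \quad \sum y = 15, \quad
\sum xy = 4 + 10 + 18 = 32, \quad \sum x^2 = 1 + 4 + 9 = 14 \\[4pt]
m &= \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2}
   = \frac{3 \cdot 32 - 6 \cdot 15}{3 \cdot 14 - 6^2}
   = \frac{96 - 90}{42 - 36}
   = 1
\end{aligned}
```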

Let us write code to calculate the slope of the regression line.

""" 
Calculating linear regression slope for univariate data
Note: Independent column is also called predictor variable
      Dependent column is also called criterion variable  
"""

predict_var = [1, 2, 3]
criter_var = [4, 5, 6]
n1 = len(predict_var)
n2 = len(criter_var)
# Both columns must have the same number of observations
assert n1 == n2, "predict_var and criter_var must have equal lengths"

mul_pred_criter = [
                  a * b for a,b in 
                  zip(predict_var, criter_var)
                  ]

square_each_element_predict = [a*a for a in predict_var]
square_each_element_criter =  [b*b for b in criter_var] 

sum_predict_var = sum(predict_var)
sum_criter_var = sum(criter_var)
sum_square_predict = sum(square_each_element_predict)
sum_square_criter = sum(square_each_element_criter)
sum_mul_pred_criter = sum(mul_pred_criter)

numerator = (
            n1*sum_mul_pred_criter - 
            sum_predict_var*sum_criter_var
            )

denominator = n1*sum_square_predict - sum_predict_var**2

slope = numerator/denominator
print("Slope is", slope)

Calculating intercept of regression line

c = ((Σy)(Σx²) - (Σx)(Σxy)) / (n(Σx²) - (Σx)²)
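The same hand calculation can be done for the intercept. The working below again assumes the small data set x = [1, 2, 3] and y = [4, 5, 6], and is written in LaTeX notation. The last line is a cross-check using the standard identity that the least-squares line passes through the point of means, taking the slope m = 1 that the slope formula gives for this data.

```latex
\begin{aligned}
c &= \frac{\sum y \sum x^2 - \sum x \sum xy}{n \sum x^2 - \left(\sum x\right)^2}
   = \frac{15 \cdot 14 - 6 \cdot 32}{3 \cdot 14 - 6^2}
   = \frac{210 - 192}{42 - 36}
   = 3 \\[4pt]
c &= \bar{y} - m\,\bar{x} = 5 - 1 \cdot 2 = 3
\end{aligned}
```

The fitted line is therefore y = x + 3, which passes exactly through all three points.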

Let us see the code for this in Python.

""" 
Calculating linear regression intercept for univariate data
Note: Independent column is also called predictor variable
      Dependent column is also called criterion variable  
"""

predict_var = [1, 2, 3]
criter_var = [4, 5, 6]
n1 = len(predict_var)
n2 = len(criter_var)
# Both columns must have the same number of observations
assert n1 == n2, "predict_var and criter_var must have equal lengths"

mul_pred_criter = [
                  a * b for a,b in 
                  zip(predict_var, criter_var)
                  ]

square_each_element_predict = [a*a for a in predict_var]
square_each_element_criter =  [b*b for b in criter_var] 

sum_predict_var = sum(predict_var)
sum_criter_var = sum(criter_var)
sum_square_predict = sum(square_each_element_predict)
sum_square_criter = sum(square_each_element_criter)
sum_mul_pred_criter = sum(mul_pred_criter)

numerator_first_half = (sum_criter_var) * sum_square_predict 
numerator_second_half = sum_predict_var*sum_mul_pred_criter
numerator = numerator_first_half - numerator_second_half

denominator = n1*(sum_square_predict)-sum_predict_var**2

intercept = numerator/denominator
print("Intercept is", intercept)
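Since the slope and intercept share the same sums and the same denominator, the two listings can be folded into one helper. The following is a minimal sketch, not the only way to do it; the function name fit_line and the zero-denominator check are my additions and do not appear in the listings above.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)

    # Shared denominator of both formulas; it is zero when all x are equal
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("slope is undefined when all x values are identical")

    slope = (n * sum_xy - sum_x * sum_y) / denominator
    intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    return slope, intercept


slope, intercept = fit_line([1, 2, 3], [4, 5, 6])
print("Slope is", slope)          # 1.0
print("Intercept is", intercept)  # 3.0
```

The check on the denominator matters because a column of identical x values would describe a vertical line, for which no slope exists.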

Final words

In the first section, we saw where a regression model is used. One use case is deciding on a house purchase by predicting its price from its area. Another is predicting the closing price of a stock from its open, low and high prices. In the second section, we saw how to calculate the slope and intercept of the regression line and implemented Python code for both. Working through the code should help readers see how the tutorial applies in practice. I would like to end with one point: if the predicted values and the actual values are very close, the loss is small and the regression line fits the data well. I hope readers have enjoyed reading the article and will enjoy future articles too.
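The idea that a small loss means a good fit can be sketched with mean squared error, which is one common choice of loss. The article does not name a specific loss, so this choice is an assumption, and the function name mean_squared_error and the sample values are only illustrative.

```python
def mean_squared_error(x, y, m, c):
    """Average squared gap between actual y and the prediction m*x + c."""
    errors = [(actual - (m * xi + c)) ** 2 for xi, actual in zip(x, y)]
    return sum(errors) / len(errors)


x = [1, 2, 3]
y = [4, 5, 6]

# The line y = 1*x + 3 passes through every point, so the loss is zero
print(mean_squared_error(x, y, 1, 3))  # 0.0

# A poorer line (slope 1, intercept 0) misses each point by 3
print(mean_squared_error(x, y, 1, 0))  # 9.0
```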
