Linear Regression

Deep Gojariya
Mar 23, 2021


What is linear regression?

Linear regression is an ML algorithm that finds the relationship between an independent variable and a dependent variable in the form of a ‘best-fit line’.

In-depth Intuition

We know that the equation of a straight line is given by y = mx + c, where

m = slope of the line and c = y-intercept of the line.

In an ML use case the same equation applies: ‘m’ and ‘c’ keep their meanings, but y becomes the dependent variable and x the independent variable.
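As a quick sketch, here is the same equation in Python; the slope and intercept values are made up for illustration:

```python
# y = m*x + c with made-up slope and intercept values.
m, c = 0.5, 2.0

def predict(x):
    """Dependent variable y for a given independent variable x."""
    return m * x + c

print(predict(10))  # 0.5 * 10 + 2.0 = 7.0
```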

Source: https://trevorpythag.wordpress.com/tag/graph-equation-straight-line/

Now that we understand the simple straight-line equation, we can go ahead and learn how linear regression works.

Consider the graph of House Price v/s Area given below. Suppose you are given the task of predicting a house’s price from its area; linear regression is a natural choice.

House Price v/s Size

Here, y = Price, x = Area, m = slope, c = y-intercept.

Consider the two lines in the graph below. Out of these two lines we need to find the best-fit line for our problem. To do this we use something called the Cost function, which is the sum of squared differences between the actual points and their projections on the regression line.

Method 1

The Cost function is given as:

J = (1/n) * Σ (pred(i) − y(i))²

where J = cost function, n = total number of points in the dataset, pred(i) = predicted value (the projection on the line), and y(i) = actual value.

Now we calculate the cost function for both lines, and whichever line has the minimum cost is our best-fit line.
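As a rough sketch of this comparison in Python: the toy dataset and the two candidate lines below are made-up values, not the article’s data.

```python
import numpy as np

# Toy dataset: area (hundreds of sq. ft.) vs. price (thousands).
# These numbers are invented purely for illustration.
x = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
y = np.array([200.0, 290.0, 410.0, 500.0, 610.0])

def cost(m, c):
    """Cost J for the line y = m*x + c: mean of squared differences."""
    pred = m * x + c
    return np.mean((pred - y) ** 2)

# Two candidate lines (m, c); the lower-cost one is the better fit.
candidates = [(20.0, 0.0), (10.0, 50.0)]
for m, c in candidates:
    print(f"m = {m}, c = {c}, J = {cost(m, c):.2f}")
```

Running this prints a far lower cost for the first line, so it would be chosen as the best fit of the two.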

Here we have considered only two lines, but in the real world we would need to calculate the cost function for ’n’ different lines, which is a tedious process. So there is a more efficient way of finding the best-fit line, called Gradient Descent.

Consider the graphs above. Graph 1 is the actual data for which we need to find the best-fit line. We assume c = 0 in our straight-line equation, so y = m*x.

In the remaining three graphs we have plotted three different lines with different slopes, i.e. by changing the slope we create a new line. For each line we calculate the cost function, and the line whose cost function is minimum is our best-fit line. But here you may wonder: how is that different from the approach discussed previously? This is where Gradient Descent comes in.

In Gradient Descent, a curve is generated by calculating the cost function for different slope values and plotting cost against slope. Our goal is to reach the global minimum of this curve.
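A minimal sketch of tracing that curve, with c fixed at 0 and made-up data whose true slope is 2:

```python
import numpy as np

# Evaluate the cost J(m) at several slope values to trace the GD curve.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])   # true slope is 2

for m in np.linspace(0.0, 4.0, 9):
    J = np.mean((m * x - y) ** 2)    # cost for this slope
    print(f"m = {m:.1f}   J(m) = {J:.2f}")
# J(m) falls, bottoms out at m = 2 (the global minimum), then rises again.
```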

Gradient Descent

We initialize ‘m’, i.e. the slope, to some random value; let’s initialize it to 0.25. We then calculate its cost J(m) and draw a tangent to the GD curve at (0.25, cost). Now we calculate the new ‘m’ value using the convergence theorem, which is given below:

Convergence theorem: m(new) = m(old) − (α * slope)

Here slope is the slope of the tangent line (i.e. dJ/dm) and alpha (α) is the learning rate, which is initialized to a very small value. The sign of the tangent’s slope affects m(new) in two ways:

  1. Negative slope : m(new)>m(old).
  2. Positive slope : m(new)<m(old).

Since we considered m = 0.25, we get a negative slope, so our m(new) will be greater than 0.25. We repeat this process until we reach the global minimum, i.e. when m(new) = m(old).
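Putting the update rule into a loop gives a minimal gradient-descent sketch. The data, learning rate, and stopping threshold here are illustrative assumptions, with m started at 0.25 as above:

```python
import numpy as np

# Gradient descent for the slope-only case (c = 0) on made-up data.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])   # true slope is 2

m = 0.25        # initial slope, as in the article
alpha = 0.01    # small learning rate

for step in range(1000):
    grad = np.mean(2 * (m * x - y) * x)   # dJ/dm: slope of the tangent
    m_new = m - alpha * grad              # the convergence-theorem update
    if abs(m_new - m) < 1e-9:             # stop when m(new) ≈ m(old)
        break
    m = m_new

print(f"converged slope m = {m:.4f}")     # approaches the true slope, 2
```

At m = 0.25 the gradient is negative, so the first update pushes m upward, exactly as described in the two cases above.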

Gradient Descent Process

Once we reach the global minimum, we can say that we have found the best-fit line in an efficient way.
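As a quick sanity check on the sketches above, the slope found by gradient descent can be compared with NumPy’s built-in least-squares fit, using the same toy data:

```python
import numpy as np

# Same toy data as the gradient-descent sketch above.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# np.polyfit with degree 1 returns (slope, intercept) of the least-squares line.
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)   # ~2.0 and ~0.0, matching the gradient-descent result
```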

This is how linear regression works. I hope you have understood every aspect of linear regression through this blog; my next blog will be an application of linear regression using Python. Thank you.
