Greetings! This blog contains the required information regarding Linear Regression. In the previous blogs, we discussed the machine learning lifecycle. This blog covers -
- Introduction to Linear Regression.
- The concept behind Linear Regression.
- Conditions to apply Linear Regression.
As discussed in the previous blogs, machine learning is used to create predictive models, i.e. models that can predict something. Linear Regression is a supervised learning algorithm for regression problems. This means that we can use linear regression when we have a label, i.e. a specific column to predict, and that label should have continuous values. Linear regression means prediction of continuous outcomes with the help of a line.
Basic Understanding behind Linear regression :
Let's try to understand Linear regression with human intelligence. Suppose you have a case -
X : 1 , 2 , 3 , 4 , 5
y : 2 , 4 , 6 , 8 , 10.
This is our training dataset, where X is our input, i.e. feature, and y is our output, i.e. target feature. We need to find the pattern between X and y. There are various patterns that we can find and frame in the form of equations. Some equations with respect to the above case are:
- y = 2 × X. ( equation of a line: y = m × X + c where c = 0 )
- y = last value of y + 2. ( not useful, as we will see )
Now, let's test our intelligence with respect to the test dataset. The test data is :
X : 6 , 7 , 20 , 30
y : ? , ? , ? , ?
Our second equation, i.e. y = last value of y + 2, will fail in some cases: at X = 20 and X = 30 we cannot predict y, since we don't have the value of y at X = 19 and X = 29, i.e. the last value of y. In simple terms, we don't always have the last value of y, hence we cannot find y with the second equation all the time.
We can solve all the cases with the first equation, i.e. y = 2 × X. Let's see our results.
y = 2 × 6 = 12 for X = 6.
y = 2 × 7 = 14 for X = 7.
y = 2 × 20 = 40 for X = 20.
y = 2 × 30 = 60 for X = 30.
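To make this concrete, here is a minimal sketch of how a machine could learn this same pattern from the training data. This is only a preview, assuming Python with numpy and scikit-learn (neither of which has been introduced in this blog series yet):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Training dataset: y = 2 * X
X_train = np.array([[1], [2], [3], [4], [5]])  # sklearn expects 2-D features
y_train = np.array([2, 4, 6, 8, 10])

# The model finds the weights m (slope) and c (intercept) on its own
model = LinearRegression().fit(X_train, y_train)
print(model.coef_[0], model.intercept_)  # -> 2.0 and ~0.0

# Test dataset from above
X_test = np.array([[6], [7], [20], [30]])
print(model.predict(X_test))  # -> [12. 14. 40. 60.]
```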
In this case, our intelligence has found that 2 is the number by which, if we multiply the value of X, we will get the value of y. This 2, i.e. the value of m (slope), is known as a weight. We observed that the weight is 2 and hence we are able to predict each and every use case with 100 % accuracy (100 % accuracy because it is a hypothetical example). Now, we have to give this same intelligence to a machine. The machine should predict the value of y when X is given. The machine should know how to find the weights, i.e. 2 in our case (the how will be discussed in further blogs). Let's solve another use case.
This time our training dataset is :
X : 1 , 2 , 3 , 4
y : 3 , 5 , 2 , 10.
and testing data is :
X : 6 , 10 , 9 , 5.
y : ? , ? , ? , ?.
For this case, it is nearly impossible to predict accurate values, but we can predict to some extent. The reason is that we are finding it difficult to set the weights, i.e. m and c. We are finding it difficult because the data points are only linear to some extent. Hence we need to find the weights of a line, i.e. m and c, that form a line which gives us the least possible error in our prediction. Because the data points are only linear to some extent, y = m × X + c will not give us the exact result. Hence, in every case, there can be some error in the results. The main idea is that we are assuming the points are perfectly linear, but in a real scenario these points are a little bit deviated and scattered.
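As a rough sketch (again assuming Python with numpy, purely for illustration), we can let the computer find the m and c that minimise the squared error for this imperfect data:

```python
import numpy as np

# Training data that is only linear to some extent
X = np.array([1, 2, 3, 4])
y = np.array([3, 5, 2, 10])

# Degree-1 polyfit returns the m and c that minimise the squared error
m, c = np.polyfit(X, y, 1)
print(m, c)

# Predictions for the test inputs; some error is unavoidable here
X_test = np.array([6, 10, 9, 5])
print(m * X_test + c)
```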
Working of Linear Regression :
Let's take another case for understanding the actual visualization of linear regression.
Suppose you have some dataset for training we have :
X : 2 , 9 , 5 , 7
y : 3 , 7 , 2 , 9
and for testing we have
X : 10
y : ?
For training, we have blue points; for testing, we have green points; for the line, we have a red line. This line is also known as a hypothesis.
The figure on the left describes that the data points are only linear to some extent. The model has to find the weights, and the line formed with those weights should give the least possible error with respect to the dataset. The figure on the right shows that there is some distance between the actual point and the corresponding point on the line. This distance is known as the residual ( error ).
At the time of training, the weights will be found by the model with the help of both y and X. At the time of testing, the model will find the value of y with the help of the weights (found in training) and X. At the time of prediction, X will be mapped by the weights of the line (m × X + c) to get the predicted value, i.e. y. The line with the minimum error is known as the best fit line.
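Here is a small sketch (assuming Python with numpy, as before) that mirrors this training/testing flow on the dataset above and also prints the residuals:

```python
import numpy as np

# Training data from the example above
X = np.array([2, 9, 5, 7])
y = np.array([3, 7, 2, 9])

# Training: find the weights m and c of the best fit line
m, c = np.polyfit(X, y, 1)

# Residual ( error ): distance between each actual point and the line
residuals = y - (m * X + c)
print(residuals)

# Testing: map X = 10 through the weights (m * X + c) to get y
print(m * 10 + c)
```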
Conditions to apply Linear Regression :
The main conditions for applying Linear Regression are that our target feature, i.e. y, should be continuous and the plot between y and X should contain linearity to some extent. This linearity can be measured by correlation. Correlation means proportionality. For linear regression, the data points should be directly proportional or inversely proportional to each other. The value of correlation is generally as follows (see the sketch after this list):
- Directly proportional - close to 1
- Inversely proportional - close to -1
- no correlation - close to 0
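A quick way to check this (a sketch assuming Python with numpy) is to compute the Pearson correlation between X and y:

```python
import numpy as np

X = np.array([2, 9, 5, 7])
y = np.array([3, 7, 2, 9])

# Pearson correlation between X and y: close to +1 or -1 suggests
# linear regression is a reasonable choice; close to 0 suggests it is not
corr = np.corrcoef(X, y)[0, 1]
print(corr)
```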
X is also known as an independent feature because it does not depend on any other feature, whereas y is known as a dependent feature because we are finding the value of y with the help of X. Assumptions will be discussed in further blogs. In the next blog, we will discuss the maths behind Linear Regression.
Some important labels that have the same meaning :
- X / Feature / Independent feature.
- y / Target feature / Predicted feature / Dependent feature.
- Loss / Error / Residual
- Line / Best fit line / Hypothesis / y = m × X + c / y = w0 + w1 × X.
- Weights / W / β.
Summary :
- Regression means prediction of continuous outcomes.
- y = W0 + W1 × X
- In training, the machine will have X and y, and the machine will find the weights, i.e. m and c.
- In testing, the machine will have m, c (weights) and X, and the machine will find y.
- To apply linear regression, data points should be linear to some extent.
- Whether the data points are linear to some extent can be checked with the help of correlation.
- Correlation means proportionality, i.e. directly or inversely proportional.
- Correlation should be close to +1 or -1.
- If the correlation is close to 0, that means there is no relation between X and y and you will not get a good result after applying linear regression.
- Santosh Saxena




