1
00:00:00,870 --> 00:00:06,900
Now, simple linear regression is a straightforward approach for predicting Y on the basis of a single

2
00:00:06,900 --> 00:00:08,160
predictor variable X.

3
00:00:08,520 --> 00:00:13,290
So if you take only one predictor variable, it is called simple linear regression.

4
00:00:14,620 --> 00:00:17,320
We assume a linear relationship between X and Y.

5
00:00:17,880 --> 00:00:27,240
Mathematically, it can be written as Y is approximately equal to beta zero plus beta one times X. It is nearly

6
00:00:27,240 --> 00:00:33,870
equal to, because the value of Y that the model will give may not be exactly equal to the value in our observation.

7
00:00:35,610 --> 00:00:37,090
We will come back to this thought later.

8
00:00:39,260 --> 00:00:44,800
Here, let us select only one variable from the dataset to predict the price of the house.

9
00:00:45,550 --> 00:00:52,440
Let's say I choose the average number of rooms for this simple model, so we will regress house price

10
00:00:52,740 --> 00:01:00,120
onto the number of rooms by positing the model: price is nearly equal to beta zero plus beta one times

11
00:01:00,120 --> 00:01:09,510
room number. Beta zero and beta one are the unknown terms, which are known as model coefficients or model parameters.

12
00:01:11,310 --> 00:01:17,880
For the particular case of simple linear regression, beta one is the slope and beta zero is the intercept.

13
00:01:20,460 --> 00:01:26,580
Once we use the training data to estimate these two parameters, beta zero hat and beta one hat, we will

14
00:01:26,580 --> 00:01:32,030
be using this hat symbol to denote estimated parameters from our data.

15
00:01:32,190 --> 00:01:37,370
So we will predict price as beta zero hat plus beta one hat times number of rooms.

16
00:01:38,680 --> 00:01:44,500
So for estimating the values of these parameters, we will be using the data points in our dataset.

17
00:01:45,270 --> 00:01:49,290
If you remember, our house pricing dataset has 506 observations.
18
00:01:50,670 --> 00:01:56,720
This number of data points, as a general convention, will be denoted by a small n.

19
00:01:58,020 --> 00:02:01,110
So small n is 506 for our dataset.

20
00:02:02,460 --> 00:02:09,590
What this means is we have five hundred six pairs of X and Y values, and the goal is to obtain coefficient

21
00:02:09,600 --> 00:02:18,460
estimates, beta zero hat and beta one hat, such that the linear model fits the available data, such that y one

22
00:02:18,460 --> 00:02:28,020
hat is equal to beta zero hat plus beta one hat times x one, and if you generalize it for any i, it is y i nearly equal

23
00:02:28,020 --> 00:02:29,250
to beta zero

24
00:02:29,250 --> 00:02:30,960
hat plus beta one hat times x i.

25
00:02:31,680 --> 00:02:37,130
In other words, we want our estimated line to be as close to these points as possible.

26
00:02:38,820 --> 00:02:45,360
One method for measuring this closeness of our line is called the least squares method, which we will

27
00:02:45,360 --> 00:02:46,080
discuss now.

28
00:02:47,470 --> 00:02:53,830
Once we fit the model and get a line, the line will be predicting a value of y at each point i.

29
00:02:55,360 --> 00:02:59,170
This predicted value will be denoted by y i hat.

30
00:02:59,480 --> 00:03:04,750
Now, we do have the actual values of y at each of these points; the difference between these actual

31
00:03:04,750 --> 00:03:14,860
values and the predicted values is what the line misses. This is the residual, and it is denoted by e i. As you can

32
00:03:14,860 --> 00:03:21,910
see in the graph, using the training data that we had, we have drawn the line using the beta zero hat and beta

33
00:03:21,910 --> 00:03:23,370
one hat that we calculated.

34
00:03:23,650 --> 00:03:26,630
And this line is drawn here in the blue color.

35
00:03:27,160 --> 00:03:28,950
Each of these data points is also plotted.
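[Supplementary note, not part of the lecture audio] To make the residual idea concrete, here is a minimal sketch with made-up x, y, and coefficient values (not the lecture's housing data), computing e i = y i minus y i hat for a toy fitted line:

```python
import numpy as np

# Toy data (made-up values, not the lecture's housing dataset)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Suppose a fitted line y-hat = b0 + b1 * x (illustrative coefficients)
b0, b1 = 0.1, 2.0
y_hat = b0 + b1 * x

# Residual e_i = actual y_i minus predicted y_i-hat
residuals = y - y_hat
```

As the lecture notes, some residuals come out positive and some negative, which is why we cannot simply sum them to measure total error.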
36
00:03:29,230 --> 00:03:37,150
Some of these points are exactly on the line, but most of them miss it. The distance of such a point

37
00:03:37,240 --> 00:03:42,760
from the line is its residual. At some points this residual is positive,

38
00:03:42,760 --> 00:03:44,410
at some points it is negative.

39
00:03:45,520 --> 00:03:52,150
When we are working out the total residual of the sample, we cannot straight away sum them up, because

40
00:03:52,150 --> 00:03:57,980
some are positive and some are negative; therefore we will define a new quantity called the residual sum of squares (RSS).

41
00:03:58,960 --> 00:04:05,200
Now, since RSS is summing the square of each residual, it is representing the total error. In

42
00:04:05,200 --> 00:04:05,770
this formula,

43
00:04:05,780 --> 00:04:08,650
you can see that for each of the points

44
00:04:09,010 --> 00:04:14,470
we are subtracting the predicted value from the actual observed value and then squaring it.

45
00:04:14,890 --> 00:04:17,480
And we are doing this for all of the points.

46
00:04:18,280 --> 00:04:24,780
Now we have the total error of our predicted line, and we want to minimize this error.

47
00:04:26,080 --> 00:04:33,160
So using calculus and matrix algebra, we will get these formulas for beta zero hat and beta one hat for which

48
00:04:33,160 --> 00:04:34,630
this error is minimized.

49
00:04:35,710 --> 00:04:42,850
So this approach is called the least squares method, because we are minimizing the squared error, the sum

50
00:04:42,850 --> 00:04:43,400
of squared errors.

51
00:04:43,420 --> 00:04:50,970
So this RSS value we are trying to minimize; by differentiating it and setting the derivative to zero, we will get

52
00:04:50,980 --> 00:04:53,120
these values of beta one hat and beta zero

53
00:04:53,200 --> 00:04:58,060
hat. With these values of beta zero

54
00:04:58,060 --> 00:05:01,860
hat and beta one hat, the calculated sum of squares will be minimum.
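[Supplementary note, not part of the lecture audio] The closed-form least-squares formulas just described can be sketched in code. The numbers below are made up for illustration, not the lecture's housing data; NumPy's degree-1 polyfit is used only as an independent cross-check:

```python
import numpy as np

# Toy (x, y) pairs, e.g. rooms and price (made-up values)
x = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([10.0, 14.0, 19.0, 25.0, 28.0])

x_bar, y_bar = x.mean(), y.mean()

# beta1-hat = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

# beta0-hat = y_bar - beta1-hat * x_bar
beta0_hat = y_bar - beta1_hat * x_bar

# Residual sum of squares (RSS) for the fitted line
y_hat = beta0_hat + beta1_hat * x
rss = np.sum((y - y_hat) ** 2)

# Cross-check: np.polyfit with degree 1 returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
```

For these toy points, beta1-hat comes out to 4.7 and beta0-hat to 0.4, and the library fit agrees with the hand-computed formulas.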
55
00:05:03,010 --> 00:05:09,100
So we have this summation of x minus x bar; if you remember, x bar is the mean of the sample.

56
00:05:09,670 --> 00:05:17,290
So for each data point we will find the difference of each x value from the x mean, and then multiply it

57
00:05:17,290 --> 00:05:25,150
with the difference of each y value from the y mean. We will sum this product over all the points, and

58
00:05:25,150 --> 00:05:30,760
we will divide it by the squared difference of x from its mean, summed over all points.

59
00:05:32,140 --> 00:05:34,540
Similarly, beta zero hat is the mean

60
00:05:34,540 --> 00:05:38,820
value of y minus beta one hat times the mean value of x.

61
00:05:39,130 --> 00:05:41,350
So we have the mean values of X and Y.

62
00:05:41,380 --> 00:05:43,510
We first need to calculate the beta one hat value.

63
00:05:43,990 --> 00:05:48,370
When we put the beta one hat value in this formula, we will get the beta zero hat value.

64
00:05:50,310 --> 00:05:55,920
So using these formulas for simple linear regression, you can get the beta zero hat and beta

65
00:05:55,950 --> 00:05:56,670
one hat values.

66
00:05:59,590 --> 00:06:07,510
For our model, where we selected house price as Y and room number as X, when I run this model in the

67
00:06:07,510 --> 00:06:08,860
software, I get this result.

68
00:06:09,950 --> 00:06:13,240
I have highlighted the values in this blue box.

69
00:06:14,620 --> 00:06:19,420
This intercept is beta zero hat, and room number is the X variable.

70
00:06:19,720 --> 00:06:22,570
And this is giving the coefficient of this variable.

71
00:06:22,600 --> 00:06:23,700
So this is beta one hat.

72
00:06:24,880 --> 00:06:31,810
So beta one hat is coming out as nine point zero nine and the intercept is coming out as minus thirty one point

73
00:06:31,810 --> 00:06:32,330
six nine.

74
00:06:33,190 --> 00:06:40,240
In other words, this means that if I increase the number of rooms by one unit, the price of the house

75
00:06:40,240 --> 00:06:41,990
will increase by nine units.
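[Supplementary note, not part of the lecture audio] To illustrate the interpretation of the slope, here is a small sketch that plugs in the coefficient values quoted above (intercept about -31.69, slope about 9.09); the predict_price helper is hypothetical, not the lecture's actual code:

```python
# Coefficients quoted in the lecture for price regressed on room number
beta0_hat = -31.69
beta1_hat = 9.09

def predict_price(rooms):
    """Predicted price from the fitted simple linear regression line."""
    return beta0_hat + beta1_hat * rooms

# Increasing rooms by one unit raises the predicted price by beta1_hat
increase = predict_price(6) - predict_price(5)
```

This is exactly the statement in the lecture: one additional room raises the predicted price by about nine units.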
76
00:06:43,690 --> 00:06:50,170
What is the meaning of all these other values? That we will be learning in the coming videos. One thing to

77
00:06:50,170 --> 00:06:50,830
note is:

78
00:06:51,160 --> 00:06:57,220
you do not need to remember these formulas, because the software packages will be doing it for you.

79
00:06:59,160 --> 00:07:04,680
As you saw in this video and you will see in the coming videos, we will be telling you the mathematical

80
00:07:04,680 --> 00:07:09,720
concepts behind the theory and discussing those mathematical formulas.

81
00:07:09,720 --> 00:07:14,310
Also, keep in mind that you do not need to remember these formulas.

82
00:07:14,460 --> 00:07:17,210
You just need to understand the concept behind them.

83
00:07:17,670 --> 00:07:21,810
The intuition that I give you will help you interpret the results.

84
00:07:22,200 --> 00:07:24,950
That understanding of the results is very important.

85
00:07:26,010 --> 00:07:32,850
But you do not need to memorize these formulas, since you will be using a software package which will

86
00:07:32,850 --> 00:07:38,350
be applying all these formulas and getting the results for you. Preparing the data is important,

87
00:07:38,370 --> 00:07:44,220
running a model is important, and interpreting the results accurately is the most important.

88
00:07:45,630 --> 00:07:53,580
Remembering formulas is not important in machine learning. Also, even if you do not understand the

89
00:07:53,580 --> 00:07:55,010
mathematical part of this,

90
00:07:55,380 --> 00:07:56,190
don't be worried.

91
00:07:56,610 --> 00:08:03,390
You can still run a machine learning model and you can use the results in your professional life.

92
00:08:04,530 --> 00:08:10,350
But I highly recommend that you go through all the lectures very carefully to understand the core concepts

93
00:08:10,350 --> 00:08:12,440
behind all these machine learning methods.