1
00:00:00,870 --> 00:00:06,900
Now, simple linear regression is a straightforward approach for predicting Y on the basis of a single

2
00:00:06,900 --> 00:00:08,160
predictor variable X.

3
00:00:08,520 --> 00:00:13,290
So if you take only one predictor variable, it is called simple linear regression.

4
00:00:14,620 --> 00:00:17,320
We assume a linear relationship between X and Y.

5
00:00:17,880 --> 00:00:27,240
Mathematically, it can be written as Y is approximately equal to beta zero plus beta one times X. It is nearly

6
00:00:27,240 --> 00:00:33,870
equal to, because the value of Y that the model will give may not be exactly equal to the value in our observation.

7
00:00:35,610 --> 00:00:37,090
We will come back to this thought later.

8
00:00:39,260 --> 00:00:44,800
Here, let us select only one variable from the dataset to predict the price of the house.

9
00:00:45,550 --> 00:00:52,440
Let's say I choose the average number of rooms for this simple model, so we will regress house price

10
00:00:52,740 --> 00:01:00,120
onto the number of rooms by positing the model: price is nearly equal to beta zero plus beta one times

11
00:01:00,120 --> 00:01:09,510
room number. Beta zero and beta one are the unknown terms, which are known as model coefficients or model parameters.

12
00:01:11,310 --> 00:01:17,880
For the particular case of simple linear regression, beta one is the slope and beta zero is the intercept.

13
00:01:20,460 --> 00:01:26,580
Once we use the training data to estimate these two parameters, beta zero hat and beta one hat, we will

14
00:01:26,580 --> 00:01:32,030
be using this hat symbol to denote estimated parameters from our data.

15
00:01:32,190 --> 00:01:37,370
So we will predict price as beta zero hat plus beta one hat times number of rooms.

16
00:01:38,680 --> 00:01:44,500
So for estimating the values of these parameters, we will be using the data points in our dataset.

17
00:01:45,270 --> 00:01:49,290
If you remember, our house pricing dataset has 506 observations.
18
00:01:50,670 --> 00:01:56,720
This number of data points, as a general convention, will be denoted by a small n.

19
00:01:58,020 --> 00:02:01,110
So small n is 506 for our dataset.

20
00:02:02,460 --> 00:02:09,590
What this means is we have five hundred six pairs of X and Y values, and the goal is to obtain coefficient

21
00:02:09,600 --> 00:02:18,460
estimates, beta zero hat and beta one hat, such that the linear model fits the available data, such that y one

22
00:02:18,460 --> 00:02:28,020
hat is equal to beta zero hat plus beta one hat times x one, and if you generalize it for any i, it is y i nearly equal

23
00:02:28,020 --> 00:02:29,250
to beta zero

24
00:02:29,250 --> 00:02:30,960
hat plus beta one hat times x i.

25
00:02:31,680 --> 00:02:37,130
In other words, we want our estimated line to be as close to these points as possible.

26
00:02:38,820 --> 00:02:45,360
One method for measuring this closeness of our line is called the least squares method, which we will

27
00:02:45,360 --> 00:02:46,080
discuss now.

28
00:02:47,470 --> 00:02:53,830
Once we fit the model and get a line, the line will be predicting a value of y at each point i.

29
00:02:55,360 --> 00:02:59,170
This predicted value will be denoted by y i hat.

30
00:02:59,480 --> 00:03:04,750
Now, we do have the actual values of y at each of these points; the difference between these actual

31
00:03:04,750 --> 00:03:14,860
values and the predicted values is what the line misses. This is the residual, and it is denoted by e i. As you can

32
00:03:14,860 --> 00:03:21,910
see in the graph, using the training data that we had, we have drawn the line using the beta zero hat and beta

33
00:03:21,910 --> 00:03:23,370
one hat that we calculated.

34
00:03:23,650 --> 00:03:26,630
And this line is drawn here in the blue color.

35
00:03:27,160 --> 00:03:28,950
Each of these data points is also plotted.
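[Supplementary note, not part of the lecture audio] To make the residual idea concrete, here is a minimal sketch with made-up x, y, and coefficient values (not the lecture's housing data), computing e i = y i minus y i hat for a toy fitted line:

```python
import numpy as np

# Toy data (made-up values, not the lecture's housing dataset)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Suppose a fitted line y-hat = b0 + b1 * x (illustrative coefficients)
b0, b1 = 0.1, 2.0
y_hat = b0 + b1 * x

# Residual e_i = actual y_i minus predicted y_i-hat
residuals = y - y_hat
```

As the lecture notes, some residuals come out positive and some negative, which is why we cannot simply sum them to measure total error.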
36
00:03:29,230 --> 00:03:37,150
Some of these points are exactly on the line, but most of them miss it. The distance of such a point

37
00:03:37,240 --> 00:03:42,760
from the line is its residual. At some points this residual is positive,

38
00:03:42,760 --> 00:03:44,410
at some points it is negative.

39
00:03:45,520 --> 00:03:52,150
When we are working out the total residual of the sample, we cannot straight away sum them up, because

40
00:03:52,150 --> 00:03:57,980
some are positive and some are negative; therefore we will define a new quantity called the residual sum of squares (RSS).

41
00:03:58,960 --> 00:04:05,200
Now, since RSS is summing the square of each residual, it is representing the total error. In

42
00:04:05,200 --> 00:04:05,770
this formula,

43
00:04:05,780 --> 00:04:08,650
you can see that for each of the points

44
00:04:09,010 --> 00:04:14,470
we are subtracting the predicted value from the actual observed value and then squaring it.

45
00:04:14,890 --> 00:04:17,480
And we are doing this for all of the points.

46
00:04:18,280 --> 00:04:24,780
Now we have the total error of our predicted line, and we want to minimize this error.

47
00:04:26,080 --> 00:04:33,160
So using calculus and matrix algebra, we will get these formulas for beta zero hat and beta one hat for which

48
00:04:33,160 --> 00:04:34,630
this error is minimized.

49
00:04:35,710 --> 00:04:42,850
So this approach is called the least squares method, because we are minimizing the squared error, the sum

50
00:04:42,850 --> 00:04:43,400
of squared errors.

51
00:04:43,420 --> 00:04:50,970
So this RSS value we are trying to minimize; by differentiating it and setting the derivative to zero, we will get

52
00:04:50,980 --> 00:04:53,120
these values of beta one hat and beta zero

53
00:04:53,200 --> 00:04:58,060
hat. With these values of beta zero

54
00:04:58,060 --> 00:05:01,860
hat and beta one hat, the calculated sum of squares will be minimum.
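[Supplementary note, not part of the lecture audio] The closed-form least-squares formulas just described can be sketched in code. The numbers below are made up for illustration, not the lecture's housing data; NumPy's degree-1 polyfit is used only as an independent cross-check:

```python
import numpy as np

# Toy (x, y) pairs, e.g. rooms and price (made-up values)
x = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([10.0, 14.0, 19.0, 25.0, 28.0])

x_bar, y_bar = x.mean(), y.mean()

# beta1-hat = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

# beta0-hat = y_bar - beta1-hat * x_bar
beta0_hat = y_bar - beta1_hat * x_bar

# Residual sum of squares (RSS) for the fitted line
y_hat = beta0_hat + beta1_hat * x
rss = np.sum((y - y_hat) ** 2)

# Cross-check: np.polyfit with degree 1 returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
```

For these toy points, beta1-hat comes out to 4.7 and beta0-hat to 0.4, and the library fit agrees with the hand-computed formulas.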
55
00:05:03,010 --> 00:05:09,100
So we have this summation of x minus x bar; if you remember, x bar is the mean of the sample.

56
00:05:09,670 --> 00:05:17,290
So for each data point we will find the difference of each x value from the x mean, and then multiply it

57
00:05:17,290 --> 00:05:25,150
with the difference of each y value from the y mean. We will sum this product over all the points, and

58
00:05:25,150 --> 00:05:30,760
we will divide it by the squared difference of x from its mean, summed over all points.

59
00:05:32,140 --> 00:05:34,540
Similarly, beta zero hat is the mean

60
00:05:34,540 --> 00:05:38,820
value of y minus beta one hat times the mean value of x.

61
00:05:39,130 --> 00:05:41,350
So we have the mean values of X and Y.

62
00:05:41,380 --> 00:05:43,510
We first need to calculate the beta one hat value.

63
00:05:43,990 --> 00:05:48,370
When we put the beta one hat value in this formula, we will get the beta zero hat value.

64
00:05:50,310 --> 00:05:55,920
So using these formulas for simple linear regression, you can get the beta zero hat and beta

65
00:05:55,950 --> 00:05:56,670
one hat values.

66
00:05:59,590 --> 00:06:07,510
For our model, where we selected house price as Y and room number as X, when I run this model in the

67
00:06:07,510 --> 00:06:08,860
software, I get this result.

68
00:06:09,950 --> 00:06:13,240
I have highlighted the values in this blue box.

69
00:06:14,620 --> 00:06:19,420
This intercept is beta zero hat, and room number is the X variable.

70
00:06:19,720 --> 00:06:22,570
And this is giving the coefficient of this variable.

71
00:06:22,600 --> 00:06:23,700
So this is beta one hat.

72
00:06:24,880 --> 00:06:31,810
So beta one hat is coming out as nine point zero nine and the intercept is coming out as minus thirty one point

73
00:06:31,810 --> 00:06:32,330
six nine.

74
00:06:33,190 --> 00:06:40,240
In other words, this means that if I increase the number of rooms by one unit, the price of the house

75
00:06:40,240 --> 00:06:41,990
will increase by nine units.
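[Supplementary note, not part of the lecture audio] To illustrate the interpretation of the slope, here is a small sketch that plugs in the coefficient values quoted above (intercept about -31.69, slope about 9.09); the predict_price helper is hypothetical, not the lecture's actual code:

```python
# Coefficients quoted in the lecture for price regressed on room number
beta0_hat = -31.69
beta1_hat = 9.09

def predict_price(rooms):
    """Predicted price from the fitted simple linear regression line."""
    return beta0_hat + beta1_hat * rooms

# Increasing rooms by one unit raises the predicted price by beta1_hat
increase = predict_price(6) - predict_price(5)
```

This is exactly the statement in the lecture: one additional room raises the predicted price by about nine units.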
76
00:06:43,690 --> 00:06:50,170
What is the meaning of all these other values? That we will be learning in the coming videos. One thing to

77
00:06:50,170 --> 00:06:50,830
note is:

78
00:06:51,160 --> 00:06:57,220
you do not need to remember these formulas, because the software packages will be doing it for you.

79
00:06:59,160 --> 00:07:04,680
As you saw in this video and you will see in the coming videos, we will be telling you the mathematical

80
00:07:04,680 --> 00:07:09,720
concepts behind the theory and discussing those mathematical formulas.

81
00:07:09,720 --> 00:07:14,310
Also, keep in mind that you do not need to remember these formulas.

82
00:07:14,460 --> 00:07:17,210
You just need to understand the concept behind them.

83
00:07:17,670 --> 00:07:21,810
The intuition that I give you will help you interpret the results.

84
00:07:22,200 --> 00:07:24,950
That understanding of the results is very important.

85
00:07:26,010 --> 00:07:32,850
But you do not need to memorize these formulas, since you will be using a software package which will

86
00:07:32,850 --> 00:07:38,350
be applying all these formulas and getting the results for you. Preparing the data is important,

87
00:07:38,370 --> 00:07:44,220
running a model is important, and interpreting the results accurately is the most important.

88
00:07:45,630 --> 00:07:53,580
Remembering formulas is not important in machine learning. Also, even if you do not understand the

89
00:07:53,580 --> 00:07:55,010
mathematical part of this,

90
00:07:55,380 --> 00:07:56,190
don't be worried.

91
00:07:56,610 --> 00:08:03,390
You can still run a machine learning model and you can use the results in your professional life.

92
00:08:04,530 --> 00:08:10,350
But I highly recommend that you go through all the lectures very carefully to understand the core concepts

93
00:08:10,350 --> 00:08:12,440
behind all these machine learning methods.