All right. So now we've got our own custom evaluation function, and we can evaluate how our model is doing. The next logical step would be to train another model on the training dataset and then evaluate it on the validation dataset using this beautiful function that we've crafted. Not so fast. Let's see why.

So, testing our model: in this section we're going to do it on a subset. So: testing our model on a subset. This is mostly to tune the hyperparameters, because remember what we've been stressing: we want to decrease the amount of time between experiments. If we come back up, remember how we time travelled in a previous video? It took 6 minutes 58 seconds to fit a model on all of that data, so on about 400,000 rows. We don't have enough time to do that again; we can only time travel every so often.

So if we were to just write it out again, we would instantiate our model: model = RandomForestRegressor(n_jobs=-1, random_state=42). Wonderful. And then if we just did model.fit(X_train, y_train)... I need to be careful here not to be trigger happy. If we did this, with a little %%time up here, it's going to take far too long. Right. So I might put a note here.
This takes far too long for experimenting, because it's going to take about five minutes, and that's not what we're after. And again, if you have run this cell before (actually, I really should have told you this earlier): if you did run this cell on your computer and it's taking far too long, remember you can always press the stop button up here. So if we run this... oh, well, that runs too fast. But if your cell is running, remember how there's a little star here? There'll be a star while it runs; that happened too quickly for this demo, so you'll have to trust me on this one. When you run a cell, this will turn into a star while it's running, and you can stop it by pressing this button with the cell highlighted. So, come back; let's not run that. We'll make sure we can't, so we've commented it out. Beautiful. Why? Because our DataFrame is 400,000 rows, and my little MacBook Pro has to calculate a lot of patterns in those 400,000 rows, so it's like, "Well, I'm going to take a while, so you need to be patient with me." That's fair enough, because I would take far longer than my MacBook Pro does to calculate over 400,000 rows.
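As a sketch of what that slow cell looks like in code (using a small synthetic dataset as a stand-in for the course's roughly 400,000-row X_train and y_train, which aren't shown here, and timing with `time.perf_counter()` instead of the notebook's `%%time` magic):

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Tiny synthetic stand-in for the course's ~400,000-row X_train / y_train
rng = np.random.default_rng(42)
X_train = rng.random((2_000, 5))
y_train = rng.random(2_000)

# Same instantiation as in the video
model = RandomForestRegressor(n_jobs=-1, random_state=42)

# In a notebook you'd put %%time at the top of the cell;
# in a plain script, time.perf_counter() does a similar job
start = time.perf_counter()
model.fit(X_train, y_train)
elapsed = time.perf_counter() - start
print(f"Fit took {elapsed:.2f} seconds")
```

On the real 400,000-row dataset this fit is the step that takes several minutes; on the toy data it finishes in seconds.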
So our solution is to speed up our time between experiments, and that's what we're doing: testing our model on a subset. We could do something like this: just copy this model bit, and maybe only fit it on ten thousand rows. But there's an even better way, and it's built right in. We could do that, right? That's going to train it on 10,000 rows. That is one option, so keep in mind that you can just slice your training dataset. But the beautiful thing about random forests, and this is why we're focusing on them so much, is that they have this little feature called max_samples. So: change the max_samples value. All right. And if we wanted to figure out what max_samples does, where would you go? Well, one option: if we go model = RandomForestRegressor(...). We're going to have heaps of practice instantiating these models; that's what it's about, making it almost second nature to get a model instantiated and ready. random_state=42, and there's our model. That could be one option, but if we press Shift+Tab here, we'll see the docstring: max_features... no, not max_features, max_samples. Boom.
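To make the two options concrete, here's a sketch with made-up stand-in data for X_train and y_train (and a small n_estimators so it runs quickly):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Made-up stand-in data for the course's X_train / y_train
rng = np.random.default_rng(42)
X_train = rng.random((20_000, 4))
y_train = rng.random(20_000)

# Option 1: manually slice the training data down to 10,000 rows
model = RandomForestRegressor(n_jobs=-1, random_state=42, n_estimators=10)
model.fit(X_train[:10_000], y_train[:10_000])

# Option 2 (the built-in way): keep all the rows, but let each tree
# draw at most 10,000 samples when it's built
model = RandomForestRegressor(n_jobs=-1, random_state=42,
                              n_estimators=10, max_samples=10_000)
model.fit(X_train, y_train)
```

Option 2 is nicer because every row still has a chance of being seen by some tree, rather than throwing away everything past row 10,000.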
Now, if we come down here, let's find it. max_samples: there we go. "If bootstrap is True, the number of samples to draw from X to train each base estimator." Mm hmm. What does this mean? If I go model.bootstrap: True. Yes. And each estimator sees max_samples samples, but max_samples is None. So what does it mean if it's None? It's going to be all of them, isn't it? Let's have a look. This is what I want you to get some practice doing: if you're not sure about something, look it up and read it in the docs. "If bootstrap is True, the number of samples to draw from X." Okay, so X is what we pass to the model. "If None, then draw X.shape[0] samples", so it's going to draw them all. "If int, then draw max_samples samples." So what this means is, if we go X_train.shape[0]: if we leave max_samples as None, every estimator, and in our case there are 100, because the default n_estimators in RandomForestRegressor is 100... You can imagine an estimator as a small model. So each small model, one hundred small models, is going to see every single one of these rows. That's a large number, right? So this is roughly how many calculations we have to make, probably more than this, but this is what you can imagine.
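Besides Shift+Tab on the docstring, you can also check these defaults programmatically, for example:

```python
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(random_state=42)
params = model.get_params()

print(params["bootstrap"])     # True: each tree trains on a bootstrap sample
print(params["max_samples"])   # None: each tree draws X.shape[0] samples (all rows)
print(params["n_estimators"])  # 100: the forest is made of 100 small models (trees)
```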
But instead, if we set max_samples to something else, say, for example, we only wanted to try 10,000 examples instead of 400,000, closer to the shape of our validation set, we could change max_samples to ten thousand. So this means that instead of looking at 400,000 rows, every estimator, up to 100 estimators (n_estimators, which we could also change, to be honest, but we'll leave it at the default just for this example)... Because we've changed it, it's ten thousand times 100. Well, let's have a look here. Let's divide these two. How many times less? What have we got here? What numbers, and how many zeros? A million. That's a million, I can count. Forty times less. So, 40 times less data to compute on, which means we should see a speed increase of about 40x. Right, about that, because it's not going to be perfect. So let's go here and change... yeah, that looks good. Then we'll create a new cell here. We'll need our little %%time here, because this is what we want to practise. So: cutting down on the max number of samples each estimator can see improves, this is our hypothesis, improves training time.
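The back-of-the-envelope maths from above, written out:

```python
n_estimators = 100            # scikit-learn's default number of trees
rows_full = 400_000           # max_samples=None: each tree sees every row
rows_subset = 10_000          # max_samples=10_000: each tree sees 10,000 rows

draws_full = rows_full * n_estimators      # 40,000,000 row-draws across the forest
draws_subset = rows_subset * n_estimators  #  1,000,000 row-draws across the forest

print(draws_full / draws_subset)  # 40.0 -> roughly 40x less data to chew through
```

Actual training time won't scale perfectly linearly with row count, so treat 40x as a rough expectation rather than a guarantee.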
Now let's have a look: model.fit, and we'll do it on our training data now. This shouldn't take too long; maybe we might even have time to wait for it. It still might take a few seconds. This might be another opportunity to time travel... hold on, we didn't even need one. Look at that: wall time 16 seconds instead of seven minutes. Now that's a bit better, right? That's something we can potentially just sit here and wait for. If you're doing seven-minute experiments all the time, or even longer if you have more data, you're going to lose patience, you're going to get distracted. But if we can keep our experiments somewhere around 10 seconds, that's pretty good, right? I can wait 10 seconds. And now the beauty comes in here, when we can call our show_scores function. So let's see what happens when we go show_scores. Hopefully this works, because we've coded up this function ourselves. Of course it will. show_scores... what are we going to find? Oh no, we've messed it up: NumPy has no attribute with that name, instead of the square root. This is where we've messed up: np.sqrt. There we go. Typo. Classic. Couldn't get it right the first go, could you, Daniel? Nope.
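The exact show_scores function was written in an earlier video, so this is a hypothetical reconstruction (the metric names, signature, and dictionary layout here are assumptions), showing the np.sqrt call that the typo broke:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_log_error

def rmsle(y_true, y_pred):
    """Root mean squared log error -- note np.sqrt, the call the typo broke."""
    return np.sqrt(mean_squared_log_error(y_true, y_pred))

def show_scores(model, X_train, y_train, X_valid, y_valid):
    """Hypothetical reconstruction of the course's show_scores helper."""
    train_preds = model.predict(X_train)
    valid_preds = model.predict(X_valid)
    return {
        "Training MAE": mean_absolute_error(y_train, train_preds),
        "Valid MAE": mean_absolute_error(y_valid, valid_preds),
        "Training RMSLE": rmsle(y_train, train_preds),
        "Valid RMSLE": rmsle(y_valid, valid_preds),
        "Training R^2": model.score(X_train, y_train),
        "Valid R^2": model.score(X_valid, y_valid),
    }
```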
So we'll see how our model does; we've just trained it here on a subset of the data. Beautiful. Oh, right, look at that. So, training mean absolute error versus valid: yeah, see here how I said the valid error will be slightly higher than the training error? That's good; that means we're not overfitting. And now our R² scores for training and valid are far worse than the 0.987 we saw before, but that's all right, because we're only training on a subset of 10,000 examples, not all 400,000, so we'd expect these metrics to be worse than if we were to set max_samples to None, that is, train on all of the training data.

Now, what we might do, seeing as our valid RMSLE is about 0.29, I wonder: if we were to compare that figure with the leaderboard, where would we end up? There are about 425 teams in this competition, so if we were to submit our model as it is now, with our valid RMSLE score of 0.293, trained on only 10,000 samples, where do we end up? 0.293... well, we're in the top 100. Look at that: we're in the top 100 with just a simple model on ten thousand examples, so somewhere in here.
All right, now we've got an idea of how to train our model a bit faster. What we might do next is try some different hyperparameters, and I'll let you think about it before we get into the next video: what's a way to find different hyperparameters, rather than us having to manually adjust these hyperparameters of RandomForestRegressor? If you're not sure, don't worry, we'll cover it in the next video, but have a think about how we've found better hyperparameters in the past, because that's what we're going to use in the next video.