All right. So now we've got our own custom evaluation function, and we can evaluate how our model is doing. The next logical step would be to train another model on the training dataset and then evaluate it on the validation dataset using this beautiful function that we've crafted. Not so fast. Let's see why.

So, testing our model: in this section we're going to do it on a subset. So: testing our model on a subset. This is mostly to tune the hyperparameters, because remember what we've been stressing: we want to decrease the amount of time between experiments. If we come back up, remember how we time travelled in a previous video? It took 6 minutes 58 seconds to fit a model on all of that data, so on about 400,000 rows. We don't have enough time to do that again; we can only time travel every so often.

So if we were to just write it out again, we would instantiate our model: model = RandomForestRegressor(n_jobs=-1, random_state=42). Wonderful. And then if we just did model.fit(X_train, y_train)... I need to be careful here not to be trigger happy. If we did this, with a little %%time up here, it's going to take far too long. Right. So I might put a note here.
This takes far too long for experimenting, because it's going to take about five minutes, and that's not what we're after. And again, if you have run this cell before (actually, I really should have told you this earlier): if you did run this cell on your computer and it's taking far too long, remember you can always press the stop button up here. So if we run this... oh, well, that runs too fast. But if your cell is running, remember how there's a little star here? There'll be a star while it runs; that happened too quickly for this demo, so you'll have to trust me on this one. When you run a cell, this will turn into a star while it's running, and you can stop it by pressing this button with the cell highlighted. So, come back; let's not run that. We'll make sure we can't, so we've commented it out. Beautiful. Why? Because our DataFrame is 400,000 rows, and my little MacBook Pro has to calculate a lot of patterns in those 400,000 rows, so it's like, "Well, I'm going to take a while, so you need to be patient with me." That's fair enough, because I would take far longer than my MacBook Pro does to calculate over 400,000 rows.
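As a sketch of what that slow cell looks like in code (using a small synthetic dataset as a stand-in for the course's roughly 400,000-row X_train and y_train, which aren't shown here, and timing with `time.perf_counter()` instead of the notebook's `%%time` magic):

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Tiny synthetic stand-in for the course's ~400,000-row X_train / y_train
rng = np.random.default_rng(42)
X_train = rng.random((2_000, 5))
y_train = rng.random(2_000)

# Same instantiation as in the video
model = RandomForestRegressor(n_jobs=-1, random_state=42)

# In a notebook you'd put %%time at the top of the cell;
# in a plain script, time.perf_counter() does a similar job
start = time.perf_counter()
model.fit(X_train, y_train)
elapsed = time.perf_counter() - start
print(f"Fit took {elapsed:.2f} seconds")
```

On the real 400,000-row dataset this fit is the step that takes several minutes; on the toy data it finishes in seconds.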
So our solution is to speed up our time between experiments, and that's what we're doing: testing our model on a subset. We could do something like this: just copy this model bit, and maybe only fit it on ten thousand rows. But there's an even better way, and it's built right in. We could do that, right? That's going to train it on 10,000 rows. That is one option, so keep in mind that you can just slice your training dataset. But the beautiful thing about random forests, and this is why we're focusing on them so much, is that they have this little feature called max_samples. So: change the max_samples value. All right. And if we wanted to figure out what max_samples does, where would you go? Well, one option: if we go model = RandomForestRegressor(...). We're going to have heaps of practice instantiating these models; that's what it's about, making it almost second nature to get a model instantiated and ready. random_state=42, and there's our model. That could be one option, but if we press Shift+Tab here, we'll see the docstring: max_features... no, not max_features, max_samples. Boom.
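To make the two options concrete, here's a sketch with made-up stand-in data for X_train and y_train (and a small n_estimators so it runs quickly):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Made-up stand-in data for the course's X_train / y_train
rng = np.random.default_rng(42)
X_train = rng.random((20_000, 4))
y_train = rng.random(20_000)

# Option 1: manually slice the training data down to 10,000 rows
model = RandomForestRegressor(n_jobs=-1, random_state=42, n_estimators=10)
model.fit(X_train[:10_000], y_train[:10_000])

# Option 2 (the built-in way): keep all the rows, but let each tree
# draw at most 10,000 samples when it's built
model = RandomForestRegressor(n_jobs=-1, random_state=42,
                              n_estimators=10, max_samples=10_000)
model.fit(X_train, y_train)
```

Option 2 is nicer because every row still has a chance of being seen by some tree, rather than throwing away everything past row 10,000.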
Now, if we come down here, let's find it. max_samples: there we go. "If bootstrap is True, the number of samples to draw from X to train each base estimator." Mm hmm. What does this mean? If I go model.bootstrap: True. Yes. And each estimator sees max_samples samples, but max_samples is None. So what does it mean if it's None? It's going to be all of them, isn't it? Let's have a look. This is what I want you to get some practice doing: if you're not sure about something, look it up and read it in the docs. "If bootstrap is True, the number of samples to draw from X." Okay, so X is what we pass to the model. "If None, then draw X.shape[0] samples", so it's going to draw them all. "If int, then draw max_samples samples." So what this means is, if we go X_train.shape[0]: if we leave max_samples as None, every estimator, and in our case there are 100, because the default n_estimators in RandomForestRegressor is 100... You can imagine an estimator as a small model. So each small model, one hundred small models, is going to see every single one of these rows. That's a large number, right? So this is roughly how many calculations we have to make, probably more than this, but this is what you can imagine.
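Besides Shift+Tab on the docstring, you can also check these defaults programmatically, for example:

```python
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(random_state=42)
params = model.get_params()

print(params["bootstrap"])     # True: each tree trains on a bootstrap sample
print(params["max_samples"])   # None: each tree draws X.shape[0] samples (all rows)
print(params["n_estimators"])  # 100: the forest is made of 100 small models (trees)
```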
But instead, if we set max_samples to something else, say, for example, we only wanted to try 10,000 examples instead of 400,000, closer to the shape of our validation set, we could change max_samples to ten thousand. So this means that instead of looking at 400,000 rows, every estimator, up to 100 estimators (n_estimators, which we could also change, to be honest, but we'll leave it at the default just for this example)... Because we've changed it, it's ten thousand times 100. Well, let's have a look here. Let's divide these two. How many times less? What have we got here? What numbers, and how many zeros? A million. That's a million, I can count. Forty times less. So, 40 times less data to compute on, which means we should see a speed increase of about 40x. Right, about that, because it's not going to be perfect. So let's go here and change... yeah, that looks good. Then we'll create a new cell here. We'll need our little %%time here, because this is what we want to practise. So: cutting down on the max number of samples each estimator can see improves, this is our hypothesis, improves training time.
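The back-of-the-envelope maths from above, written out:

```python
n_estimators = 100            # scikit-learn's default number of trees
rows_full = 400_000           # max_samples=None: each tree sees every row
rows_subset = 10_000          # max_samples=10_000: each tree sees 10,000 rows

draws_full = rows_full * n_estimators      # 40,000,000 row-draws across the forest
draws_subset = rows_subset * n_estimators  #  1,000,000 row-draws across the forest

print(draws_full / draws_subset)  # 40.0 -> roughly 40x less data to chew through
```

Actual training time won't scale perfectly linearly with row count, so treat 40x as a rough expectation rather than a guarantee.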
Now let's have a look: model.fit, and we'll do it on our training data now. This shouldn't take too long; maybe we might even have time to wait for it. It still might take a few seconds. This might be another opportunity to time travel... hold on, we didn't even need one. Look at that: wall time 16 seconds instead of seven minutes. Now that's a bit better, right? That's something we can potentially just sit here and wait for. If you're doing seven-minute experiments all the time, or even longer if you have more data, you're going to lose patience, you're going to get distracted. But if we can keep our experiments somewhere around 10 seconds, that's pretty good, right? I can wait 10 seconds. And now the beauty comes in here, when we can call our show_scores function. So let's see what happens when we go show_scores. Hopefully this works, because we've coded up this function ourselves. Of course it will. show_scores... what are we going to find? Oh no, we've messed it up: NumPy has no attribute with that name, instead of the square root. This is where we've messed up: np.sqrt. There we go. Typo. Classic. Couldn't get it right the first go, could you, Daniel? Nope.
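The exact show_scores function was written in an earlier video, so this is a hypothetical reconstruction (the metric names, signature, and dictionary layout here are assumptions), showing the np.sqrt call that the typo broke:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_log_error

def rmsle(y_true, y_pred):
    """Root mean squared log error -- note np.sqrt, the call the typo broke."""
    return np.sqrt(mean_squared_log_error(y_true, y_pred))

def show_scores(model, X_train, y_train, X_valid, y_valid):
    """Hypothetical reconstruction of the course's show_scores helper."""
    train_preds = model.predict(X_train)
    valid_preds = model.predict(X_valid)
    return {
        "Training MAE": mean_absolute_error(y_train, train_preds),
        "Valid MAE": mean_absolute_error(y_valid, valid_preds),
        "Training RMSLE": rmsle(y_train, train_preds),
        "Valid RMSLE": rmsle(y_valid, valid_preds),
        "Training R^2": model.score(X_train, y_train),
        "Valid R^2": model.score(X_valid, y_valid),
    }
```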
So we'll see how our model does; we've just trained it here on a subset of the data. Beautiful. Oh, right, look at that. So, training mean absolute error versus valid: yeah, see here how I said the valid error will be slightly higher than the training error? That's good; that means we're not overfitting. And now our R² scores for training and valid are far worse than the 0.987 we saw before, but that's all right, because we're only training on a subset of 10,000 examples, not all 400,000, so we'd expect these metrics to be worse than if we were to set max_samples to None, that is, train on all of the training data.

Now, what we might do, seeing as our valid RMSLE is about 0.29, I wonder: if we were to compare that figure with the leaderboard, where would we end up? There are about 425 teams in this competition, so if we were to submit our model as it is now, with our valid RMSLE score of 0.293, trained on only 10,000 samples, where do we end up? 0.293... well, we're in the top 100. Look at that: we're in the top 100 with just a simple model on ten thousand examples, so somewhere in here.
All right, now we've got an idea of how to train our model a bit faster. What we might do next is try some different hyperparameters, and I'll let you think about it before we get into the next video: what's a way to find different hyperparameters, rather than us having to manually adjust these hyperparameters of RandomForestRegressor? If you're not sure, don't worry, we'll cover it in the next video, but have a think about how we've found better hyperparameters in the past, because that's what we're going to use in the next video.