All righty. So we've seen how to choose an estimator for a regression problem. How can we do the same for a classification problem?

Well, first of all, let's get a dataset that is a classification problem. The heart disease dataset, back in our midst. So we want data. We've already imported this above, but we're just going to do it again to make sure it's the same: heart_disease, then heart_disease.head(). Wonderful.

So again, we've seen this dataset before. Each row is a patient, so each sample is a patient; each column, or each feature, is a health attribute of that particular patient; and the target, 1 or 0, is whether that patient has heart disease or not. So our problem is classification: predicting whether something is one thing or another, heart disease or not. That's what we need.

So what we're going to do is use the little link we left ourselves here to go visit the map, and we're going to pay attention to it. We know we want something in the classification realm, but what we're going to do is start from the top and follow it through.

So, start: do we have above 50 samples? Well, let's check. Each row is a sample, so we want len(heart_disease): 303. Yes, we do. Are we predicting a category? Well, heart disease or not heart disease sounds like a category to me, so yes. Do we have labeled data? Yes. Do we have under 100K samples? Yes. And what's this? We've reached a little green box. Beautiful. So remember, the green boxes are estimators, a.k.a. machine learning algorithms. Let's have a look. This is telling us our problem is classification, and the first one we come to is Linear SVC.

All right: "SVC and LinearSVC are classes capable of performing multi-class classification on a dataset." Well, the map particularly said Linear SVC, so let's have a look at that one. "LinearSVC: Linear Support Vector Classification. Similar to SVC with parameter kernel='linear', but implemented in terms of liblinear rather than libsvm." That's a lot of terms, and most of them I don't really understand. But what's important here is that if we go down to the code example, we can see how it's actually used.
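For reference, the docs example looks something like this (a minimal sketch in the spirit of the scikit-learn documentation, using a toy dataset from make_classification rather than our heart disease data):

```python
# Sketch of the scikit-learn docs example for LinearSVC,
# on a made-up toy dataset (not our heart disease data).
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification

X, y = make_classification(n_features=4, random_state=0)  # toy X and y
clf = LinearSVC(random_state=0)  # the classifier, abbreviated clf
clf.fit(X, y)  # learn patterns between X and y
```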
From sklearn.svm they import LinearSVC, they've used X and y, and then clf (classifier) equals LinearSVC. All right, this is kind of similar to what we've been doing. Well, let's not take the documentation's word for it, let's try it for ourselves.

Actually, we'll leave ourselves a note first: consulting the map, it says to try LinearSVC. So that will be our first port of call: import the LinearSVC estimator class. We go from sklearn.svm, and really what I'm doing here is basically just rewriting that docs snippet for our problem. So: from sklearn.svm import LinearSVC. Wonderful. Set up a random seed, np.random.seed(), and we'll use our faithful 42.

And then we're going to make the data. We saw this in the previous section, section 1, get your data ready: X = heart_disease.drop("target", axis=1), because we want to get rid of the target column, and y = heart_disease["target"], because y is actually the target column. Beautiful.

And then we're going to split the data: X_test... no, train comes first. I may have done this about 100 times now and I'm still getting it wrong. See, this is what it takes. It takes a little bit of practice, or really a lot of practice, to become a machine learning practitioner: knowing how to deal with data, knowing how to split data, knowing how to model data. It's all part of the practice. test_size=0.2... yes, that is correct. That's what we want to do.

Instantiate LinearSVC. We'll call it clf, short for classifier: clf = LinearSVC(). That'll do. Actually, next is clf.fit: we want X_train and y_train. And then we're going to evaluate the LinearSVC: clf.score(X_test, y_test). All right, beautiful, let's see this in action.

Wonderful... ConvergenceWarning: Liblinear failed to converge. Increase the number of iterations. Well, this is one of those things where, if you ran into an error like this, what I might do is google the warning message and see what comes up. Because I've had a little bit of experience with it, I might do something like max_iter=1000... and it still happens. Right. So, increase the number of iterations: we've done that, we've gone max_iter=1000, but that's actually where it starts, max_iter defaults to 1,000. Let's try 10,000... still there.
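Put together, the cell at this point looks something like this (a sketch of what we've typed so far; the CSV filename is an assumption, adjust it to wherever your copy of the data lives):

```python
import numpy as np
import pandas as pd
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

# Set up a random seed (our faithful 42) so the split is reproducible
np.random.seed(42)

# Assumed filename for the dataset we imported above
heart_disease = pd.read_csv("heart-disease.csv")

# Make the data: features X (everything but target) and labels y
X = heart_disease.drop("target", axis=1)
y = heart_disease["target"]

# Split the data (train comes first!)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Instantiate and fit LinearSVC, bumping max_iter up from its default of 1,000
clf = LinearSVC(max_iter=10000)
clf.fit(X_train, y_train)

# Evaluate: mean accuracy on the test data and labels
clf.score(X_test, y_test)
```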
Well, let's bypass that warning for the time being, because see here: the score is below 0.5. Now, why is that a trigger in our heads? Well, because if we look at our data, we want heart_disease["target"].value_counts(), and we see there are only two classes, 1 and 0: does someone have heart disease or not? Right, so it's a binary classification problem, binary meaning one or the other, and our model is returning a score of 0.47. score() returns the mean accuracy on the given test data and labels; as I said, we'll look into evaluation metrics in a future section, but what you can imagine is that our model is only operating at 47 percent accuracy. And why does that trigger us? Well, because there are only two classes. That means if we were just guessing whether someone had heart disease or not, not even looking at their health attributes, just guessing yes or no, we would get about 50 percent. It's basically a coin toss. So that's triggering something in our heads, saying: hey, without fixing this warning, or without improving our model by choosing better hyperparameters, our LinearSVC model might not be finding patterns between X and y.
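That coin-toss check is quick to run (the exact counts depend on your copy of the data, but the two classes here are roughly balanced):

```python
# Only two classes, 1 (heart disease) and 0 (no heart disease),
# so a model that just guessed would land around 50% accuracy.
heart_disease["target"].value_counts()
```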
So what can we do? Well, when in doubt, come back to the graphic. Not working? We just tried LinearSVC, so we take the "not working" path. Text data? No. It's going to say KNeighborsClassifier, but I'm going to skip this one and go straight to ensemble classifiers. So let's click on this. Now, why this one? Because we've seen it before: ensemble methods, forests of randomized trees, sklearn.ensemble. From sklearn.ensemble import RandomForestClassifier.

Now, if you remember back up in our regression problem, when we switched to using a RandomForestRegressor, we saw a bump in the score. Well, because RandomForestRegressor did so well, the good news for us is that it's got a dance partner called RandomForestClassifier. We've actually seen this one before too, but what we're going to do is compare it to LinearSVC. So, to save time, I'm going to copy this code and bring it down here. We want to change this to RandomForestClassifier: from sklearn.ensemble import RandomForestClassifier. Wonderful. We can keep the rest the same, except we need to sub in RandomForestClassifier where LinearSVC was. We'll do the same with the note here, just to tidy things up and keep our code communicating what it does. Wonderful.

And now, what do you think will happen here? 3, 2, 1... let's see. Error: fit() missing one required argument, 'y'. Oh, there we go, that's what we've forgotten: we forgot to add the little brackets on the end of RandomForestClassifier(). Wonderful. Now, to get rid of this warning, we need to change n_estimators. As we can see here, the default value of n_estimators will change from 10 in version 0.22 to 100. So we'll just get rid of it by setting n_estimators=100. Delicious. Wonderful.

So what's happened here? We've used a different model. LinearSVC scored 47 percent accuracy, and our RandomForestClassifier has scored 85 percent accuracy, so nearly double, just by using another model. Right? So you can see why I kind of jumped straight to using random forests.
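Side by side, the random forest version of the cell looks something like this (a sketch reusing X and y from above; your exact scores will vary a little with the split, but here it came out around 0.85 versus 0.47 for LinearSVC):

```python
from sklearn.ensemble import RandomForestClassifier

np.random.seed(42)

# Same split as before, different estimator
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Note the brackets: assigning the class without them
# (clf = RandomForestClassifier) is what caused the fit()
# "missing argument" error above. n_estimators=100 silences the
# warning about the default changing from 10 to 100 in version 0.22.
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
clf.score(X_test, y_test)  # ~0.85 here vs ~0.47 for LinearSVC
```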
If we go back, all I've done is gone to start and figured out what kind of problem we're trying to solve. You can do this as well. For our regression problem, we were trying to predict a number; for our classification problem, we're trying to predict a label. Do we have labeled data? Yes. Are we predicting a category? Yes. And when our LinearSVC model wasn't working, we went straight to ensemble classifiers.

Now, you might be thinking: what about all the other models? Right, we've kind of skipped all these ones here, and we've only really touched the surface; we've only used this one, ensemble classifiers. And that's a great question. Well, the first reason is mainly time: we could go through and try all of these. And the second reason is that there's a little tidbit in machine learning: if you have structured data, a.k.a. tables or dataframes, use ensemble methods such as random forests. Why? Because they'll perform pretty well if there are patterns to be found. This is a tidbit worth writing down, so I'll put it in the resources section as well.

So, the tidbit is: one, if you have structured data, use ensemble methods; and two, if you have unstructured data, use deep learning or transfer learning. Now, what's an example of structured data? Well, stuff in a table like this. And unstructured data is things like images, audio or text, where you'd use deep learning or transfer learning. Now, we haven't covered deep learning and transfer learning yet, but since we have structured data, a.k.a. things in a dataframe, the tidbit says use ensemble methods. Hence why we've kind of gone: you know what, I'm not going to try any of these, I'm going straight to ensemble classifiers, straight to the random forest, because the random forest is known for its robustness and its ability to find patterns.

Phew, we've covered a lot. We've had a look at this machine learning map, and if you're still looking at it going, holy gosh, what's going on, don't worry. It took me a while to figure it out too. But it really clicked once I realized: hold on, the first step is to just get the data and figure out the main problem we're trying to solve, usually regression or classification, the two most common ones you'll come across. Then start answering the questions, going through the map, and use a little framework, something as simple as this, to start running a machine learning estimator, a.k.a. a machine learning model, on your data and getting some feedback from it quickly. That's the most important part: being a data scientist or machine learning engineer is about reducing the time between your experiments.

So now that we've seen how to choose a model, we're going to get into the next section, which is fitting a model to the data and then using it to make predictions. Because that's really the essence of machine learning, right? Finding patterns in data, and then using those patterns to make predictions on future data a model hasn't seen before. Take a quick break, and I'll see you in the next video.