In this lecture we are going to see the impact of the pooling layer on the number of parameters we have to train, and on the execution time of our CNN model. First we will run the CNN model that we discussed in the last lecture, with the pooling layer. Then we will remove the pooling layer from that model to see its impact on the execution time.

So this is the architecture of the model that we built in the last lecture. First we have the input layer, then a convolutional layer, then a pooling layer. Then we have two dense layers and the output layer. We are going to remove this pooling layer to observe its impact on the execution time.

So the code remains the same. First we have the conv layer, then the pooling layer, then the flatten layer, then the dense layers. We are calling this model model_a. This is the same as the model we built in our last lecture. Then we have a second model in which we are not taking the pooling layer. So first we have the conv layer, then the flatten layer, then two dense layers and one output layer. We are calling this model model_b, and later on we will compare the performance of model A with the performance of model B.

Let's just run this. We can also look at the summary to get an idea of how many parameters our models will start optimizing.
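The two models can be sketched roughly as follows with the Keras Sequential API. The input shape (28x28x1), the filter count (32), and the dense-layer sizes (300, 100, 10) are illustrative assumptions, not taken from the lecture's code; they are chosen so the dense-layer parameter counts land near the figures quoted next.

```python
from tensorflow.keras import layers, models

def build_model(with_pooling: bool) -> models.Sequential:
    """Build the lecture's CNN with or without the pooling layer."""
    m = models.Sequential()
    m.add(layers.Input(shape=(28, 28, 1)))          # assumed input size
    m.add(layers.Conv2D(32, (3, 3), activation="relu"))
    if with_pooling:
        m.add(layers.MaxPooling2D((2, 2)))          # halves height and width
    m.add(layers.Flatten())
    m.add(layers.Dense(300, activation="relu"))     # the large first dense layer
    m.add(layers.Dense(100, activation="relu"))
    m.add(layers.Dense(10, activation="softmax"))   # output layer
    return m

model_a = build_model(with_pooling=True)   # with the pooling layer
model_b = build_model(with_pooling=False)  # without the pooling layer
model_a.summary()
model_b.summary()
```

With these assumed sizes, `model_a.summary()` and `model_b.summary()` show how the flattened conv output feeding the first dense layer drives almost all of the parameter count.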
So here you can see that for the first dense layer of model A we have got around 1.6 million parameters to train. If we compare this with model B, you can see that in our second model, where we do not have any pooling layer, the number of trainable parameters is around 6.5 million. So overall you can say there are four times more trainable parameters in the model without the pooling layer.

And we know that the execution time is directly dependent on the number of parameters that we are going to train. So obviously we can expect that model B will take a lot more time than model A. Let's just compile both of these models, then we will run model A for three epochs, then we will run model B for three epochs, and after that we will compare the execution time for both models. Let's first run model A.

Now we have trained model A, and as you can see, for each epoch the execution time is around 31 seconds, and after the completion of the third epoch we are getting a validation accuracy of around 82 percent, which is about the same as the accuracy on the training data. Now let's run the model without the pooling layer. Let's run this.

So now we have trained our second model as well, and as you can see the execution time for each epoch is around 62 to 63 seconds, whereas for model A the execution time for each epoch was around 30 to 31 seconds.
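A quick back-of-the-envelope check shows where the roughly 4x gap in trainable parameters comes from. The conv output size (26x26x32) and the 300-unit first dense layer are illustrative assumptions that reproduce the ~1.6 million and ~6.5 million figures quoted above:

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    # one weight per input per unit, plus one bias per unit
    return n_inputs * n_units + n_units

conv_h, conv_w, channels = 26, 26, 32   # assumed conv output shape
units = 300                             # assumed first dense layer size

# Without pooling: the dense layer sees every conv activation.
no_pool = dense_params(conv_h * conv_w * channels, units)

# With 2x2 pooling: height and width are halved, so the flattened
# input (and hence the weight matrix) shrinks by roughly a factor of 4.
with_pool = dense_params((conv_h // 2) * (conv_w // 2) * channels, units)

print(no_pool)              # about 6.5 million
print(with_pool)            # about 1.6 million
print(no_pool / with_pool)  # roughly 4
```

Because a 2x2 pool halves both spatial dimensions, the flattened input to the dense layer, and therefore its weight count, drops by about a factor of four.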
So the execution time is almost double for model B, that is, the model without the pooling layer, as compared to the model with the pooling layer.

You can also look at the accuracy score on the training set: the accuracy of model B is higher, while on the validation set the accuracy is almost the same for both models. This is because when we pool four pixels into one pixel there is some information loss, and that information loss results in slightly lower accuracy. So if you are using a pooling layer, your execution time will be less, and the accuracy will also be a little less, as compared to a model without a pooling layer.

In this case we have only used one convolutional layer, but in a real-life scenario you may have to use multiple convolutional layers, and in such cases the use of pooling layers becomes much more important. So using a pooling layer you can significantly reduce your execution time without impacting the accuracy much.
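The "four pixels into one pixel" information loss can be seen directly on a tiny example. This is a plain NumPy sketch of 2x2 max pooling, not the Keras layer itself: each 2x2 block keeps only its maximum, so the other three values cannot be recovered.

```python
import numpy as np

def max_pool_2x2(x: np.ndarray) -> np.ndarray:
    """2x2 max pooling on a 2D array with even dimensions."""
    h, w = x.shape
    # split the array into 2x2 blocks and take the maximum of each block
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 5],
              [6, 1, 0, 2],
              [3, 8, 4, 1]])

pooled = max_pool_2x2(x)
print(pooled)                # only 4 of the 16 values survive
print(pooled.size / x.size)  # 0.25: the 4x reduction behind the timing gap
```

The output keeps one value per block, which is exactly why the downstream dense layer shrinks by a factor of four, and why a small amount of accuracy can be lost along the way.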