In this lecture we are going to see the impact of the pooling layer on the number of parameters we have to train, and on the execution time of our CNN model. First we will run the CNN model that we discussed in the last lecture, with the pooling layer. Then we will remove the pooling layer from that model to see its impact on the execution time.

So this is the architecture of the model that we built in the last lecture. First we have the input layer, then a convolutional layer, then a pooling layer. Then we have two dense layers and the output layer. We are going to remove this pooling layer to observe its impact on the execution time.

So the code remains the same. First we have the conv layer, then the pooling layer, then the flatten layer, then the dense layers. We are calling this model model_a. This is the same as the model we built in our last lecture. Then we have a second model in which we are not taking the pooling layer. So first we have the conv layer, then the flatten layer, then two dense layers and one output layer. We are calling this model model_b, and later on we will compare the performance of model A with the performance of model B.

Let's just run this. We can also look at the summary to get an idea of how many parameters our models will start optimizing.
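The two models can be sketched roughly as follows with the Keras Sequential API. The input shape (28x28x1), the filter count (32), and the dense-layer sizes (300, 100, 10) are illustrative assumptions, not taken from the lecture's code; they are chosen so the dense-layer parameter counts land near the figures quoted next.

```python
from tensorflow.keras import layers, models

def build_model(with_pooling: bool) -> models.Sequential:
    """Build the lecture's CNN with or without the pooling layer."""
    m = models.Sequential()
    m.add(layers.Input(shape=(28, 28, 1)))          # assumed input size
    m.add(layers.Conv2D(32, (3, 3), activation="relu"))
    if with_pooling:
        m.add(layers.MaxPooling2D((2, 2)))          # halves height and width
    m.add(layers.Flatten())
    m.add(layers.Dense(300, activation="relu"))     # the large first dense layer
    m.add(layers.Dense(100, activation="relu"))
    m.add(layers.Dense(10, activation="softmax"))   # output layer
    return m

model_a = build_model(with_pooling=True)   # with the pooling layer
model_b = build_model(with_pooling=False)  # without the pooling layer
model_a.summary()
model_b.summary()
```

With these assumed sizes, `model_a.summary()` and `model_b.summary()` show how the flattened conv output feeding the first dense layer drives almost all of the parameter count.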
So here you can see that for the first dense layer of model A we have got around 1.6 million parameters to train. If we compare this with model B, you can see that in our second model, where we do not have any pooling layer, the number of trainable parameters is around 6.5 million. So overall you can say there are four times more trainable parameters in the model without the pooling layer.

And we know that the execution time is directly dependent on the number of parameters that we are going to train. So obviously we can expect that model B will take a lot more time than model A. Let's just compile both of these models, then we will run model A for three epochs, then we will run model B for three epochs, and after that we will compare the execution time for both models. Let's first run model A.

Now we have trained model A, and as you can see, for each epoch the execution time is around 31 seconds, and after the completion of the third epoch we are getting a validation accuracy of around 82 percent, which is about the same as the accuracy on the training data. Now let's run the model without the pooling layer. Let's run this.

So now we have trained our second model as well, and as you can see the execution time for each epoch is around 62 to 63 seconds, whereas for model A the execution time for each epoch was around 30 to 31 seconds.
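A quick back-of-the-envelope check shows where the roughly 4x gap in trainable parameters comes from. The conv output size (26x26x32) and the 300-unit first dense layer are illustrative assumptions that reproduce the ~1.6 million and ~6.5 million figures quoted above:

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    # one weight per input per unit, plus one bias per unit
    return n_inputs * n_units + n_units

conv_h, conv_w, channels = 26, 26, 32   # assumed conv output shape
units = 300                             # assumed first dense layer size

# Without pooling: the dense layer sees every conv activation.
no_pool = dense_params(conv_h * conv_w * channels, units)

# With 2x2 pooling: height and width are halved, so the flattened
# input (and hence the weight matrix) shrinks by roughly a factor of 4.
with_pool = dense_params((conv_h // 2) * (conv_w // 2) * channels, units)

print(no_pool)              # about 6.5 million
print(with_pool)            # about 1.6 million
print(no_pool / with_pool)  # roughly 4
```

Because a 2x2 pool halves both spatial dimensions, the flattened input to the dense layer, and therefore its weight count, drops by about a factor of four.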
So the execution time is almost double for model B, that is, the model without the pooling layer, as compared to the model with the pooling layer.

You can also look at the accuracy score on the training set: the accuracy of model B is higher, while on the validation set the accuracy is almost the same for both models. This is because when we pool four pixels into one pixel there is some information loss, and that information loss results in slightly lower accuracy. So if you are using a pooling layer, your execution time will be less, and the accuracy will also be a little less, as compared to a model without a pooling layer.

In this case we have only used one convolutional layer, but in a real-life scenario you may have to use multiple convolutional layers, and in such cases the use of pooling layers becomes much more important. So using a pooling layer you can significantly reduce your execution time without impacting the accuracy much.
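The "four pixels into one pixel" information loss can be seen directly on a tiny example. This is a plain NumPy sketch of 2x2 max pooling, not the Keras layer itself: each 2x2 block keeps only its maximum, so the other three values cannot be recovered.

```python
import numpy as np

def max_pool_2x2(x: np.ndarray) -> np.ndarray:
    """2x2 max pooling on a 2D array with even dimensions."""
    h, w = x.shape
    # split the array into 2x2 blocks and take the maximum of each block
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 5],
              [6, 1, 0, 2],
              [3, 8, 4, 1]])

pooled = max_pool_2x2(x)
print(pooled)                # only 4 of the 16 values survive
print(pooled.size / x.size)  # 0.25: the 4x reduction behind the timing gap
```

The output keeps one value per block, which is exactly why the downstream dense layer shrinks by a factor of four, and why a small amount of accuracy can be lost along the way.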