1 00:00:00,790 --> 00:00:10,270 In the last video we saw how we can do web scraping, and we combined some countries in one table, 2 00:00:10,690 --> 00:00:13,460 like the India, Pakistan and US populations. 3 00:00:13,600 --> 00:00:16,400 So in this video, we will do the second part. 4 00:00:17,500 --> 00:00:19,510 So let me show you the website again. 5 00:00:20,620 --> 00:00:24,880 So here you can see we have the populations of different countries. 6 00:00:24,880 --> 00:00:25,240 Right. 7 00:00:25,900 --> 00:00:37,960 So, for example, if I click China, then I can see the different cities and the population 8 00:00:37,960 --> 00:00:49,150 of each city. Now the requirement is that we need to go to each country and then we have to calculate the 9 00:00:49,150 --> 00:00:55,570 average population in that country according to its cities. 10 00:00:55,600 --> 00:01:04,150 So we will take the average of the populations of all the cities, and then we will display that 11 00:01:04,150 --> 00:01:06,220 average next to the country. 12 00:01:06,580 --> 00:01:06,950 Right. 13 00:01:07,690 --> 00:01:10,210 So let's do it step by step. 14 00:01:11,230 --> 00:01:19,160 So for this, we need to copy the China population link and then go to the Power Query 15 00:01:19,240 --> 00:01:29,470 Editor. Here we have to click New Source and Web, and let me paste the link and click 16 00:01:29,470 --> 00:01:33,580 OK. So here we have to choose Table 6, 17 00:01:35,230 --> 00:01:43,570 because it contains all the cities for the selected country. This is China, so I need to click it and 18 00:01:43,570 --> 00:01:52,090 then I click OK. We are interested in the population, so we need to make a list of the populations. 19 00:01:52,150 --> 00:01:56,190 So how can we make a list out of the table? 20 00:01:56,950 --> 00:02:01,180 So for this, we need to click the fx button.
21 00:02:03,020 --> 00:02:09,950 And here we have to write a square bracket and the name of the column, that is Population. 22 00:02:13,110 --> 00:02:15,620 That's it, and click Confirm. 23 00:02:16,950 --> 00:02:22,080 So this is a list that contains the population for the selected country. 24 00:02:23,910 --> 00:02:26,880 So now I need the average. 25 00:02:26,910 --> 00:02:28,620 So how will we get the average? 26 00:02:28,620 --> 00:02:29,880 Because this is a list, 27 00:02:30,390 --> 00:02:36,690 we have to use a list function, that is List.Average. 28 00:02:42,960 --> 00:02:43,680 It started. 29 00:02:48,860 --> 00:02:54,320 And we need to close the bracket and then we have to confirm. 30 00:02:54,950 --> 00:02:57,050 So this is the average population per city. 31 00:02:57,410 --> 00:03:01,730 Let's go to View and open the Advanced Editor. 32 00:03:01,760 --> 00:03:08,170 So we need to write a function that will take the query string, like here it is China. 33 00:03:08,780 --> 00:03:13,730 And if I write here, like, India 34 00:03:17,450 --> 00:03:27,920 and then click Done, here you can see it is one seventy two point seven to nine seven two three. 35 00:03:27,950 --> 00:03:38,240 So we need to write a function. Go to the Advanced Editor, and here we have to write one function 36 00:03:38,240 --> 00:03:38,540 here. 37 00:03:41,820 --> 00:03:55,590 Just like we did in the prior video, so it is like countryName as text, and what we have to return 38 00:03:56,340 --> 00:03:57,480 is as number. 39 00:04:00,420 --> 00:04:04,020 And then the last line, that's it. 40 00:04:06,210 --> 00:04:07,200 And we have to click. 41 00:04:07,550 --> 00:04:12,260 OK, so we need to use the countryName parameter here. 42 00:04:12,290 --> 00:04:17,450 We don't need to write this thing manually every time, this part. 43 00:04:18,540 --> 00:04:19,860 So it will be 44 00:04:23,160 --> 00:04:28,960 the ampersand sign, paste it, and 45 00:04:32,160 --> 00:04:33,300 so that's it.
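For reference, the function described in this part of the video might look roughly like the following sketch in Power Query M. Only Table 6, the Population column, List.Average and the ampersand concatenation come from the walkthrough; the base URL, the step names, the parameter name countryName and the number conversion are assumptions made for illustration.

```powerquery
// Sketch only: the base URL, step names and the parameter name countryName
// are assumptions; adjust them to the page you are actually scraping.
(countryName as text) as number =>
let
    // Build the page address by appending the country part with "&"
    Url = "https://www.worldometers.info/world-population/" & countryName & "/",
    Source = Web.Page(Web.Contents(Url)),
    // Table 6 held the city list in the video; the index may differ on the live page
    Cities = Source{6}[Data],
    // Square brackets after a table step turn one column into a list
    PopulationList = Cities[Population],
    // Defensive assumption: scraped values may arrive as text such as "1,234"
    Numbers = List.Transform(PopulationList, each Number.From(_, "en-US")),
    Average = List.Average(Numbers)
in
    Average
```

Saved as a query, a function like this can be invoked with a value such as "china-population" to return one average per call.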
46 00:04:34,780 --> 00:04:36,930 We need to click Done. 47 00:04:42,250 --> 00:04:58,150 So now we can write here, first I will write here china-population and click Invoke, and here you can see 48 00:04:58,540 --> 00:05:03,100 the population of China, and in the same way, if I do 49 00:05:06,190 --> 00:05:06,730 US 50 00:05:09,820 --> 00:05:11,540 and click confirm. 51 00:05:11,890 --> 00:05:14,580 So this is the population of the US, right. 52 00:05:15,070 --> 00:05:27,580 So the function is ready, and now we need to make a table that will contain all the query strings so 53 00:05:27,580 --> 00:05:30,780 that we don't need to write everything manually. 54 00:05:31,570 --> 00:05:35,790 So how will we create that table? For this, 55 00:05:35,800 --> 00:05:47,020 I need to go to the website again, and this time I need to take this link, the population 56 00:05:47,020 --> 00:05:50,170 by country, so that I can get all the countries here. 57 00:05:51,160 --> 00:05:52,060 So copy. 58 00:05:52,250 --> 00:06:00,400 And then we have to go back and click New Source and Web. 59 00:06:03,550 --> 00:06:13,210 And paste it and click OK, because our function is ready and now we need to prepare one table that will 60 00:06:13,210 --> 00:06:17,770 have all the parameters that we were passing manually. 61 00:06:18,430 --> 00:06:23,890 So we will prepare a table that will contain all those things. So go to Table 1. 62 00:06:24,820 --> 00:06:28,350 So this table contains the country list. 63 00:06:28,510 --> 00:06:29,920 So this is the required table. 64 00:06:30,480 --> 00:06:31,390 We just need to click it. 65 00:06:32,020 --> 00:06:32,530 Yes. 66 00:06:33,010 --> 00:06:33,770 And click OK. 67 00:06:41,610 --> 00:06:46,870 So we are interested in this column. 68 00:06:47,490 --> 00:06:50,640 So we need to right-click and choose Remove Other Columns.
69 00:06:52,170 --> 00:07:00,540 And here, if you look, you can see India or the United States, but in the link it is us-population or 70 00:07:00,540 --> 00:07:01,470 india-population. 71 00:07:01,480 --> 00:07:05,430 So we need to create a new column here. 72 00:07:08,020 --> 00:07:19,630 Add Column, and then I click Custom Column, and here I have to write, for example, 73 00:07:22,240 --> 00:07:23,160 CountryLink. 74 00:07:23,920 --> 00:07:26,590 This is the column name. 75 00:07:27,880 --> 00:07:38,380 And here, the first thing we need to handle is if there is any space, like in United States or any country like 76 00:07:39,910 --> 00:07:47,920 United Kingdom, and we have seen that most of the time, if there is a space, they have replaced the 77 00:07:47,930 --> 00:07:50,170 space with a dash. 78 00:07:50,170 --> 00:07:53,880 Here you can see the first letter of every country is capital. 79 00:07:53,890 --> 00:07:55,860 So we need to make everything small letters. 80 00:07:56,350 --> 00:07:57,540 So how can we do this? 81 00:07:58,390 --> 00:08:06,640 There is a function called Text.Lower, and 82 00:08:09,620 --> 00:08:11,460 we have to close it, right. 83 00:08:13,310 --> 00:08:21,290 And then we have to write here one more function, like replace. 84 00:08:25,750 --> 00:08:33,520 We need to put some space here so that IntelliSense works, so it is Text.Replace. 85 00:08:33,520 --> 00:08:43,060 So it will take a couple of arguments; the first one is the original string here. 86 00:08:43,060 --> 00:08:48,040 The second parameter is the old text, that is, the text we want to replace. 87 00:08:48,340 --> 00:08:52,960 So this is the space, and the third parameter will be the dash. 88 00:08:55,180 --> 00:08:56,480 So this is one thing. 89 00:08:56,500 --> 00:08:59,320 So now we have handled the spaces.
90 00:08:59,380 --> 00:09:07,510 We have a dash if there is any space in the country name, and all the letters will be small, and then 91 00:09:07,510 --> 00:09:10,540 we have to write one more ampersand sign 92 00:09:12,780 --> 00:09:17,700 and dash population, like india-population, and here you can see, 93 00:09:20,000 --> 00:09:30,750 -population in small letters, so wherever there is a space, it will be a dash. 94 00:09:32,180 --> 00:09:33,200 So that's it. 95 00:09:33,920 --> 00:09:38,240 So now we need to click OK. 96 00:09:40,930 --> 00:09:48,990 So here you can see this is the new column that contains the country link, like china-population, india-population, 97 00:09:49,000 --> 00:09:55,300 and if you see, for instance, Saudi Arabia, it is saudi-arabia. 98 00:09:55,510 --> 00:09:57,240 So you can see if there is a space, 99 00:09:58,090 --> 00:10:00,790 we have a dash instead of the space. 100 00:10:00,790 --> 00:10:01,080 Right. 101 00:10:02,470 --> 00:10:08,810 So we are interested in this column, CountryLink. 102 00:10:09,610 --> 00:10:16,170 So we need to remove this other column because we don't need it, so remove it. 103 00:10:16,720 --> 00:10:18,160 And here you can see the country. 104 00:10:18,370 --> 00:10:22,040 So we just need to delete this Invoked Function. 105 00:10:22,270 --> 00:10:26,930 This is the function that is returning us the average value. 106 00:10:27,730 --> 00:10:36,710 So here you can see it is named like fnCityAverage. 107 00:10:37,150 --> 00:10:38,680 So this is the name of the function. 108 00:10:40,750 --> 00:10:44,080 And here the table name is Countries. 109 00:10:46,120 --> 00:10:50,560 This is the name of the table that contains all the parameters. 110 00:10:51,250 --> 00:10:56,620 So now we need to run this function for every row. 111 00:10:57,160 --> 00:10:57,530 Right. 112 00:10:57,790 --> 00:11:01,780 So this fnCityAverage will run for this one, this one. 113 00:11:01,780 --> 00:11:02,140 This one.
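The custom column formula built in this part can be sketched as a small standalone query. The functions Text.Lower and Text.Replace and the "-population" suffix come from the walkthrough; the inline sample table and the column names Country and CountryLink are assumptions standing in for the scraped country list.

```powerquery
let
    // Hypothetical sample rows standing in for the scraped country table
    Countries = #table(
        type table [Country = text],
        {{"China"}, {"India"}, {"Saudi Arabia"}}
    ),
    // Lowercase the name, swap spaces for dashes, then append "-population"
    WithLink = Table.AddColumn(
        Countries,
        "CountryLink",
        each Text.Replace(Text.Lower([Country]), " ", "-") & "-population",
        type text
    ),
    // Keep only the link column, since it is the one passed to the function
    Result = Table.SelectColumns(WithLink, {"CountryLink"})
in
    Result
```

With these sample rows the CountryLink column holds china-population, india-population and saudi-arabia-population.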
114 00:11:02,140 --> 00:11:06,740 And then it will fetch the value for every row. 115 00:11:07,040 --> 00:11:07,390 Right. 116 00:11:08,410 --> 00:11:16,750 So for this, we need to click here on Add Column, 117 00:11:16,750 --> 00:11:20,110 and here is Invoke Custom Function. 118 00:11:22,340 --> 00:11:34,940 So we need to click here, and here you can write a column name, like Country Name, and 119 00:11:37,580 --> 00:11:39,880 then we need to choose the function query. 120 00:11:40,280 --> 00:11:47,120 So this is the function, fnCityAverage, and then the column from the table, like CountryLink. 121 00:11:47,150 --> 00:11:54,830 So we have to choose the column from that table and then we have to click OK. 122 00:11:58,780 --> 00:12:05,410 Before clicking OK: because it's a very, very big table, it contains like 235 rows, and 123 00:12:05,410 --> 00:12:07,230 it will take a lot of time. 124 00:12:08,410 --> 00:12:19,060 So we need to limit it; we can take a maximum of, like, ten records for this demo, because we don't 125 00:12:19,060 --> 00:12:25,130 need to wait like three or four minutes, depending on the speed of your Internet and your computer. 126 00:12:25,940 --> 00:12:32,740 So for this, we have to write here one more function. 127 00:12:32,740 --> 00:12:37,240 Here it is Table.FirstN. 128 00:12:37,240 --> 00:12:40,360 And so, 129 00:12:43,720 --> 00:12:50,620 like, we are interested in 15 countries, the first 15 countries. 130 00:12:51,130 --> 00:12:57,430 And then we have to click here, Confirm. We need to click Confirm. Here 131 00:12:57,430 --> 00:12:59,410 you can see we got 15 countries. 132 00:12:59,410 --> 00:13:07,570 So now we need to click Invoke Custom Function, and here we have to write the name of the column. 133 00:13:07,690 --> 00:13:14,710 So Country Name, and then the function 134 00:13:14,710 --> 00:13:17,440 query is fnCityAverage.
135 00:13:17,890 --> 00:13:19,810 This is the name of the column, CountryLink. 136 00:13:20,830 --> 00:13:25,140 And then we have to click OK. We got this result. 137 00:13:25,390 --> 00:13:32,760 So here you can see we are getting the average population per city for each country. 138 00:13:33,220 --> 00:13:34,570 So you have seen here 139 00:13:36,390 --> 00:13:41,100 we got an error message for the United States population. 140 00:13:41,670 --> 00:13:52,740 The reason is, if we go to the website and to the United States, if you hover over 141 00:13:52,740 --> 00:14:01,050 the United States, you will see it is us-population, but in the Power Query 142 00:14:01,480 --> 00:14:07,470 Editor here, the name of the query string is united-states-population. 143 00:14:07,800 --> 00:14:10,150 That's why we are getting this error message. 144 00:14:10,470 --> 00:14:21,960 If you see here, if you click here, you will see the link, worldometers.info, and then 145 00:14:22,080 --> 00:14:23,270 united-states-population. 146 00:14:23,790 --> 00:14:27,270 So it is unable to read it because this link does not exist. 147 00:14:27,540 --> 00:14:30,230 So that's why you are getting the 404 error. 148 00:14:31,050 --> 00:14:34,590 So you can change it, or you can just replace it: 149 00:14:34,590 --> 00:14:43,080 if there is united-states-population, then it should be replaced with 150 00:14:43,200 --> 00:14:43,880 us-population. 151 00:14:43,980 --> 00:14:49,890 So you have seen how easy it is to do web scraping with the help of Power BI.
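The last steps (limiting the rows, replacing the one mismatched link and invoking the function per row) could be combined roughly as in the sketch below. Table.FirstN and the us-population replacement come from the walkthrough; the function name fnCityAverage, the column names and the inline sample table are assumptions, and the function is assumed to already exist as a separate query in the same file.

```powerquery
let
    // Hypothetical stand-in for the CountryLink table prepared earlier
    Countries = #table(
        type table [CountryLink = text],
        {{"china-population"}, {"india-population"}, {"united-states-population"}}
    ),
    // Replace the one link that does not match the address used by the site
    Fixed = Table.ReplaceValue(
        Countries,
        "united-states-population",
        "us-population",
        Replacer.ReplaceValue,
        {"CountryLink"}
    ),
    // Limit the rows so the demo does not request every country page
    FirstRows = Table.FirstN(Fixed, 15),
    // Call the function once per row; fnCityAverage is an assumed query name
    Invoked = Table.AddColumn(
        FirstRows,
        "CityAverage",
        each fnCityAverage([CountryLink]),
        type number
    )
in
    Invoked
```

Doing the replacement before the invoke step means the corrected link is the one that gets requested, so the row no longer fails with the 404 error.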