1 00:00:00,090 --> 00:00:03,030 All right, next problem here, this is a little different. 2 00:00:03,150 --> 00:00:09,480 So as I mentioned earlier, a real problem on Instagram are bot bot accounts that basically exists to 3 00:00:09,480 --> 00:00:15,180 go around and like a bunch of stuff or to comment on people's things but not create their own content. 4 00:00:15,180 --> 00:00:16,260 They're not real people. 5 00:00:16,290 --> 00:00:18,630 Oftentimes they're commenting the exact same thing. 6 00:00:18,720 --> 00:00:24,330 Or like if I post a photo, I'll notice I get like ten likes immediately, like the next second. 7 00:00:24,330 --> 00:00:29,520 And usually there's not ten people who are happening to like it immediately. 8 00:00:29,520 --> 00:00:34,950 There's some sort of program that's happening behind the scenes where it's looking for new posts and 9 00:00:34,950 --> 00:00:36,120 liking them immediately. 10 00:00:36,270 --> 00:00:39,810 So in our hypothetical Instagram clone, we have the same problem. 11 00:00:39,810 --> 00:00:45,300 So what we want to do is just find all the users who we think could be bot their users who have never 12 00:00:45,370 --> 00:00:48,570 excuse me, their users who have liked every single photo on the site. 13 00:00:48,720 --> 00:00:52,170 So if we have 257 photos, they've liked all of them. 14 00:00:52,170 --> 00:00:54,570 If we have 800 photos, they've liked all of them. 15 00:00:54,780 --> 00:00:59,340 And I'm going to show you a way of doing it that actually uses something we haven't seen yet, something 16 00:00:59,340 --> 00:00:59,880 new. 17 00:01:00,420 --> 00:01:05,010 So go ahead and give it your own shot and then we'll have a solution. 18 00:01:05,640 --> 00:01:10,430 So the first thing we want to do, we know it's going to be a joint between users and likes. 19 00:01:10,440 --> 00:01:20,070 So we can start with this to select star from users and we'll do an inner join likes. 20 00:01:20,070 --> 00:01:25,470 And the reason we know it's an inner join because we don't care about users who don't have any corresponding 21 00:01:25,470 --> 00:01:27,750 likes or likes that don't have a corresponding user. 22 00:01:27,750 --> 00:01:28,920 Those shouldn't exist. 23 00:01:29,100 --> 00:01:32,490 So we want the inner join, we want the overlap only. 24 00:01:32,490 --> 00:01:41,190 So select star from users inner join likes on users ID equals likes dot user ID and if we start with 25 00:01:41,190 --> 00:01:41,700 that. 26 00:01:43,370 --> 00:01:44,210 Take a moment. 27 00:01:44,840 --> 00:01:48,410 You've got, remember 8782 legs. 28 00:01:49,250 --> 00:01:54,470 And now what we want to do is group them based off of the user who did the liking. 29 00:01:54,980 --> 00:01:59,720 So we'll have a group by and we have a bunch of choices of how we group them. 30 00:01:59,870 --> 00:02:05,480 We could do users state ID or we can do like user ID. 31 00:02:06,830 --> 00:02:07,790 We'll save. 32 00:02:09,270 --> 00:02:13,680 Okay, so now we have 77 rows here. 33 00:02:13,700 --> 00:02:17,240 And we can see there's a bunch of information. 34 00:02:17,240 --> 00:02:18,710 We've got the username. 35 00:02:19,100 --> 00:02:20,300 We don't need created that. 36 00:02:20,300 --> 00:02:21,720 So let's slim this down a bit. 37 00:02:21,740 --> 00:02:29,120 Let's do select username user ID and we'll just keep it at that. 38 00:02:32,430 --> 00:02:36,960 So these are all the users who have liked anything at all. 39 00:02:37,110 --> 00:02:38,550 We're grouping them, right? 40 00:02:38,550 --> 00:02:42,100 So they're grouped by user ID, missing a comma. 41 00:02:42,120 --> 00:02:43,170 That's what's going on. 42 00:02:45,060 --> 00:02:47,670 Now what we want to do is figure out how many. 43 00:02:47,940 --> 00:02:52,890 So that's just a matter of using Count Star, which you should feel pretty good about. 44 00:02:52,890 --> 00:02:54,840 Now, we'll give it an alias right now. 45 00:02:58,350 --> 00:03:01,680 And we can get rid of user ID as well and just do a username. 46 00:03:03,570 --> 00:03:04,110 Okay. 47 00:03:04,470 --> 00:03:10,020 So we can see that we have Andre Purdy, 85, has 94 likes. 48 00:03:10,290 --> 00:03:12,960 Let's give this actually numb likes. 49 00:03:13,980 --> 00:03:18,660 But the question was not just to figure out who has the most, but to figure out the people who have 50 00:03:18,660 --> 00:03:20,940 liked every single photo. 51 00:03:21,210 --> 00:03:25,590 And if you don't remember how many, we have 257. 52 00:03:26,310 --> 00:03:27,990 But we're not talking about cheating. 53 00:03:28,590 --> 00:03:32,430 We could cheat if we wanted to check exactly 4 to 57. 54 00:03:33,150 --> 00:03:41,400 It's not too bad, but you might think you could just do like a where where num likes equals 257, but 55 00:03:41,400 --> 00:03:42,960 that won't actually work. 56 00:03:43,620 --> 00:03:50,850 And that's because where clauses actually go before the group by telling it what the data you'd like 57 00:03:50,850 --> 00:03:56,310 to select two group versus what you'd like to select from the group's data. 58 00:03:56,310 --> 00:04:01,500 So where doesn't work, we need to use something different, which we haven't seen, which is called 59 00:04:01,500 --> 00:04:06,390 having and having X like where it's just a different word. 60 00:04:06,750 --> 00:04:13,110 And what it will do is take our groups data, our end result, and allow us to filter based off of a 61 00:04:13,110 --> 00:04:20,279 clause so we can say having and we want to say total equals 257 just like that. 62 00:04:22,000 --> 00:04:23,140 So now if we do it. 63 00:04:24,140 --> 00:04:25,000 Oh total. 64 00:04:25,010 --> 00:04:28,580 I changed the name to numb likes, but now if I do it, you can see. 65 00:04:28,580 --> 00:04:33,740 Okay, these are the people who have liked it 257 times, like every single post. 66 00:04:33,740 --> 00:04:36,770 But the problem again is be hardcoded to 57. 67 00:04:37,190 --> 00:04:39,470 So we could fix that very easily. 68 00:04:39,470 --> 00:04:46,550 We want this to be dynamic however many photos we have and we can just use a subquery to select count 69 00:04:47,030 --> 00:04:50,720 star from photos. 70 00:04:50,720 --> 00:04:56,450 So this will be where the number of things they've liked is equal to the total number of our photos. 71 00:04:58,280 --> 00:04:58,910 And there we go. 72 00:04:58,910 --> 00:05:00,110 We get the same exact thing. 73 00:05:00,110 --> 00:05:07,100 But this time, if I add 1000 photos or remove ten photos, this will still always be the total number 74 00:05:07,100 --> 00:05:07,820 of photos. 75 00:05:08,300 --> 00:05:13,520 Okay, kind of an ugly query with a subquery here, but if you came up with something different, post 76 00:05:13,520 --> 00:05:13,850 it. 77 00:05:13,850 --> 00:05:14,750 We'll have a discussion.