1 00:00:00,090 --> 00:00:00,710 All right. 2 00:00:00,720 --> 00:00:01,470 Welcome back. 3 00:00:01,500 --> 00:00:03,710 So we're moving on to our next question here. 4 00:00:03,719 --> 00:00:08,730 Even though this one actually isn't framed as a question, there's no question mark, but that doesn't 5 00:00:08,730 --> 00:00:11,330 make it any less relevant or less real world. 6 00:00:11,340 --> 00:00:16,770 In fact, this is one that I think is a really common operation, a common question to ask, and it 7 00:00:16,770 --> 00:00:19,020 has to do with inactive users. 8 00:00:19,170 --> 00:00:27,300 So across any sort of startup or online app company, there's often users who sign up, create an account 9 00:00:27,300 --> 00:00:32,700 and then do nothing, whether it means never posting or never buying something, never converting from 10 00:00:32,700 --> 00:00:38,850 a free account to a paid account like in the case of Spotify or never in my case. 11 00:00:39,360 --> 00:00:41,520 This is a real example for my courses. 12 00:00:41,520 --> 00:00:45,780 Students who sign up and then never start it, never make it past the first video. 13 00:00:45,810 --> 00:00:52,530 They're at 0% and I'm actually able to send targeted, I think, once a month, twice a month. 14 00:00:52,530 --> 00:00:56,610 I can send an email update and I can target that to specific segments. 15 00:00:56,610 --> 00:01:04,440 So if I updated an introduction section to a course and I only wanted to let students know who haven't 16 00:01:04,440 --> 00:01:08,940 started yet, I want to target them and say, Hey, by the way, I noticed you haven't started yet. 17 00:01:08,940 --> 00:01:10,230 Well, this is a good time to start. 18 00:01:10,230 --> 00:01:12,150 I just revamped the whole intro. 19 00:01:12,180 --> 00:01:15,720 I could do that based off of what we're going to be doing here. 20 00:01:15,750 --> 00:01:19,770 Of course, our data and our database is set up differently, but it's the same idea. 21 00:01:19,770 --> 00:01:25,470 So this is a really common thing and all we're going to try and do is for our Instagram clone, our 22 00:01:25,470 --> 00:01:26,520 fake Instagram. 23 00:01:26,520 --> 00:01:31,950 We're trying to identify users who have never posted a photo, so they've signed up, they never posted 24 00:01:31,950 --> 00:01:36,210 anything, and we want our users to be more engaged for a bunch of reasons. 25 00:01:36,210 --> 00:01:40,590 You know, we want higher engagement in our app, which means that we could have a higher valuation 26 00:01:40,590 --> 00:01:45,210 potentially or sell ads for more, maybe get higher investments, all that kind of stuff. 27 00:01:45,210 --> 00:01:47,520 We want people posting if they're signed up. 28 00:01:47,520 --> 00:01:50,550 It looks bad to have dead accounts, people who sign up and then give up. 29 00:01:50,850 --> 00:01:52,590 So we're going to try and send an email. 30 00:01:53,010 --> 00:01:57,330 We're not worrying about doing that, but how are we going to target these people who have never posted 31 00:01:57,330 --> 00:01:59,790 a photo first step, Let's identify them. 32 00:02:00,180 --> 00:02:05,550 And essentially we're going to need to work with a users table and the photos table. 33 00:02:06,000 --> 00:02:16,200 So I'll add my three here, which is identify inactive users, users with no photos. 34 00:02:18,630 --> 00:02:21,150 So you've probably gathered we need to do a join. 35 00:02:21,150 --> 00:02:28,740 We're going to be working with the users table so we can start there, select start from users and we 36 00:02:28,740 --> 00:02:35,010 have 100 something users, but we can't see who's done what because we have the photos table and if 37 00:02:35,010 --> 00:02:38,130 we do our select star from photos just to jog your memory. 38 00:02:38,250 --> 00:02:40,650 Well, I guess we haven't actually seen all of our data here yet. 39 00:02:40,650 --> 00:02:46,740 We've got about exactly 257 photos, but a lot of them are done by the same user. 40 00:02:46,740 --> 00:02:48,750 So this here is user ID. 41 00:02:49,770 --> 00:02:51,840 If I scroll way up, here we go. 42 00:02:52,080 --> 00:02:53,760 We can see these. 43 00:02:54,060 --> 00:02:55,710 These are all posted by the same user. 44 00:02:55,710 --> 00:02:59,850 So three these are all from the same user, these are all from the same user. 45 00:03:00,090 --> 00:03:05,820 But then we have some users who haven't posted anything, but there's no way of knowing that just by 46 00:03:05,820 --> 00:03:08,880 looking at the photos because these are the posted photos. 47 00:03:08,880 --> 00:03:14,490 So what we need to do is join these two together and hopefully you know what we need to do. 48 00:03:14,490 --> 00:03:22,890 If I just do a regular inner join in or join photos and where do we want to do it? 49 00:03:22,890 --> 00:03:32,220 I want to join on users ID equals photos, dot user ID and if we run this now. 50 00:03:35,790 --> 00:03:36,500 It's a mess. 51 00:03:36,510 --> 00:03:38,790 So let's focus on just the things we need. 52 00:03:39,060 --> 00:03:45,690 So rather than Star can do username, maybe the image URL and we can just start with those two. 53 00:03:47,130 --> 00:03:47,400 All right. 54 00:03:47,400 --> 00:03:51,000 So we can see all these usernames and what they've submitted. 55 00:03:51,000 --> 00:03:55,310 But the problem here is that this doesn't account for the people who haven't submitted anything. 56 00:03:55,320 --> 00:04:00,360 So if there was a user who didn't submit something, they're not showing up here because we're doing 57 00:04:00,360 --> 00:04:06,060 an inner join, which remember is the intersection of both circles, the left and the right where they 58 00:04:06,060 --> 00:04:06,750 overlap. 59 00:04:06,750 --> 00:04:14,070 But what we want is to identify where users on the left have no overlap on the right for photos. 60 00:04:14,070 --> 00:04:16,200 So we can change this to a left join. 61 00:04:16,500 --> 00:04:20,160 And that one change if we rerun this. 62 00:04:21,850 --> 00:04:22,790 We'll have to scroll a bit. 63 00:04:22,810 --> 00:04:23,500 There we go. 64 00:04:23,680 --> 00:04:29,410 You can see we have certain users who have a null over here, and that's referring to the fact that 65 00:04:29,410 --> 00:04:32,260 they have no photos corresponding to them. 66 00:04:32,260 --> 00:04:37,530 So there's a couple of them as I go here, and that's what we want to be able to identify. 67 00:04:37,540 --> 00:04:38,860 So we're almost there. 68 00:04:38,860 --> 00:04:46,210 Actually, all that we need to do is only select the users that have photos of NULL, essentially. 69 00:04:46,210 --> 00:04:50,890 And of course, if I go back to having Star here, it's not just image URL that's null. 70 00:04:51,490 --> 00:04:52,780 It's everything in photos. 71 00:04:52,780 --> 00:04:53,800 It's hard to see. 72 00:04:53,800 --> 00:04:57,280 Here's an example, but we've got a couple of nulls here. 73 00:04:57,280 --> 00:05:04,330 We've got the image or the photo ID, we've got the image URL, we've got the user ID of the photo, 74 00:05:04,330 --> 00:05:06,520 and then we have the created additional. 75 00:05:06,790 --> 00:05:08,410 So all four things are null. 76 00:05:08,560 --> 00:05:10,270 We just need to focus on one of them. 77 00:05:10,270 --> 00:05:11,380 It doesn't matter which one. 78 00:05:11,380 --> 00:05:17,110 So I'll go back to just adding username here and we can do it. 79 00:05:17,110 --> 00:05:18,190 Doesn't matter really. 80 00:05:18,760 --> 00:05:24,370 Let's do image URL again and then all we want to do is add a layer down here, but we can't just say 81 00:05:24,370 --> 00:05:29,710 where photos dot, we can just do ID equals null. 82 00:05:29,710 --> 00:05:30,340 That doesn't work. 83 00:05:30,340 --> 00:05:37,960 Remember, NULL is weird, so we need to do where photoshopped ID is null and this could be anything 84 00:05:37,960 --> 00:05:38,860 on the photo side. 85 00:05:38,860 --> 00:05:44,620 It could be photo ID, it could be photos that image URL, photos that user ID any of them. 86 00:05:44,680 --> 00:05:51,400 So I'll just do it as ID And if I do this now, you'll see. 87 00:05:51,400 --> 00:05:53,950 And we actually don't even need to display the image URL. 88 00:05:54,760 --> 00:06:01,840 But if I do that, this is our list of users who have never posted anything for whatever reasons, some 89 00:06:01,840 --> 00:06:03,610 of them programmatically. 90 00:06:03,610 --> 00:06:09,940 I created them as bots, which all they do is go comment and like other people's photos, but they don't 91 00:06:09,940 --> 00:06:11,230 post anything themselves. 92 00:06:11,230 --> 00:06:17,740 And then another persona of a user is a lurker, which is somebody who has an account and they don't 93 00:06:17,740 --> 00:06:22,990 use it all that much, but they just kind of look at people's photos, maybe like something, but they 94 00:06:22,990 --> 00:06:23,620 don't really. 95 00:06:23,620 --> 00:06:25,600 They definitely don't post and they don't comment. 96 00:06:25,600 --> 00:06:30,970 They're kind of a silent member of the community, so we get a decent number of them. 97 00:06:31,180 --> 00:06:35,980 Sorry for that fire truck and that's all we need to do If we want to figure out how many, that's another 98 00:06:35,980 --> 00:06:37,570 thing we can just add account in. 99 00:06:37,570 --> 00:06:40,870 But we just want the username so that we can target them. 100 00:06:40,870 --> 00:06:42,700 These are people who've never posted anything. 101 00:06:42,700 --> 00:06:44,050 And just a note. 102 00:06:44,470 --> 00:06:48,490 If you did this the other way around, it would have worked just fine as well too. 103 00:06:48,490 --> 00:06:52,660 Do photos users and then a right join. 104 00:06:55,200 --> 00:06:56,220 That's the same thing. 105 00:06:56,460 --> 00:06:58,680 So either way, it works. 106 00:06:58,680 --> 00:07:06,090 But the key thing is that we're working with NULL, where we have an intersection where some users in 107 00:07:06,090 --> 00:07:07,050 some photos overlap. 108 00:07:07,060 --> 00:07:08,070 Most of them do. 109 00:07:08,280 --> 00:07:13,710 But then we also have that subset of users who have no photos, and we need to also join those with 110 00:07:13,710 --> 00:07:16,950 NULL, which is how we're able to tell if they have no photos. 111 00:07:16,950 --> 00:07:18,240 So here's my solution. 112 00:07:18,240 --> 00:07:20,940 If you came up with something different, post it and we can discuss it. 113 00:07:21,270 --> 00:07:22,050 Moving on.