1 00:00:00,610 --> 00:00:01,580 So in this lesson, 2 00:00:01,580 --> 00:00:05,130 we are going to talk about "Configuring Data Retention". 3 00:00:05,130 --> 00:00:06,500 Specifically, we're going to be looking 4 00:00:06,500 --> 00:00:09,240 at some data retention policy basics. 5 00:00:09,240 --> 00:00:12,170 So, I'll explain how data retention works 6 00:00:12,170 --> 00:00:13,410 at a super high level. 7 00:00:13,410 --> 00:00:14,610 And then we'll talk about 8 00:00:14,610 --> 00:00:16,820 configuring data retention in Azure. 9 00:00:16,820 --> 00:00:18,660 And then, of course, we'll jump into the portal, 10 00:00:18,660 --> 00:00:19,700 and we'll take a quick look 11 00:00:19,700 --> 00:00:21,973 at how you would set data retention. 12 00:00:22,940 --> 00:00:25,833 So, data retention policy basics. 13 00:00:26,870 --> 00:00:28,160 The first thing you need to do 14 00:00:28,160 --> 00:00:30,730 is determine your regulatory requirements. 15 00:00:30,730 --> 00:00:32,940 So these are your legal requirements, right? 16 00:00:32,940 --> 00:00:34,440 Every business is going to be different. 17 00:00:34,440 --> 00:00:35,960 For some businesses it's 7 years, 18 00:00:35,960 --> 00:00:37,500 for some businesses it's 3 years, 19 00:00:37,500 --> 00:00:40,430 some businesses don't have legal requirements. 20 00:00:40,430 --> 00:00:43,390 But as a data engineer, you need to understand 21 00:00:43,390 --> 00:00:47,290 what the regulatory requirements are for your data. 22 00:00:47,290 --> 00:00:48,630 So start there. 23 00:00:48,630 --> 00:00:49,610 Once you've done that, 24 00:00:49,610 --> 00:00:52,430 you're going to determine a retention policy. 25 00:00:52,430 --> 00:00:54,290 So, based upon the business needs, 26 00:00:54,290 --> 00:00:56,420 and it's not just regulatory requirements, 27 00:00:56,420 --> 00:00:58,340 there are also business needs here. 
28 00:00:58,340 --> 00:01:01,600 So it might be that we want 5 years of records 29 00:01:01,600 --> 00:01:04,390 and legally, we have to keep 3 years of records. 30 00:01:04,390 --> 00:01:08,650 So, for our data retention policy, it's 5 years. 31 00:01:08,650 --> 00:01:10,930 And after that, we purge, right? 32 00:01:10,930 --> 00:01:12,930 That could be a data retention policy, 33 00:01:12,930 --> 00:01:14,620 or it could be one of those things 34 00:01:14,620 --> 00:01:16,190 where you just have a business need 35 00:01:16,190 --> 00:01:19,340 but no legal requirement or regulatory requirement. 36 00:01:19,340 --> 00:01:21,970 So, determine your regulatory requirement, 37 00:01:21,970 --> 00:01:24,210 then determine your retention policy 38 00:01:24,210 --> 00:01:25,620 based upon your business needs 39 00:01:25,620 --> 00:01:27,840 and that regulatory requirement. 40 00:01:27,840 --> 00:01:31,410 Third, that's going to give you your data retention period. 41 00:01:31,410 --> 00:01:34,070 How long does your organization hold onto its data? 42 00:01:34,070 --> 00:01:35,340 That's what that is. 43 00:01:35,340 --> 00:01:38,180 Once that period is up, generally, you would purge that data, 44 00:01:38,180 --> 00:01:40,523 or you could move it somewhere else as needed. 45 00:01:41,780 --> 00:01:44,940 And then fourth, you are going to implement 46 00:01:44,940 --> 00:01:47,220 or start your data retention period, 47 00:01:47,220 --> 00:01:49,390 and set up a periodic review. 48 00:01:49,390 --> 00:01:50,900 Generally, this is going to be 49 00:01:50,900 --> 00:01:52,960 an annual review or something like that, 50 00:01:52,960 --> 00:01:56,010 because in general, regulatory requirements 51 00:01:56,010 --> 00:01:57,468 and business requirements 52 00:01:57,468 --> 00:01:59,110 --at least as far as data is concerned-- 53 00:01:59,110 --> 00:02:00,580 don't change that often 54 00:02:00,580 --> 00:02:02,593 when it comes to how long to hold records. 
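To make the four steps above concrete, here is a minimal Python sketch of how a retention period and a purge cutoff might be derived. It is only an illustration under stated assumptions: the function names (`retention_years`, `purge_cutoff`, `is_expired`) are hypothetical and not part of any Azure service, and it assumes the policy simply takes the longer of the regulatory and business requirements, as in the 3-year versus 5-year example.

```python
from datetime import date
from typing import Optional


def retention_years(regulatory_years: Optional[int],
                    business_years: Optional[int]) -> Optional[int]:
    """Return the longer of the two requirements, or None if neither applies.

    Assumption: the policy keeps data for whichever period is longer.
    """
    candidates = [y for y in (regulatory_years, business_years) if y is not None]
    return max(candidates) if candidates else None


def purge_cutoff(today: date, years: int) -> date:
    """Return the date before which records fall outside the retention period."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year: use Feb 28
        return today.replace(year=today.year - years, day=28)


def is_expired(record_date: date, today: date, years: int) -> bool:
    """True if the record is older than the retention period (purge or move it)."""
    return record_date < purge_cutoff(today, years)


if __name__ == "__main__":
    years = retention_years(regulatory_years=3, business_years=5)
    print("Retention period in years:", years)
    print("Expired:", is_expired(date(2018, 1, 1), date(2024, 6, 1), years))
```

Passing `None` for the regulatory value covers the case where there is a business need but no legal requirement. The periodic review in step four would amount to re-checking the two inputs, not changing the logic.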
55 00:02:05,150 --> 00:02:07,110 Next, let's talk about configuring 56 00:02:07,110 --> 00:02:09,300 our data retention policy. 57 00:02:09,300 --> 00:02:13,180 So, this has to be configured per Azure service. 58 00:02:13,180 --> 00:02:16,070 And so, for the DP-203, we're going to focus on Synapse 59 00:02:16,070 --> 00:02:18,223 so you get a look at how that would be set. 60 00:02:19,860 --> 00:02:22,830 You're going to implement it at the SQL-pool level. 61 00:02:22,830 --> 00:02:24,750 And I'm going to jump in and show you this in just a second, 62 00:02:24,750 --> 00:02:27,210 but I'm walking you through the process right now. 63 00:02:27,210 --> 00:02:29,650 So, we go into the Audit tab, we configure 64 00:02:29,650 --> 00:02:32,070 our data retention period, 65 00:02:32,070 --> 00:02:35,720 and that's going to be set for SQL pool events. 66 00:02:35,720 --> 00:02:38,520 Then, we're going to define our retention period. 67 00:02:38,520 --> 00:02:40,920 This is anywhere from a day to 9 years, 68 00:02:40,920 --> 00:02:42,590 and that's somewhat dependent upon 69 00:02:42,590 --> 00:02:44,073 the service that we're using. 70 00:02:45,400 --> 00:02:46,840 And so with that, let's just go ahead 71 00:02:46,840 --> 00:02:49,130 and jump into the portal, and let me show you 72 00:02:49,130 --> 00:02:50,420 what this looks like. 73 00:02:50,420 --> 00:02:53,820 So, here we find ourselves in Azure Synapse. 74 00:02:53,820 --> 00:02:57,860 And I have scrolled down from Overview to Security, 75 00:02:57,860 --> 00:02:59,870 and then Azure SQL Auditing. 76 00:02:59,870 --> 00:03:01,970 And this is what we are interested in. 77 00:03:01,970 --> 00:03:04,060 And you can see here, like I said, 78 00:03:04,060 --> 00:03:07,900 Azure SQL Auditing is going to track your SQL pool events, 79 00:03:07,900 --> 00:03:09,840 and then it's going to write them somewhere. 
80 00:03:09,840 --> 00:03:11,930 So let's go ahead and turn that on, 81 00:03:11,930 --> 00:03:13,680 because it's not on by default. 82 00:03:13,680 --> 00:03:16,290 And then we get to choose where we send this thing. 83 00:03:16,290 --> 00:03:17,123 So let's just say 84 00:03:17,123 --> 00:03:19,400 that we want to send it to a storage account. 85 00:03:19,400 --> 00:03:21,340 So, I pick a storage account. 86 00:03:21,340 --> 00:03:24,260 And then you can see here, under Advanced Properties, 87 00:03:24,260 --> 00:03:25,770 it gives me a choice, 88 00:03:25,770 --> 00:03:29,850 anywhere from 0 days up to 9 years. 89 00:03:29,850 --> 00:03:32,070 So, I can choose that. 90 00:03:32,070 --> 00:03:32,950 And in addition to that, 91 00:03:32,950 --> 00:03:37,340 I can also send it to Log Analytics or Event Hub as well. 92 00:03:37,340 --> 00:03:40,130 So once that's all set, I just click on Save. 93 00:03:40,130 --> 00:03:41,600 It'll take a couple of seconds. 94 00:03:41,600 --> 00:03:44,300 And then, Azure SQL Auditing will be turned on. 95 00:03:44,300 --> 00:03:45,860 That simple. 96 00:03:45,860 --> 00:03:49,110 So with that, let's go ahead and wrap up this lesson. 97 00:03:49,110 --> 00:03:53,750 So, this looks like an auditing lesson, not data retention. 98 00:03:53,750 --> 00:03:55,700 Yes, it does look like auditing. 99 00:03:55,700 --> 00:03:58,110 However, the key is the purpose, right? 100 00:03:58,110 --> 00:04:00,680 From that data retention, we can run auditing 101 00:04:00,680 --> 00:04:03,310 or we can store it for regulatory requirements 102 00:04:03,310 --> 00:04:05,490 or do a whole host of things. 103 00:04:05,490 --> 00:04:07,830 Next, don't just look at storage. 104 00:04:07,830 --> 00:04:11,240 Log Analytics is very useful for querying. 105 00:04:11,240 --> 00:04:12,610 So if you just need to store it 106 00:04:12,610 --> 00:04:14,410 and you don't ever plan on using it, 107 00:04:14,410 --> 00:04:16,120 hey, storage might be fine. 
108 00:04:16,120 --> 00:04:19,020 If you think you're going to be using that in your analysis, 109 00:04:19,020 --> 00:04:21,940 Log Analytics is an excellent solution. 110 00:04:21,940 --> 00:04:23,130 That's it for this lesson. 111 00:04:23,130 --> 00:04:24,870 Super easy, super fast. 112 00:04:24,870 --> 00:04:26,120 I'll see you in the next one.
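As a follow-up to the portal walkthrough, here is a small Python sketch of how the retention value for the storage account destination could be sanity-checked before it is entered. It does not call any Azure API. The helper names are hypothetical, the upper bound of 3285 days is an assumption derived from the "9 years" shown in the portal (9 x 365), and the reading that 0 means "keep indefinitely" is also an assumption that should be verified against the Azure SQL Auditing documentation.

```python
# Assumption: the portal's "9 years" upper bound corresponds to 9 * 365 days.
MAX_RETENTION_DAYS = 9 * 365


def years_to_retention_days(years: int) -> int:
    """Convert a retention policy in whole years to a day count for the portal field."""
    days = years * 365
    if not 0 <= days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"{years} years is outside the assumed 0 to {MAX_RETENTION_DAYS} day range"
        )
    return days


def describe_retention(days: int) -> str:
    """Describe what a given retention value would mean for stored audit logs."""
    if not 0 <= days <= MAX_RETENTION_DAYS:
        raise ValueError(f"{days} days is outside the assumed range")
    if days == 0:
        # Assumption: 0 is treated as unlimited retention, not immediate deletion.
        return "keep audit logs indefinitely"
    return f"purge audit logs older than {days} days"


if __name__ == "__main__":
    days = years_to_retention_days(5)
    print(days, "->", describe_retention(days))
```

A 5-year policy from the earlier example would be entered as 1825 days. A 10-year policy would be rejected by this check, which signals that the data would need to be moved somewhere else to be kept that long.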