1 00:00:00,170 --> 00:00:01,773 It's time to talk about 2 00:00:01,773 --> 00:00:04,330 "Troubleshooting a Failed Pipeline Run". 3 00:00:04,330 --> 00:00:06,540 In this lesson, we are going to take a look 4 00:00:06,540 --> 00:00:09,540 at how we troubleshoot a failed pipeline run. 5 00:00:09,540 --> 00:00:11,935 We're going to start by looking at some troubleshooting 6 00:00:11,935 --> 00:00:15,310 steps, and then some just general things to remember. 7 00:00:15,310 --> 00:00:17,203 All right, with that, let's jump in. 8 00:00:19,460 --> 00:00:23,840 So, let's talk about debugging and basic monitoring. 9 00:00:23,840 --> 00:00:28,010 First question: was the pipeline able to debug correctly? 10 00:00:28,010 --> 00:00:30,270 You want to hop in and take a look at that, 11 00:00:30,270 --> 00:00:34,100 and if it wasn't able to run correctly, what was shown? 12 00:00:34,100 --> 00:00:36,990 We also want to take a look at dashboard runs 13 00:00:36,990 --> 00:00:39,976 with details of the latest fail, and we want to understand 14 00:00:39,976 --> 00:00:43,713 were there multiple successes before you had the failure? 15 00:00:44,570 --> 00:00:46,770 So let's jump into the portal real quick 16 00:00:46,770 --> 00:00:49,003 and take a look at these concepts. 17 00:00:50,000 --> 00:00:54,270 So here we find ourselves in our Data Factory studio. 18 00:00:54,270 --> 00:00:58,530 And so I've opened up my Data Factory studio, 19 00:00:58,530 --> 00:01:00,540 and I went under the Monitor tab here, 20 00:01:00,540 --> 00:01:03,300 and then I clicked on Pipeline Runs. 21 00:01:03,300 --> 00:01:06,630 So you can see here, we have Triggered and Debug. 22 00:01:06,630 --> 00:01:08,340 If we have Triggered selected, 23 00:01:08,340 --> 00:01:09,790 that's obviously going to show us 24 00:01:09,790 --> 00:01:11,780 any runs that were triggered.
25 00:01:11,780 --> 00:01:14,160 And then this is our first Debug run, 26 00:01:14,160 --> 00:01:16,900 so we can kind of see what's happening there as well. 27 00:01:16,900 --> 00:01:20,970 So I have given you a Succeeded and a Failed pipeline. 28 00:01:20,970 --> 00:01:24,610 So we can take a look at both and see what's happening. 29 00:01:24,610 --> 00:01:27,320 First thing to look at is, with the failed run, 30 00:01:27,320 --> 00:01:29,670 I can come over here and click on my error, 31 00:01:29,670 --> 00:01:32,520 and it will tell me exactly why it failed. 32 00:01:32,520 --> 00:01:34,560 So this is your very first clue, 33 00:01:34,560 --> 00:01:37,430 and 99% of the time, it should help you fix what's wrong. 34 00:01:37,430 --> 00:01:39,550 You'll be able to go in here, take a look. 35 00:01:39,550 --> 00:01:42,260 And so you can see here that our activity timed out 36 00:01:42,260 --> 00:01:44,800 in my Copy Data activity. 37 00:01:44,800 --> 00:01:46,960 Well, let's see why that happened. 38 00:01:46,960 --> 00:01:51,840 So if I go to my pipeline, this is pipeline1. 39 00:01:51,840 --> 00:01:54,170 Here's my Copy Data activity. 40 00:01:54,170 --> 00:01:56,770 If I click on it and I look at my timeout, 41 00:01:56,770 --> 00:01:59,810 whoa, it was set for 1 second. 42 00:01:59,810 --> 00:02:01,870 Pretty unlikely that it's going to succeed 43 00:02:01,870 --> 00:02:04,590 with a really short timeout like that. 44 00:02:04,590 --> 00:02:06,110 So that's just a simple look 45 00:02:06,110 --> 00:02:09,320 at how we can fix our pipeline runs. 46 00:02:09,320 --> 00:02:12,230 So make sure that you start there by looking at that. 47 00:02:12,230 --> 00:02:13,750 And the same thing's true with Triggered. 48 00:02:13,750 --> 00:02:15,690 You'll be able to see all the statuses, 49 00:02:15,690 --> 00:02:17,910 go in and see exactly why it failed.
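The timeout check from the demo can also be done outside the portal. Here is a minimal sketch, assuming you have exported the pipeline definition as JSON and that each activity stores its timeout under `policy.timeout` in a `d.hh:mm:ss` string. The sample pipeline below is a trimmed, made-up stand-in for pipeline1, and the helper names `parse_timeout` and `find_short_timeouts` are hypothetical; they are not part of any Azure SDK.

```python
import re
from datetime import timedelta

# Assumed timeout format: optional "days." prefix, then hh:mm:ss.
_TIMEOUT_RE = re.compile(r"^(?:(\d+)\.)?(\d{1,2}):(\d{2}):(\d{2})$")

# Hypothetical, trimmed pipeline definition used only for illustration.
SAMPLE_PIPELINE = {
    "name": "pipeline1",
    "properties": {
        "activities": [
            {"name": "Copy data1", "type": "Copy",
             "policy": {"timeout": "0.00:00:01"}},
            {"name": "Copy data2", "type": "Copy",
             "policy": {"timeout": "0.12:00:00"}},
        ]
    },
}


def parse_timeout(value: str) -> timedelta:
    """Convert a 'd.hh:mm:ss' (or 'hh:mm:ss') string into a timedelta."""
    match = _TIMEOUT_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised timeout value: {value!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def find_short_timeouts(pipeline: dict, minimum: timedelta) -> list[tuple[str, timedelta]]:
    """Return (activity name, timeout) for every activity below `minimum`."""
    flagged = []
    for activity in pipeline.get("properties", {}).get("activities", []):
        raw = activity.get("policy", {}).get("timeout")
        if raw is None:
            continue  # no explicit timeout on this activity
        timeout = parse_timeout(raw)
        if timeout < minimum:
            flagged.append((activity["name"], timeout))
    return flagged


if __name__ == "__main__":
    for name, timeout in find_short_timeouts(SAMPLE_PIPELINE, timedelta(minutes=1)):
        print(f"{name}: timeout is only {timeout}")
```

The one-minute threshold is arbitrary; pick whatever counts as suspiciously short for your own copy jobs.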
50 00:02:17,910 --> 00:02:20,190 So make sure that you look at that, look at your Debug, 51 00:02:20,190 --> 00:02:21,040 look at your Triggered, 52 00:02:21,040 --> 00:02:23,140 and see if you can understand, from the error, 53 00:02:23,140 --> 00:02:25,370 what exactly is going on. 54 00:02:25,370 --> 00:02:26,900 Now, as you get a little bit more advanced, 55 00:02:26,900 --> 00:02:28,170 you can come under Notifications 56 00:02:28,170 --> 00:02:31,070 and set yourself up some alerts, which I highly recommend. 57 00:02:31,960 --> 00:02:32,793 All right. 58 00:02:34,500 --> 00:02:37,040 With that, let's talk about a few things to remember. 59 00:02:37,040 --> 00:02:39,370 So if you have reached the 60 00:02:39,370 --> 00:02:41,310 integration runtime capacity limit, 61 00:02:41,310 --> 00:02:43,200 you're going to get some errors. 62 00:02:43,200 --> 00:02:44,080 How do we fix that? 63 00:02:44,080 --> 00:02:46,240 Well, we vary our trigger run times 64 00:02:46,240 --> 00:02:49,190 so that not everything is running at the exact same time. 65 00:02:49,190 --> 00:02:51,450 And we might look at splitting our pipelines 66 00:02:51,450 --> 00:02:54,740 across multiple different integration runtimes. 67 00:02:54,740 --> 00:02:57,460 Another common error is long queues. 68 00:02:57,460 --> 00:02:58,850 If you're getting long queues, 69 00:02:58,850 --> 00:03:01,250 you want to look at your concurrency limits, 70 00:03:01,250 --> 00:03:03,680 and you want to take a look at general Azure issues, 71 00:03:03,680 --> 00:03:06,820 i.e., is there a service outage for Azure, 72 00:03:06,820 --> 00:03:09,890 or is there something wrong with your subscription? 73 00:03:09,890 --> 00:03:12,680 So, those are 2 of the most common errors. 74 00:03:12,680 --> 00:03:14,850 There's a lot more, and I actually want to show you 75 00:03:14,850 --> 00:03:19,460 a document that Microsoft has, and it is right here.
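Going back to the integration runtime capacity point above: varying trigger times just means that no two schedules start at the same moment. Here is a minimal sketch of that idea in plain Python, with no Azure calls. The trigger names, the 02:00 first start, and the 15-minute spacing are all hypothetical, and `stagger_start_times` is an illustrative helper, not a Data Factory API.

```python
from datetime import datetime, timedelta


def stagger_start_times(
    trigger_names: list[str],
    first_start: datetime,
    spacing: timedelta,
) -> dict[str, datetime]:
    """Give each trigger its own start time, `spacing` apart, in list order."""
    if spacing <= timedelta(0):
        raise ValueError("spacing must be positive")
    return {
        name: first_start + index * spacing
        for index, name in enumerate(trigger_names)
    }


if __name__ == "__main__":
    # Hypothetical triggers that previously all fired at 02:00.
    schedule = stagger_start_times(
        ["trigger_sales", "trigger_inventory", "trigger_finance"],
        first_start=datetime(2024, 1, 1, 2, 0),
        spacing=timedelta(minutes=15),
    )
    for name, start in schedule.items():
        print(f"{name} -> {start:%H:%M}")
```

You would still have to apply the resulting start times to each trigger yourself; the pipeline concurrency limit mentioned for long queues is a separate setting.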
76 00:03:19,460 --> 00:03:20,550 So this is our, 77 00:03:20,550 --> 00:03:22,010 --and I'll put this link 78 00:03:22,010 --> 00:03:23,130 in the description of the video-- 79 00:03:23,130 --> 00:03:25,710 but this is "Troubleshoot Pipeline 80 00:03:25,710 --> 00:03:27,260 Orchestration and Triggers". 81 00:03:27,260 --> 00:03:29,290 So this just gives you a list 82 00:03:29,290 --> 00:03:32,010 of all kinds of things that can go wrong 83 00:03:32,010 --> 00:03:33,470 with your pipelines. 84 00:03:33,470 --> 00:03:36,450 My suggestion to you, as we finish this video, 85 00:03:36,450 --> 00:03:38,760 is to jump in and skim through this. 86 00:03:38,760 --> 00:03:42,420 No, you do not need to memorize this for the DP-203, 87 00:03:42,420 --> 00:03:44,290 but it wouldn't hurt you to be familiar 88 00:03:44,290 --> 00:03:46,370 with just a quick run-through of 89 00:03:46,370 --> 00:03:50,093 the kinds of things that can go wrong in a pipeline. 90 00:03:53,810 --> 00:03:55,180 So with that, let's go ahead 91 00:03:55,180 --> 00:03:57,550 and wrap up with a few key points. 92 00:03:57,550 --> 00:04:00,470 One, the starting point should always be your dashboard. 93 00:04:00,470 --> 00:04:02,460 Make sure that you hop in and you take a look 94 00:04:02,460 --> 00:04:03,630 to see if there are any errors 95 00:04:03,630 --> 00:04:06,010 and see if you can troubleshoot what's going wrong there. 96 00:04:06,010 --> 00:04:07,940 Next, read through that doc. 97 00:04:07,940 --> 00:04:10,170 That's your homework after this lesson. 98 00:04:10,170 --> 00:04:12,450 Make sure that you don't spend a ton of time 99 00:04:12,450 --> 00:04:14,160 memorizing those concepts. 100 00:04:14,160 --> 00:04:16,960 Skimming through it 1 time should be just fine, 101 00:04:16,960 --> 00:04:18,420 but make sure that you've done that. 102 00:04:18,420 --> 00:04:20,920 And when that's done, I'll see you in the next lesson.