1 00:00:00,200 --> 00:00:01,650 So in this new demo we're going 2 00:00:01,650 --> 00:00:04,290 to create a workflow where we are going 3 00:00:04,290 --> 00:00:06,170 to handle some errors. 4 00:00:06,170 --> 00:00:08,060 And first we're going to create a function 5 00:00:08,060 --> 00:00:09,650 in Lambda that is going to throw an error. 6 00:00:09,650 --> 00:00:11,280 So let's create a function. 7 00:00:11,280 --> 00:00:13,930 We'll use a blueprint and in it, 8 00:00:13,930 --> 00:00:16,583 I'm going to type step function. 9 00:00:17,550 --> 00:00:19,750 Step, so let's clear the filter, 10 00:00:19,750 --> 00:00:22,740 step-functions. 11 00:00:22,740 --> 00:00:26,120 Okay, so we have step-functions-error as a blueprint 12 00:00:26,120 --> 00:00:28,980 to throw an error and then we can configure Step Functions 13 00:00:28,980 --> 00:00:30,670 to catch or retry this error. 14 00:00:30,670 --> 00:00:32,940 So let's configure this blueprint 15 00:00:32,940 --> 00:00:35,930 and I'll call it MyLambdaFunctionThatFails. 16 00:00:37,930 --> 00:00:40,090 We're going to create a new role 17 00:00:40,090 --> 00:00:42,840 and here is the code itself. 18 00:00:42,840 --> 00:00:45,390 So it's going to create a CustomError 19 00:00:45,390 --> 00:00:46,740 and then say, "This is a custom error!" 20 00:00:46,740 --> 00:00:49,233 So this is a function that only ever fails. 21 00:00:50,280 --> 00:00:51,630 Let's create this function. 22 00:00:52,930 --> 00:00:54,790 And the function is created now as we can see, 23 00:00:54,790 --> 00:00:57,300 so we can test this function and pass in 24 00:00:57,300 --> 00:00:59,410 whatever test event and create it. 25 00:00:59,410 --> 00:01:01,193 So let's just call it Foobar 26 00:01:02,350 --> 00:01:05,440 and create this and test this function. 27 00:01:05,440 --> 00:01:07,040 It gives us an error. 28 00:01:07,040 --> 00:01:08,020 This is a CustomError.
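A function that always fails with a named error can be sketched as follows. This is a minimal Python sketch, not the blueprint's own code: the lecture does not show which runtime the step-functions-error blueprint uses, so the handler name `lambda_handler` and the class layout are assumptions. Only the error name CustomError and the message come from the demo.

```python
class CustomError(Exception):
    """Named error type; Step Functions can match it by the name CustomError."""


def lambda_handler(event, context):
    # Assumed handler name. The incoming test event is ignored on purpose:
    # this function exists only to fail.
    raise CustomError("This is a custom error!")
```

Testing it with any event, such as the Foobar event from the demo, should report the error type, the message and a stack trace.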
29 00:01:08,020 --> 00:01:08,853 Here's the error message 30 00:01:08,853 --> 00:01:11,110 and here is the stack trace of the error. 31 00:01:11,110 --> 00:01:13,180 And we're going to handle this error directly 32 00:01:13,180 --> 00:01:15,150 from Step Functions. 33 00:01:15,150 --> 00:01:17,030 So let's create a new state machine 34 00:01:18,330 --> 00:01:20,420 and we're going to author it from scratch. 35 00:01:20,420 --> 00:01:24,950 And for the definition, again, let's go into our code. 36 00:01:24,950 --> 00:01:28,950 So we're going to 1-error-handling, state-machine.json. 37 00:01:28,950 --> 00:01:32,130 Copy this and paste it here. 38 00:01:32,130 --> 00:01:34,220 Refresh the graph and we have the idea. 39 00:01:34,220 --> 00:01:36,220 So we're going to invoke our function 40 00:01:36,220 --> 00:01:38,940 and if it goes fine, it goes into the end state. 41 00:01:38,940 --> 00:01:41,570 And there are some different kinds of error. 42 00:01:41,570 --> 00:01:44,200 So CustomErrorFallback, ReservedTypeFallback, 43 00:01:44,200 --> 00:01:47,100 and CatchAllFallback, okay? 44 00:01:47,100 --> 00:01:49,870 So if we look at the code itself, 45 00:01:49,870 --> 00:01:52,300 it's going to say how to handle each error. 46 00:01:52,300 --> 00:01:55,260 So let's fix the error right here first. 47 00:01:55,260 --> 00:01:59,170 We copy the ARN and paste it here. 48 00:01:59,170 --> 00:02:00,230 Great. 49 00:02:00,230 --> 00:02:02,570 So it's going to invoke our function. 50 00:02:02,570 --> 00:02:05,010 And then if the error is of type CustomError 51 00:02:05,010 --> 00:02:06,340 which is the case here. 52 00:02:06,340 --> 00:02:10,100 So as we can see here, we are throwing a CustomError. 53 00:02:10,100 --> 00:02:13,290 Then it's going to wait one second 54 00:02:13,290 --> 00:02:16,070 and do two attempts and have a BackoffRate of two. 55 00:02:16,070 --> 00:02:18,320 So there's going to be some retry.
56 00:02:18,320 --> 00:02:21,860 If it's a TaskFailed, there's going to be more 57 00:02:21,860 --> 00:02:24,000 interval seconds, more attempts, and more backoff, 58 00:02:24,000 --> 00:02:26,480 and all the other types of error are going to go into 59 00:02:26,480 --> 00:02:28,250 the catch-all, the States.ALL, 60 00:02:28,250 --> 00:02:30,550 and retry some more times, okay? 61 00:02:30,550 --> 00:02:31,383 Then there's a Catch. 62 00:02:31,383 --> 00:02:34,290 So once the retries are exhausted 63 00:02:34,290 --> 00:02:37,170 then it's going to go into the Catch phase. 64 00:02:37,170 --> 00:02:39,850 And if it's the CustomError 65 00:02:39,850 --> 00:02:41,600 then you go into the CustomErrorFallback. 66 00:02:41,600 --> 00:02:42,900 If it's a TaskFailed, it goes into 67 00:02:42,900 --> 00:02:45,830 the ReservedTypeFallback, and for States.ALL 68 00:02:45,830 --> 00:02:47,730 it goes into the CatchAllFallback. 69 00:02:47,730 --> 00:02:52,140 And each fallback is going to be different in here, okay? 70 00:02:52,140 --> 00:02:56,323 So let's click on next, MyStateMachineError. 71 00:02:57,770 --> 00:02:59,570 We're going to create a new role 72 00:02:59,570 --> 00:03:01,840 and create this state machine. 73 00:03:01,840 --> 00:03:05,010 So it is successfully created, let's start the execution 74 00:03:05,010 --> 00:03:07,230 and we'll open this in a new browser tab 75 00:03:09,120 --> 00:03:10,850 and we're going into our workflow. 76 00:03:10,850 --> 00:03:14,190 So we are invoking our function and this is in progress 77 00:03:14,190 --> 00:03:15,980 and we are going to get some errors. 78 00:03:15,980 --> 00:03:17,780 So let's wait a little bit, but if we look 79 00:03:17,780 --> 00:03:21,510 at the event history, so the Lambda function was started, 80 00:03:21,510 --> 00:03:22,950 and then it failed, okay?
 81 00:03:22,950 --> 00:03:25,530 It failed with this CustomError.
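The Retry and Catch rules described above can be sketched as a state machine definition. This is a hedged reconstruction, not the contents of state-machine.json, which the lecture never shows in full. The fallback state names, the CustomError retry values (1 second, 2 attempts, BackoffRate 2) and the 30 second interval for States.TaskFailed come from the demo. The state name `InvokeFunction`, the placeholder ARN, the remaining retry values and the Pass-state bodies are assumptions for illustration.

```python
import json

# Placeholder ARN (assumption): replace with the ARN copied from your function.
FUNCTION_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:MyLambdaFunctionThatFails"

definition = {
    "StartAt": "InvokeFunction",  # hypothetical state name
    "States": {
        "InvokeFunction": {
            "Type": "Task",
            "Resource": FUNCTION_ARN,
            # Retriers are checked in order; the first match decides the timing.
            "Retry": [
                {"ErrorEquals": ["CustomError"],
                 "IntervalSeconds": 1, "MaxAttempts": 2, "BackoffRate": 2.0},
                # 30 s comes from the demo; MaxAttempts here is an assumption.
                {"ErrorEquals": ["States.TaskFailed"],
                 "IntervalSeconds": 30, "MaxAttempts": 2, "BackoffRate": 2.0},
                # All three values below are illustrative guesses.
                {"ErrorEquals": ["States.ALL"],
                 "IntervalSeconds": 5, "MaxAttempts": 5, "BackoffRate": 2.0},
            ],
            # Catchers apply once the retries are exhausted.
            "Catch": [
                {"ErrorEquals": ["CustomError"], "Next": "CustomErrorFallback"},
                {"ErrorEquals": ["States.TaskFailed"], "Next": "ReservedTypeFallback"},
                {"ErrorEquals": ["States.ALL"], "Next": "CatchAllFallback"},
            ],
            "End": True,
        },
        # Fallback bodies are assumed to be simple Pass states.
        "CustomErrorFallback": {"Type": "Pass", "End": True},
        "ReservedTypeFallback": {"Type": "Pass", "End": True},
        "CatchAllFallback": {"Type": "Pass", "End": True},
    },
}

# Print JSON that could be pasted into the state machine definition editor.
print(json.dumps(definition, indent=2))
```

States.ALL is kept as the last entry in both lists so that the more specific rules get a chance to match first.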
82 00:03:25,530 --> 00:03:28,280 And then it went into being scheduled 83 00:03:28,280 --> 00:03:30,530 because it was being retried, okay? 84 00:03:30,530 --> 00:03:32,700 So there's a retry because the error was captured. 85 00:03:32,700 --> 00:03:36,310 So it was started again, then failed, then scheduled again, 86 00:03:36,310 --> 00:03:38,960 then started again, then failed. 87 00:03:38,960 --> 00:03:40,020 And then the task exited 88 00:03:40,020 --> 00:03:42,410 because we have exhausted all our retries. 89 00:03:42,410 --> 00:03:45,600 So the task exited and then it got caught 90 00:03:45,600 --> 00:03:48,630 by the CustomErrorFallback, okay? 91 00:03:48,630 --> 00:03:50,320 And then the execution was finished. 92 00:03:50,320 --> 00:03:54,160 So if we go in here, as we can see this failed twice 93 00:03:54,160 --> 00:03:57,150 and a third time, then the error was caught 94 00:03:57,150 --> 00:04:00,180 and it went into the CustomErrorFallback and succeeded. 95 00:04:00,180 --> 00:04:02,540 Now, if we change our Lambda function a little bit 96 00:04:02,540 --> 00:04:06,770 and say we're going to do a NotCustomError 97 00:04:06,770 --> 00:04:10,150 which is an error that isn't being caught by our code. 98 00:04:10,150 --> 00:04:14,923 This is a different custom error, which is uncaught. 99 00:04:15,990 --> 00:04:18,100 So we're going to deploy this, excuse me, 100 00:04:18,100 --> 00:04:19,200 and then click on test. 101 00:04:19,200 --> 00:04:22,580 So this is not the same kind of error, okay? 102 00:04:22,580 --> 00:04:24,470 And we start our state machine again. 103 00:04:24,470 --> 00:04:27,090 So we're going to start a new execution 104 00:04:28,040 --> 00:04:30,883 and open this in a new tab and start the execution. 105 00:04:33,450 --> 00:04:37,440 As we can see it's running, but now the error we get 106 00:04:38,340 --> 00:04:41,010 is a different error.
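How an error name is routed to a fallback can be sketched with a simplified matcher. This is an illustration of the ordering idea, not the Step Functions implementation: the function `first_matching_rule` is hypothetical, and it assumes that States.TaskFailed acts as a wildcard for task errors other than States.Timeout and that States.ALL matches anything that reaches it.

```python
def first_matching_rule(error_name, rules):
    """Return the first rule whose ErrorEquals list covers error_name, else None."""
    for rule in rules:
        names = rule["ErrorEquals"]
        if error_name in names:
            return rule  # exact name match
        if "States.ALL" in names:
            return rule  # catch-all
        if "States.TaskFailed" in names and error_name != "States.Timeout":
            return rule  # assumed wildcard for failed tasks
    return None


# Catchers in the order described in the demo.
catchers = [
    {"ErrorEquals": ["CustomError"], "Next": "CustomErrorFallback"},
    {"ErrorEquals": ["States.TaskFailed"], "Next": "ReservedTypeFallback"},
    {"ErrorEquals": ["States.ALL"], "Next": "CatchAllFallback"},
]

print(first_matching_rule("CustomError", catchers)["Next"])     # CustomErrorFallback
print(first_matching_rule("NotCustomError", catchers)["Next"])  # ReservedTypeFallback
```

Under these assumptions, an error that is not named explicitly still has a route, because the States.TaskFailed rule comes before the catch-all.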
107 00:04:41,010 --> 00:04:44,150 And so we expect it to go into a different route now. 108 00:04:44,150 --> 00:04:46,090 So let's wait a little bit 109 00:04:46,090 --> 00:04:49,393 for the whole execution to go through. 110 00:04:50,280 --> 00:04:53,560 And so the execution has succeeded. 111 00:04:53,560 --> 00:04:54,820 And if we look at it now 112 00:04:54,820 --> 00:04:57,690 we went into the ReservedTypeFallback. 113 00:04:57,690 --> 00:04:59,530 And if you look at the event history, as we can see 114 00:04:59,530 --> 00:05:02,610 in terms of elapsed time, this first one failed. 115 00:05:02,610 --> 00:05:05,460 And then we waited 30 seconds for 116 00:05:05,460 --> 00:05:07,820 the second one to be started. 117 00:05:07,820 --> 00:05:11,850 And then we waited 60 seconds, so it doubled the wait time 118 00:05:11,850 --> 00:05:14,000 to go and try it a third time. 119 00:05:14,000 --> 00:05:16,060 And then it failed twice more, so that's three times, 120 00:05:16,060 --> 00:05:18,230 so it went into the TaskStateExited 121 00:05:18,230 --> 00:05:19,940 and then the execution finished. 122 00:05:19,940 --> 00:05:21,900 So it's really, really cool to see here 123 00:05:21,900 --> 00:05:24,140 that based on the type of error, we have different 124 00:05:24,140 --> 00:05:26,280 retry logic and we have different catch logic. 125 00:05:26,280 --> 00:05:28,260 This retry logic was very, very quick, 126 00:05:28,260 --> 00:05:30,830 whereas the retry logic here was much longer, 127 00:05:30,830 --> 00:05:33,940 it started at 30 seconds and it went into a different branch. 128 00:05:33,940 --> 00:05:35,120 So hopefully that helps you 129 00:05:35,120 --> 00:05:37,710 to understand error handling in Step Functions. 130 00:05:37,710 --> 00:05:38,543 I hope you liked it.
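The wait times seen in the two executions follow from the interval and the backoff rate. Here is a small sketch of that arithmetic; the helper `retry_delays` is hypothetical, and it ignores optional settings such as a maximum delay or jitter.

```python
def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Wait before each retry: the interval multiplied by the rate after every attempt."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# CustomError retrier from the demo: 1 second, 2 attempts, rate 2.
print(retry_delays(1, 2, 2))   # [1, 2]

# Second execution in the demo: 30 seconds, then 60 seconds.
print(retry_delays(30, 2, 2))  # [30, 60]
```

Two retries plus the original invocation give the three failures visible in the event history.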
131 00:05:38,543 --> 00:05:40,480 And to clean up everything, you should go 132 00:05:40,480 --> 00:05:43,140 into Step Functions and delete your state machines 133 00:05:43,140 --> 00:05:45,410 but it won't cost you anything to have them around 134 00:05:45,410 --> 00:05:46,330 if you want to. 135 00:05:46,330 --> 00:05:47,163 All right, that's it. 136 00:05:47,163 --> 00:05:48,820 I will see you in the next lecture.