1 00:00:00,220 --> 00:00:01,230 So now let's talk about 2 00:00:01,230 --> 00:00:03,200 auto scaling group scaling policies. 3 00:00:03,200 --> 00:00:04,250 And we have two different kinds, 4 00:00:04,250 --> 00:00:06,120 we have the dynamic scaling policies first. 5 00:00:06,120 --> 00:00:07,910 And so within the dynamic scaling policies, 6 00:00:07,910 --> 00:00:10,410 we have three kinds. We have target tracking scaling, 7 00:00:10,410 --> 00:00:11,500 which is pretty easy. 8 00:00:11,500 --> 00:00:13,420 It's the simplest and easiest to set up. 9 00:00:13,420 --> 00:00:15,190 And the idea is that you wanna say, for example, 10 00:00:15,190 --> 00:00:18,830 I want to track the average CPU utilization 11 00:00:18,830 --> 00:00:21,840 of my auto scaling group across all my EC2 instances 12 00:00:21,840 --> 00:00:23,820 to stay at around 40%. 13 00:00:23,820 --> 00:00:26,640 This is when you want to have a default baseline 14 00:00:26,640 --> 00:00:29,010 and wanna make sure that you're always available. 15 00:00:29,010 --> 00:00:30,920 Simple and step scaling is more involved. 16 00:00:30,920 --> 00:00:33,050 So you set up your own CloudWatch alarms 17 00:00:33,050 --> 00:00:34,570 and when they're triggered, so for example, 18 00:00:34,570 --> 00:00:38,690 when the CPU goes over 70% for your ASG as a whole, 19 00:00:38,690 --> 00:00:40,900 then add two units of capacity. 20 00:00:40,900 --> 00:00:42,750 And then you would set up a second rule saying, 21 00:00:42,750 --> 00:00:46,760 hey, in case the CPU utilization goes to less than 30% 22 00:00:46,760 --> 00:00:50,320 as a whole within my ASG, then remove one unit. 23 00:00:50,320 --> 00:00:54,440 But you would have to set up your CloudWatch alarms 24 00:00:54,440 --> 00:00:56,080 as well as the steps, which is 25 00:00:56,080 --> 00:00:59,030 how many units you wanna add at a time 26 00:00:59,030 --> 00:01:01,510 and how many units you want to remove at a time. 
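To make the two step scaling rules above concrete, here is a minimal sketch in plain Python. It only models the decision logic from the lecture (over 70% CPU adds two units, under 30% removes one); it does not call AWS. The function names, and the `min_size` and `max_size` bounds, are hypothetical and chosen for illustration.

```python
def step_scaling_adjustment(avg_cpu_percent: float) -> int:
    """Return the capacity change for the lecture's two example rules.

    Hypothetical helper: in a real setup the thresholds would live in
    CloudWatch alarms and the step sizes in the scaling policy.
    """
    if avg_cpu_percent > 70.0:
        return 2   # scale out: add two units of capacity
    if avg_cpu_percent < 30.0:
        return -1  # scale in: remove one unit of capacity
    return 0       # between the two thresholds: no change


def apply_adjustment(current: int, adjustment: int,
                     min_size: int, max_size: int) -> int:
    """Apply the change, clamped to the group's assumed min/max size."""
    return max(min_size, min(max_size, current + adjustment))


if __name__ == "__main__":
    for cpu in (85.0, 50.0, 20.0):
        change = step_scaling_adjustment(cpu)
        new_capacity = apply_adjustment(3, change, min_size=1, max_size=10)
        print(f"avg CPU {cpu}% -> change {change:+d}, capacity {new_capacity}")
```

With target tracking you would not write these rules yourself; you would only state the target (for example 40% average CPU) and let the service manage the alarms for you.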
27 00:01:01,510 --> 00:01:03,440 And finally scheduled actions, 28 00:01:03,440 --> 00:01:05,480 which is to anticipate scaling 29 00:01:05,480 --> 00:01:06,970 based on known usage patterns. 30 00:01:06,970 --> 00:01:08,170 For example, you're saying that, 31 00:01:08,170 --> 00:01:10,520 hey, I know that there's going to be a big event 32 00:01:10,520 --> 00:01:11,760 at 5:00 PM on Fridays 33 00:01:11,760 --> 00:01:13,550 because people are gonna be done with work 34 00:01:13,550 --> 00:01:15,610 and they're going to use my application, 35 00:01:15,610 --> 00:01:18,100 and therefore you want to increase the min capacity 36 00:01:18,100 --> 00:01:21,010 automatically of your ASG to 10 at 5:00 PM 37 00:01:21,010 --> 00:01:22,200 every single Friday. 38 00:01:22,200 --> 00:01:23,590 This is a scheduled action 39 00:01:23,590 --> 00:01:25,370 where you know the scaling in advance. 40 00:01:25,370 --> 00:01:26,830 And there's a new kind of scaling, 41 00:01:26,830 --> 00:01:28,670 which is called predictive scaling. 42 00:01:28,670 --> 00:01:29,760 So with predictive scaling, 43 00:01:29,760 --> 00:01:32,580 you continually have a forecast being made 44 00:01:32,580 --> 00:01:35,200 by the auto scaling service in AWS. 45 00:01:35,200 --> 00:01:36,440 And it will look at the load 46 00:01:36,440 --> 00:01:39,100 and will schedule scaling ahead. 47 00:01:39,100 --> 00:01:40,260 So what will happen is that 48 00:01:40,260 --> 00:01:43,460 the historical load is going to be analyzed over time 49 00:01:43,460 --> 00:01:46,250 and then a forecast is going to be created. 50 00:01:46,250 --> 00:01:48,190 And then based on that forecast, 51 00:01:48,190 --> 00:01:51,480 there will be scaling actions scheduled ahead of time, 52 00:01:51,480 --> 00:01:53,720 which is quite a cool way of doing scaling as well. 
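The Friday 5:00 PM scheduled action described earlier can be sketched as a small time check. This is an illustration in plain Python, not a call to AWS: the function name and the `default_min` value of 2 are assumptions, and treating the raised minimum as lasting until the end of Friday is a simplification (a real scheduled action keeps its new value until another action changes it). In cron terms the schedule would look something like `0 17 * * 5`.

```python
from datetime import datetime


def scheduled_min_capacity(now: datetime, default_min: int = 2) -> int:
    """Return the min capacity that should apply at time `now`.

    Hypothetical helper: models "raise min capacity to 10 at 5:00 PM
    every Friday". `default_min` is an assumed baseline, not a value
    from the lecture.
    """
    is_friday = now.weekday() == 4  # Monday is 0, so Friday is 4
    if is_friday and now.hour >= 17:
        return 10
    return default_min


if __name__ == "__main__":
    # 5 January 2024 was a Friday.
    print(scheduled_min_capacity(datetime(2024, 1, 5, 16, 59)))
    print(scheduled_min_capacity(datetime(2024, 1, 5, 17, 0)))
```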
53 00:01:53,720 --> 00:01:55,120 And I think this is the future 54 00:01:55,120 --> 00:01:56,680 because this is machine learning powered, 55 00:01:56,680 --> 00:01:58,940 and this really is a hands-off approach 56 00:01:58,940 --> 00:02:01,370 to automatic scaling for your ASG. 57 00:02:01,370 --> 00:02:04,140 So what are some good metrics to scale on? It's a big question. 58 00:02:04,140 --> 00:02:07,370 So it depends really on what your application is doing 59 00:02:07,370 --> 00:02:08,830 and how it's working, 60 00:02:08,830 --> 00:02:10,970 but usually here are a few. 61 00:02:10,970 --> 00:02:13,410 So number one is CPU utilization 62 00:02:13,410 --> 00:02:16,220 because every time your instances receive a request, 63 00:02:16,220 --> 00:02:18,410 usually they will do some sort of computation 64 00:02:18,410 --> 00:02:20,560 and so it will use some CPU. 65 00:02:20,560 --> 00:02:23,150 And so if you look at the average CPU utilization 66 00:02:23,150 --> 00:02:26,060 across all your instances and it goes higher, 67 00:02:26,060 --> 00:02:28,100 that means that your instances are being more utilized 68 00:02:28,100 --> 00:02:30,820 and so it would be a good metric to scale on. 69 00:02:30,820 --> 00:02:32,110 Another metric to scale on, 70 00:02:32,110 --> 00:02:34,000 it's more application specific, 71 00:02:34,000 --> 00:02:35,930 but it is the request count per target, 72 00:02:35,930 --> 00:02:37,510 which is based on your testing. 73 00:02:37,510 --> 00:02:39,270 You know that your EC2 instances 74 00:02:39,270 --> 00:02:41,060 operate optimally at a rate 75 00:02:41,060 --> 00:02:44,650 of 1000 requests per target at a time 76 00:02:44,650 --> 00:02:46,540 and so maybe this is the target you want to have 77 00:02:46,540 --> 00:02:48,680 for your scaling. 78 00:02:48,680 --> 00:02:49,870 So here's an example. 
79 00:02:49,870 --> 00:02:52,920 You have an auto scaling group with three EC2 instances, 80 00:02:52,920 --> 00:02:56,250 and your load balancer is currently spreading the requests 81 00:02:56,250 --> 00:02:57,400 across all of them. 82 00:02:57,400 --> 00:02:59,890 So right now the value of the request 83 00:02:59,890 --> 00:03:01,850 count per target metric is three 84 00:03:01,850 --> 00:03:04,520 because each EC2 instance on average 85 00:03:04,520 --> 00:03:06,933 has three requests outstanding. 86 00:03:07,870 --> 00:03:10,080 Next, if your application is network bound, 87 00:03:10,080 --> 00:03:12,760 so for example, there's a lot of uploads and downloads 88 00:03:12,760 --> 00:03:15,380 and you know that network is going to be a bottleneck 89 00:03:15,380 --> 00:03:17,210 for your EC2 instances, 90 00:03:17,210 --> 00:03:20,290 then you may want to scale on the average network in or out 91 00:03:20,290 --> 00:03:23,430 to make sure that if you reach a certain threshold, 92 00:03:23,430 --> 00:03:25,710 then you're going to scale based on that. 93 00:03:25,710 --> 00:03:27,880 Or you can use any custom metric that you push to CloudWatch: 94 00:03:27,880 --> 00:03:29,400 you can set up your own metrics 95 00:03:29,400 --> 00:03:31,210 that are going to be application specific 96 00:03:31,210 --> 00:03:34,950 and based on that, you can set up your scaling policies. 97 00:03:34,950 --> 00:03:37,200 Now, what else you need to know about scaling policies 98 00:03:37,200 --> 00:03:38,960 is what's called the scaling cooldown. 99 00:03:38,960 --> 00:03:41,950 So the idea is that after there is a scaling activity, 100 00:03:41,950 --> 00:03:45,090 so whenever you add or you remove instances, 101 00:03:45,090 --> 00:03:46,990 you are entering the cooldown period, 102 00:03:46,990 --> 00:03:50,140 which is by default five minutes or 300 seconds. 
103 00:03:50,140 --> 00:03:51,907 And during that cooldown period, 104 00:03:51,907 --> 00:03:56,440 the ASG will not launch or terminate additional instances. 105 00:03:56,440 --> 00:03:59,120 And the reasoning behind this is that 106 00:03:59,120 --> 00:04:01,630 you allow for metrics to stabilize, okay, 107 00:04:01,630 --> 00:04:04,400 for your new instances to take effect 108 00:04:04,400 --> 00:04:07,200 and to see what the new metric will become. 109 00:04:07,200 --> 00:04:08,150 So the idea is that 110 00:04:08,150 --> 00:04:10,000 when there is a scaling action that occurs, 111 00:04:10,000 --> 00:04:13,050 the question is, is there a default cooldown in effect? 112 00:04:13,050 --> 00:04:14,560 If yes, then ignore the action. 113 00:04:14,560 --> 00:04:17,230 If no, then proceed with the scaling action, 114 00:04:17,230 --> 00:04:19,440 which is to launch or terminate instances. 115 00:04:19,440 --> 00:04:22,400 And so my advice to you is to use a ready-to-use AMI 116 00:04:22,400 --> 00:04:26,260 to reduce the configuration time for your EC2 instances 117 00:04:26,260 --> 00:04:27,660 in order for them to start 118 00:04:27,660 --> 00:04:29,580 serving requests faster. 119 00:04:29,580 --> 00:04:32,810 So if you don't spend time configuring your EC2 instances, 120 00:04:32,810 --> 00:04:35,180 then they can be in effect right away. 121 00:04:35,180 --> 00:04:37,870 And then because they can be active way faster, 122 00:04:37,870 --> 00:04:39,570 the cooldown period can be decreased, 123 00:04:39,570 --> 00:04:42,633 and you can have a more dynamic scaling up and down 124 00:04:42,633 --> 00:04:44,800 of your ASG. 
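The cooldown check described above (is a cooldown in effect? if yes, ignore the action; if no, proceed) can be sketched as a small gate function. This is a simplified model in plain Python, not the service's actual implementation; the function name and the use of plain seconds as timestamps are assumptions. Only the 300-second default comes from the lecture.

```python
from typing import Optional

DEFAULT_COOLDOWN_SECONDS = 300  # five minutes, the default from the lecture


def should_scale(now_s: float,
                 last_activity_s: Optional[float],
                 cooldown_s: float = DEFAULT_COOLDOWN_SECONDS) -> bool:
    """Return True if a scaling action may proceed at time `now_s`.

    Hypothetical helper: `last_activity_s` is when the previous scaling
    activity happened, or None if there has been none yet.
    """
    if last_activity_s is None:
        return True  # no earlier activity, so no cooldown in effect
    # Proceed only once the full cooldown period has elapsed.
    return now_s - last_activity_s >= cooldown_s


if __name__ == "__main__":
    print(should_scale(400.0, 200.0))                  # still cooling down
    print(should_scale(500.0, 200.0))                  # cooldown has elapsed
    print(should_scale(260.0, 200.0, cooldown_s=60))   # shorter cooldown
```

Passing a smaller `cooldown_s` mirrors the point about ready-to-use AMIs: if instances become active quickly, a shorter cooldown lets the group react sooner.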
125 00:04:44,800 --> 00:04:46,030 And of course you need to make sure 126 00:04:46,030 --> 00:04:48,550 to enable something like detailed monitoring for your ASG 127 00:04:48,550 --> 00:04:52,620 to get access to metrics every one minute, 128 00:04:52,620 --> 00:04:54,530 and to make sure that you have these metrics 129 00:04:54,530 --> 00:04:56,030 being updated fast enough. 130 00:04:56,030 --> 00:04:57,200 So that's it for this lecture. 131 00:04:57,200 --> 00:04:58,130 I hope you liked it, 132 00:04:58,130 --> 00:05:00,080 and I will see you in the next lecture.