1
00:00:00,220 --> 00:00:01,230
‫So now let's talk about

2
00:00:01,230 --> 00:00:03,200
‫auto scaling group scaling policies.

3
00:00:03,200 --> 00:00:04,250
‫And we have two different kinds,

4
00:00:04,250 --> 00:00:06,120
‫we have the dynamic scaling policies first.

5
00:00:06,120 --> 00:00:07,910
‫And so within the dynamic scaling policies,

6
00:00:07,910 --> 00:00:10,410
‫we have three kinds, we have the target tracking scaling,

7
00:00:10,410 --> 00:00:11,500
‫which is pretty easy.

8
00:00:11,500 --> 00:00:13,420
‫It's the simplest and easiest to set up.

9
00:00:13,420 --> 00:00:15,190
‫And the idea is that you wanna say, for example,

10
00:00:15,190 --> 00:00:18,830
‫I want to track the average CPU utilization

11
00:00:18,830 --> 00:00:21,840
‫of my Auto Scaling group across all my EC2 instances

12
00:00:21,840 --> 00:00:23,820
‫to stay at around 40%.

13
00:00:23,820 --> 00:00:26,640
‫This is when you want to have a default baseline

14
00:00:26,640 --> 00:00:29,010
‫and wanna make sure that you're always available.
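
To make the target tracking idea concrete, here is a minimal sketch in plain Python. It assumes a simplified proportional model (capacity scaled by the ratio of the current metric to the target); this is an illustration and not the algorithm AWS actually runs. The function name and the min_size and max_size defaults are hypothetical.

```python
import math


def target_tracking_desired_capacity(current_capacity, current_metric,
                                     target_metric, min_size=1, max_size=10):
    """Simplified model: scale capacity by the metric/target ratio.

    Assumption: the tracked metric (e.g. average CPU %) is roughly
    inversely proportional to the number of instances.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_capacity * current_metric / target_metric)
    # Clamp to the group's (hypothetical) size limits.
    return max(min_size, min(max_size, raw))


# 4 instances averaging 60% CPU, with a 40% target.
print(target_tracking_desired_capacity(4, 60.0, 40.0))
```

With a 40% target, a group running hotter than 40% gets a larger desired capacity, and a group running cooler gets a smaller one, always within the min and max bounds.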

15
00:00:29,010 --> 00:00:30,920
‫Simple and step scaling is more involved.

16
00:00:30,920 --> 00:00:33,050
‫So you set up your own CloudWatch alarms

17
00:00:33,050 --> 00:00:34,570
‫and when they're triggered, so for example,

18
00:00:34,570 --> 00:00:38,690
‫when the CPU goes over 70% for your ASG as a whole,

19
00:00:38,690 --> 00:00:40,900
‫then add two units of capacity.

20
00:00:40,900 --> 00:00:42,750
‫And then you would set up a second rule saying,

21
00:00:42,750 --> 00:00:46,760
‫hey, in case the CPU utilization goes to less than 30%

22
00:00:46,760 --> 00:00:50,320
‫as a whole within my ASG, then remove one unit.

23
00:00:50,320 --> 00:00:54,440
‫But you would have to set up your CloudWatch alarms

24
00:00:54,440 --> 00:00:56,080
‫as well as the steps which is,

25
00:00:56,080 --> 00:00:59,030
‫how many units you wanna add at a time

26
00:00:59,030 --> 00:01:01,510
‫and how many units you want to remove at a time.
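
The two rules just described (add two units above 70% CPU, remove one unit below 30%) can be sketched as a small decision function. This is only a plain Python illustration of the logic; in practice the thresholds live in CloudWatch alarms and the steps in the scaling policy. The function names and the size limits are hypothetical.

```python
def step_scaling_adjustment(avg_cpu_percent,
                            scale_out_threshold=70.0,
                            scale_in_threshold=30.0):
    """Return the capacity change for the ASG-wide average CPU."""
    if avg_cpu_percent > scale_out_threshold:
        return 2    # alarm "CPU over 70%": add two units of capacity
    if avg_cpu_percent < scale_in_threshold:
        return -1   # alarm "CPU under 30%": remove one unit
    return 0        # neither alarm is triggered


def apply_adjustment(current_capacity, adjustment, min_size=1, max_size=10):
    """Apply the change while staying within the group's size limits."""
    return max(min_size, min(max_size, current_capacity + adjustment))


print(apply_adjustment(3, step_scaling_adjustment(85.0)))
```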

27
00:01:01,510 --> 00:01:03,440
‫And finally scheduled actions,

28
00:01:03,440 --> 00:01:05,480
‫which is to anticipate scaling

29
00:01:05,480 --> 00:01:06,970
‫based on known usage patterns.

30
00:01:06,970 --> 00:01:08,170
‫For example, you're saying that,

31
00:01:08,170 --> 00:01:10,520
‫hey, I know that there's going to be a big event

32
00:01:10,520 --> 00:01:11,760
‫at 5:00 PM on Fridays

33
00:01:11,760 --> 00:01:13,550
‫because people are gonna be done with work

34
00:01:13,550 --> 00:01:15,610
‫and they're going to use my application,

35
00:01:15,610 --> 00:01:18,100
‫and therefore you want to increase the min capacity

36
00:01:18,100 --> 00:01:21,010
‫automatically of your ASG to 10 at 5:00 PM

37
00:01:21,010 --> 00:01:22,200
‫every single Friday.

38
00:01:22,200 --> 00:01:23,590
‫This is a scheduled action

39
00:01:23,590 --> 00:01:25,370
‫where you know scaling in advance.
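
As a rough sketch of the Friday 5:00 PM example, the snippet below models a scheduled action as a check against the clock. It is plain Python for illustration only; the function name and the starting min capacity of 2 are hypothetical, and the cron string is simply the usual cron-style way of writing "every Friday at 17:00".

```python
from datetime import datetime

# Cron-style recurrence: minute 0, hour 17, any day of month,
# any month, day-of-week 5 (Friday).
FRIDAY_5PM_CRON = "0 17 * * 5"


def min_capacity_after_tick(now, current_min, event_min=10):
    """If 'now' is Friday 17:00, the scheduled action fires and
    raises the min capacity; otherwise nothing changes."""
    # datetime.weekday(): Monday is 0, so Friday is 4.
    fires = now.weekday() == 4 and now.hour == 17 and now.minute == 0
    return event_min if fires else current_min


# 2024-01-05 was a Friday.
print(min_capacity_after_tick(datetime(2024, 1, 5, 17, 0), current_min=2))
```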

40
00:01:25,370 --> 00:01:26,830
‫And there's a new kind of scaling,

41
00:01:26,830 --> 00:01:28,670
‫which is called predictive scaling.

42
00:01:28,670 --> 00:01:29,760
‫So with predictive scaling,

43
00:01:29,760 --> 00:01:32,580
‫you continually have a forecast being made

44
00:01:32,580 --> 00:01:35,200
‫by the autoscaling service in AWS.

45
00:01:35,200 --> 00:01:36,440
‫And it will look at the load

46
00:01:36,440 --> 00:01:39,100
‫and will schedule scaling ahead.

47
00:01:39,100 --> 00:01:40,260
‫So what will happen is that

48
00:01:40,260 --> 00:01:43,460
‫the historical load is going to be analyzed over time

49
00:01:43,460 --> 00:01:46,250
‫and then a forecast is going to be created.

50
00:01:46,250 --> 00:01:48,190
‫And then based on that forecast,

51
00:01:48,190 --> 00:01:51,480
‫there will be scaling actions scheduled ahead of time,

52
00:01:51,480 --> 00:01:53,720
‫which is quite a cool way of doing scaling as well.

53
00:01:53,720 --> 00:01:55,120
‫And I think this is the future

54
00:01:55,120 --> 00:01:56,680
‫because this is machine learning powered,

55
00:01:56,680 --> 00:01:58,940
‫and this really is a hands-off approach

56
00:01:58,940 --> 00:02:01,370
‫to automatic scaling for ASG.

57
00:02:01,370 --> 00:02:04,140
‫So what are some good metrics to scale on? It's a big question.

58
00:02:04,140 --> 00:02:07,370
‫So it depends really on what your application is doing

59
00:02:07,370 --> 00:02:08,830
‫and how it's working,

60
00:02:08,830 --> 00:02:10,970
‫but usually here are a few.

61
00:02:10,970 --> 00:02:13,410
‫So number one is CPU utilization

62
00:02:13,410 --> 00:02:16,220
‫because every time your instances receive a request,

63
00:02:16,220 --> 00:02:18,410
‫usually they will do some sort of computation

64
00:02:18,410 --> 00:02:20,560
‫and so it will use some CPU.

65
00:02:20,560 --> 00:02:23,150
‫And so if you look at the average CPU utilization

66
00:02:23,150 --> 00:02:26,060
‫across all your instances and it goes higher,

67
00:02:26,060 --> 00:02:28,100
‫that means that your instances are being more utilized

68
00:02:28,100 --> 00:02:30,820
‫and so it would be a good metric to scale on.

69
00:02:30,820 --> 00:02:32,110
‫Another metric to scale on,

70
00:02:32,110 --> 00:02:34,000
‫it's more like application specific,

71
00:02:34,000 --> 00:02:35,930
‫but it is the request count per target,

72
00:02:35,930 --> 00:02:37,510
‫where, based on your testing,

73
00:02:37,510 --> 00:02:39,270
‫you know that your EC2 instances

74
00:02:39,270 --> 00:02:41,060
‫operate optimally at

75
00:02:41,060 --> 00:02:44,650
‫1000 requests per target at a time

76
00:02:44,650 --> 00:02:46,540
‫and so maybe this is the target you want to have

77
00:02:46,540 --> 00:02:48,680
‫for your scaling.

78
00:02:48,680 --> 00:02:49,870
‫So here's an example.

79
00:02:49,870 --> 00:02:52,920
‫You have an auto scaling group with three EC2 instances,

80
00:02:52,920 --> 00:02:56,250
‫and your load balancer is currently spreading the requests

81
00:02:56,250 --> 00:02:57,400
‫across all of them.

82
00:02:57,400 --> 00:02:59,890
‫So right now the value of the request

83
00:02:59,890 --> 00:03:01,850
‫count per target metric is three

84
00:03:01,850 --> 00:03:04,520
‫because each EC2 instance on average

85
00:03:04,520 --> 00:03:06,933
‫has three requests outstanding.
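
The arithmetic behind this example can be sketched in a few lines of Python. This follows the lecture's framing (an average of outstanding requests per instance) and is only an illustration; the function names are hypothetical, and the 4500-request figure is made up to show how a 1000-per-target goal would translate into an instance count.

```python
import math


def request_count_per_target(requests_per_instance):
    """Average number of requests across the instances in the group."""
    if not requests_per_instance:
        raise ValueError("need at least one instance")
    return sum(requests_per_instance) / len(requests_per_instance)


def instances_needed(total_requests, target_per_instance):
    """How many instances keep the per-target average at or under the target."""
    return max(1, math.ceil(total_requests / target_per_instance))


# Three instances with three requests each: the metric is 3.
print(request_count_per_target([3, 3, 3]))
# Hypothetical load of 4500 requests with a target of 1000 per instance.
print(instances_needed(4500, 1000))
```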

86
00:03:07,870 --> 00:03:10,080
‫Next, if your application is network bound,

87
00:03:10,080 --> 00:03:12,760
‫so for example, there's a lot of uploads and downloads

88
00:03:12,760 --> 00:03:15,380
‫and you know that network is going to be a bottleneck

89
00:03:15,380 --> 00:03:17,210
‫for your EC2 instances,

90
00:03:17,210 --> 00:03:20,290
‫then you may want to scale on the average network in or out

91
00:03:20,290 --> 00:03:23,430
‫to make sure that if you reach some certain threshold,

92
00:03:23,430 --> 00:03:25,710
‫then you're going to scale based on that.

93
00:03:25,710 --> 00:03:27,880
‫Or any custom metrics that you push to CloudWatch,

94
00:03:27,880 --> 00:03:29,400
‫you can set up your own metrics

95
00:03:29,400 --> 00:03:31,210
‫that are going to be application specific

96
00:03:31,210 --> 00:03:34,950
‫and based on that, you can set up your scaling policies.

97
00:03:34,950 --> 00:03:37,200
‫Now, something else you need to know about scaling policies

98
00:03:37,200 --> 00:03:38,960
‫is what's called the scaling cooldown.

99
00:03:38,960 --> 00:03:41,950
‫So the idea is that after there is a scaling activity,

100
00:03:41,950 --> 00:03:45,090
‫so whenever you add, or you remove instances,

101
00:03:45,090 --> 00:03:46,990
‫you are entering the cooldown period,

102
00:03:46,990 --> 00:03:50,140
‫which is by default five minutes or 300 seconds.

103
00:03:50,140 --> 00:03:51,907
‫And during that cooldown period,

104
00:03:51,907 --> 00:03:56,440
‫the ASG will not launch or terminate additional instances.

105
00:03:56,440 --> 00:03:59,120
‫And the reasoning behind this is that

106
00:03:59,120 --> 00:04:01,630
‫you allow for metrics to stabilize, okay,

107
00:04:01,630 --> 00:04:04,400
‫for your new instances to take effect

108
00:04:04,400 --> 00:04:07,200
‫and to see what the new metric will become.

109
00:04:07,200 --> 00:04:08,150
‫So the idea is that

110
00:04:08,150 --> 00:04:10,000
‫when there is a scaling action that occurs,

111
00:04:10,000 --> 00:04:13,050
‫the question is, is there a default cooldown in effect?

112
00:04:13,050 --> 00:04:14,560
‫If yes, then ignore the action.

113
00:04:14,560 --> 00:04:17,230
‫If no, then proceed with the scaling action

114
00:04:17,230 --> 00:04:19,440
‫which is to launch or terminate instances.
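
The cooldown check just described can be sketched as a tiny function: if a cooldown is still in effect, ignore the action, otherwise proceed. This is a plain Python illustration using the default of 300 seconds mentioned above; the function name and the use of plain second counters instead of real timestamps are assumptions for the example.

```python
DEFAULT_COOLDOWN_SECONDS = 300  # five minutes, the default mentioned above


def should_proceed(now_seconds, last_scaling_activity_seconds,
                   cooldown_seconds=DEFAULT_COOLDOWN_SECONDS):
    """Return True if a new scaling action may launch or terminate instances."""
    if last_scaling_activity_seconds is None:
        return True  # no previous scaling activity, so no cooldown applies
    elapsed = now_seconds - last_scaling_activity_seconds
    # Cooldown still in effect: ignore the action. Otherwise proceed.
    return elapsed >= cooldown_seconds


print(should_proceed(now_seconds=120, last_scaling_activity_seconds=0))
print(should_proceed(now_seconds=400, last_scaling_activity_seconds=0))
```

A shorter cooldown_seconds value makes the group react faster, which only makes sense when new instances become useful quickly.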

115
00:04:19,440 --> 00:04:22,400
‫And so my advice to you is to use a ready-to-use AMI

116
00:04:22,400 --> 00:04:26,260
‫to reduce the configuration time for your EC2 instances

117
00:04:26,260 --> 00:04:27,660
‫in order for them

118
00:04:27,660 --> 00:04:29,580
‫to be serving requests faster.

119
00:04:29,580 --> 00:04:32,810
‫So if you don't spend time configuring your EC2 instances,

120
00:04:32,810 --> 00:04:35,180
‫then they can be in effect right away.

121
00:04:35,180 --> 00:04:37,870
‫And then because they can be active way faster,

122
00:04:37,870 --> 00:04:39,570
‫then the cooldown period can be decreased,

123
00:04:39,570 --> 00:04:42,633
‫and you can have a more dynamic scaling up and down

124
00:04:42,633 --> 00:04:44,800
‫of your ASG.

125
00:04:44,800 --> 00:04:46,030
‫And of course you need to make sure

126
00:04:46,030 --> 00:04:48,550
‫to enable something like detailed monitoring for ASG

127
00:04:48,550 --> 00:04:52,620
‫to get access to more granular metrics every one minute,

128
00:04:52,620 --> 00:04:54,530
‫and to make sure that you have these metrics

129
00:04:54,530 --> 00:04:56,030
‫being updated fast enough.

130
00:04:56,030 --> 00:04:57,200
‫So that's it for this lecture.

131
00:04:57,200 --> 00:04:58,130
‫I hope you liked it,

132
00:04:58,130 --> 00:05:00,080
‫and I will see you in the next lecture.