1
00:00:00,220 --> 00:00:01,230
So now let's talk about

2
00:00:01,230 --> 00:00:03,200
auto scaling group scaling policies.

3
00:00:03,200 --> 00:00:04,250
And we have two different kinds,

4
00:00:04,250 --> 00:00:06,120
we have the dynamic scaling policies first.

5
00:00:06,120 --> 00:00:07,910
And so within the dynamic scaling policies,

6
00:00:07,910 --> 00:00:10,410
we have three kinds, we have the target tracking scaling,

7
00:00:10,410 --> 00:00:11,500
which is pretty easy.

8
00:00:11,500 --> 00:00:13,420
It's the simplest and easiest to set up.

9
00:00:13,420 --> 00:00:15,190
And the idea is that you wanna say, for example,

10
00:00:15,190 --> 00:00:18,830
I want to track the average CPU utilization

11
00:00:18,830 --> 00:00:21,840
of my Auto Scaling group across all my EC2 instances

12
00:00:21,840 --> 00:00:23,820
to stay at around 40%.

13
00:00:23,820 --> 00:00:26,640
This is when you want to have a default baseline

14
00:00:26,640 --> 00:00:29,010
and wanna make sure that you're always available.
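
As a rough sketch of what that target tracking policy could look like in code, the snippet below builds the arguments for boto3's `put_scaling_policy` call with a 40% average CPU target. The group name `my-asg` and the policy naming scheme are placeholders that are not from the lecture, and the call that actually talks to AWS is defined but never run here.

```python
def target_tracking_policy(asg_name: str, target_cpu: float = 40.0) -> dict:
    """Build keyword arguments for boto3's autoscaling put_scaling_policy."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Average CPU utilization across the whole group
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
        },
    }


def apply_policy(policy: dict) -> None:
    """Send the policy to AWS. Needs boto3 and credentials, so it is not called here."""
    import boto3

    boto3.client("autoscaling").put_scaling_policy(**policy)


policy = target_tracking_policy("my-asg")  # "my-asg" is a placeholder name
print(policy["TargetTrackingConfiguration"]["TargetValue"])
```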

15
00:00:29,010 --> 00:00:30,920
Simple and step scaling is more involved.

16
00:00:30,920 --> 00:00:33,050
So you set up your own CloudWatch alarms

17
00:00:33,050 --> 00:00:34,570
and when they're triggered, so for example,

18
00:00:34,570 --> 00:00:38,690
when the CPU goes over 70% for your ASG as a whole,

19
00:00:38,690 --> 00:00:40,900
then add two units of capacity.

20
00:00:40,900 --> 00:00:42,750
And then you would set up a second rule saying,

21
00:00:42,750 --> 00:00:46,760
hey, in case the CPU utilization goes to less than 30%

22
00:00:46,760 --> 00:00:50,320
as a whole within my ASG, then remove one unit.

23
00:00:50,320 --> 00:00:54,440
But you would have to set up your CloudWatch alarms

24
00:00:54,440 --> 00:00:56,080
as well as the steps which is,

25
00:00:56,080 --> 00:00:59,030
how many units you wanna add at a time

26
00:00:59,030 --> 00:01:01,510
and how many units you want to remove at a time.
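
To make the two rules concrete, here is a minimal sketch in plain Python of the decision they describe. It is only an illustration of the logic, not how the Auto Scaling service is implemented: in practice the thresholds live in CloudWatch alarms and the adjustments in the scaling policies. The function name is made up for this example.

```python
def capacity_change(avg_cpu_percent: float) -> int:
    """Return how many units to add (positive) or remove (negative)."""
    if avg_cpu_percent > 70:  # scale-out alarm: CPU over 70% for the ASG as a whole
        return 2
    if avg_cpu_percent < 30:  # scale-in alarm: CPU under 30% for the ASG as a whole
        return -1
    return 0  # neither alarm is firing, so capacity stays the same


for cpu in (85, 50, 20):
    print(cpu, capacity_change(cpu))
```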

27
00:01:01,510 --> 00:01:03,440
And finally scheduled actions,

28
00:01:03,440 --> 00:01:05,480
which is to anticipate scaling

29
00:01:05,480 --> 00:01:06,970
based on known usage patterns.

30
00:01:06,970 --> 00:01:08,170
For example, you're saying that,

31
00:01:08,170 --> 00:01:10,520
hey, I know that there's going to be a big event

32
00:01:10,520 --> 00:01:11,760
at 5:00 PM on Fridays

33
00:01:11,760 --> 00:01:13,550
because people are gonna be done with work

34
00:01:13,550 --> 00:01:15,610
and they're going to use my application,

35
00:01:15,610 --> 00:01:18,100
and therefore you want to increase the min capacity

36
00:01:18,100 --> 00:01:21,010
automatically of your ASG to 10 at 5:00 PM

37
00:01:21,010 --> 00:01:22,200
every single Friday.

38
00:01:22,200 --> 00:01:23,590
This is a scheduled action

39
00:01:23,590 --> 00:01:25,370
where you know scaling in advance.
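
A hedged sketch of the Friday example: the snippet below builds the arguments for boto3's `put_scheduled_update_group_action` with a cron-style recurrence. The group name and the action name are placeholders, and the recurrence is evaluated in UTC unless a `TimeZone` argument is also passed, so "5:00 PM" here is an assumption about which time zone you mean.

```python
def friday_scale_up(asg_name: str, min_size: int = 10) -> dict:
    """Build keyword arguments for boto3's put_scheduled_update_group_action."""
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": "friday-5pm-scale-up",  # hypothetical name
        "Recurrence": "0 17 * * 5",  # cron: minute 0, hour 17, every Friday
        "MinSize": min_size,
    }


def apply_schedule(action: dict) -> None:
    """Send the action to AWS. Needs boto3 and credentials, so it is not called here."""
    import boto3

    boto3.client("autoscaling").put_scheduled_update_group_action(**action)


action = friday_scale_up("my-asg")  # "my-asg" is a placeholder name
print(action["Recurrence"], action["MinSize"])
```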

40
00:01:25,370 --> 00:01:26,830
And there's a new kind of scaling,

41
00:01:26,830 --> 00:01:28,670
which is called predictive scaling.

42
00:01:28,670 --> 00:01:29,760
So with predictive scaling,

43
00:01:29,760 --> 00:01:32,580
you continually have a forecast being made

44
00:01:32,580 --> 00:01:35,200
by the autoscaling service in AWS.

45
00:01:35,200 --> 00:01:36,440
And it will look at the load

46
00:01:36,440 --> 00:01:39,100
and will schedule scaling ahead.

47
00:01:39,100 --> 00:01:40,260
So what will happen is that

48
00:01:40,260 --> 00:01:43,460
the historical load is going to be analyzed over time

49
00:01:43,460 --> 00:01:46,250
and then a forecast is going to be created.

50
00:01:46,250 --> 00:01:48,190
And then based on that forecast,

51
00:01:48,190 --> 00:01:51,480
there will be scaling actions scheduled ahead of time,

52
00:01:51,480 --> 00:01:53,720
which is quite a cool way of doing scaling as well.

53
00:01:53,720 --> 00:01:55,120
And I think this is the future

54
00:01:55,120 --> 00:01:56,680
because this is machine learning powered,

55
00:01:56,680 --> 00:01:58,940
and this really is a hands-off approach

56
00:01:58,940 --> 00:02:01,370
to automatic scaling for ASG.

57
00:02:01,370 --> 00:02:04,140
So which metrics are good to scale on is a big question.

58
00:02:04,140 --> 00:02:07,370
So it depends really on what your application is doing

59
00:02:07,370 --> 00:02:08,830
and how it's working,

60
00:02:08,830 --> 00:02:10,970
but usually here are a few.

61
00:02:10,970 --> 00:02:13,410
So number one is CPU utilization

62
00:02:13,410 --> 00:02:16,220
because every time your instances receive a request,

63
00:02:16,220 --> 00:02:18,410
usually they will do some sort of computation

64
00:02:18,410 --> 00:02:20,560
and so it will use some CPU.

65
00:02:20,560 --> 00:02:23,150
And so if you look at the average CPU utilization

66
00:02:23,150 --> 00:02:26,060
across all your instances and it goes higher,

67
00:02:26,060 --> 00:02:28,100
that means that your instances are being more utilized

68
00:02:28,100 --> 00:02:30,820
and so it would be a good metric to scale on.

69
00:02:30,820 --> 00:02:32,110
Another metric to scale on,

70
00:02:32,110 --> 00:02:34,000
it's more like application specific,

71
00:02:34,000 --> 00:02:35,930
but it is the request count per target,

72
00:02:35,930 --> 00:02:37,510
which is based on your testing.

73
00:02:37,510 --> 00:02:39,270
You know that your EC2 instances

74
00:02:39,270 --> 00:02:41,060
operate optimally at

75
00:02:41,060 --> 00:02:44,650
1,000 requests per target at a time

76
00:02:44,650 --> 00:02:46,540
and so maybe this is the target you want to have

77
00:02:46,540 --> 00:02:48,680
for your scaling.

78
00:02:48,680 --> 00:02:49,870
So here's an example.

79
00:02:49,870 --> 00:02:52,920
You have an auto scaling group with three EC2 instances,

80
00:02:52,920 --> 00:02:56,250
and your load balancer is currently spreading the requests

81
00:02:56,250 --> 00:02:57,400
across all of them.

82
00:02:57,400 --> 00:02:59,890
So right now the value of the request

83
00:02:59,890 --> 00:03:01,850
count per target metric is three

84
00:03:01,850 --> 00:03:04,520
because each EC2 instance on average

85
00:03:04,520 --> 00:03:06,933
has three requests outstanding.

86
00:03:07,870 --> 00:03:10,080
Next, if your application is network bound,

87
00:03:10,080 --> 00:03:12,760
so for example, there's a lot of uploads and downloads

88
00:03:12,760 --> 00:03:15,380
and you know that network is going to be a bottleneck

89
00:03:15,380 --> 00:03:17,210
for your EC2 instances,

90
00:03:17,210 --> 00:03:20,290
then you may want to scale on the average network in or out

91
00:03:20,290 --> 00:03:23,430
to make sure that if you reach a certain threshold,

92
00:03:23,430 --> 00:03:25,710
then you're going to scale based on that.

93
00:03:25,710 --> 00:03:27,880
Or any custom metrics that you push to CloudWatch,

94
00:03:27,880 --> 00:03:29,400
you can set up your own metrics

95
00:03:29,400 --> 00:03:31,210
that are going to be application specific

96
00:03:31,210 --> 00:03:34,950
and based on that, you can set up your scaling policies.

97
00:03:34,950 --> 00:03:37,200
Now, what else you need to know about scaling policies

98
00:03:37,200 --> 00:03:38,960
is what's called the scaling cooldown.

99
00:03:38,960 --> 00:03:41,950
So the idea is that after there is a scaling activity,

100
00:03:41,950 --> 00:03:45,090
so whenever you add, or you remove instances,

101
00:03:45,090 --> 00:03:46,990
you are entering the cooldown period,

102
00:03:46,990 --> 00:03:50,140
which is by default five minutes or 300 seconds.

103
00:03:50,140 --> 00:03:51,907
And during that cooldown period,

104
00:03:51,907 --> 00:03:56,440
the ASG will not launch or terminate additional instances.

105
00:03:56,440 --> 00:03:59,120
And the reasoning behind this is that

106
00:03:59,120 --> 00:04:01,630
you allow for metrics to stabilize, okay,

107
00:04:01,630 --> 00:04:04,400
for your new instances to take effect

108
00:04:04,400 --> 00:04:07,200
and to see what the new metric will become.

109
00:04:07,200 --> 00:04:08,150
So the idea is that

110
00:04:08,150 --> 00:04:10,000
when there is a scaling action that occurs,

111
00:04:10,000 --> 00:04:13,050
the question is, is there a default cooldown in effect?

112
00:04:13,050 --> 00:04:14,560
If yes, then ignore the action.

113
00:04:14,560 --> 00:04:17,230
If no, then proceed with the scaling action

114
00:04:17,230 --> 00:04:19,440
which is to launch or terminate instances.
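
That decision can be sketched in a few lines of plain Python. This is only a toy model of the flow just described, assuming the default 300-second cooldown; the class and method names are invented for the example and are not part of any AWS API.

```python
DEFAULT_COOLDOWN_SECONDS = 300  # five minutes, the default mentioned above


class CooldownGate:
    """Toy model: ignore scaling actions that arrive during the cooldown."""

    def __init__(self, cooldown: float = DEFAULT_COOLDOWN_SECONDS):
        self.cooldown = cooldown
        self.last_activity = None  # time of the last scaling activity, if any

    def try_scale(self, now: float) -> bool:
        """Return True if the action proceeds, False if it is ignored."""
        in_cooldown = (
            self.last_activity is not None
            and now - self.last_activity < self.cooldown
        )
        if in_cooldown:
            return False  # cooldown in effect: ignore the action
        self.last_activity = now  # launch or terminate, then restart the cooldown
        return True


gate = CooldownGate()
print(gate.try_scale(0))    # True: nothing has happened yet, so proceed
print(gate.try_scale(120))  # False: only 120 seconds since the last activity
print(gate.try_scale(300))  # True: the 300-second cooldown has elapsed
```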

115
00:04:19,440 --> 00:04:22,400
And so my advice to you is to use a ready-to-use AMI

116
00:04:22,400 --> 00:04:26,260
to reduce the configuration time for your EC2 instances

117
00:04:26,260 --> 00:04:27,660
in order for them

118
00:04:27,660 --> 00:04:29,580
to be serving the requests faster.

119
00:04:29,580 --> 00:04:32,810
So if you don't spend time configuring your EC2 instances,

120
00:04:32,810 --> 00:04:35,180
then they can be in effect right away.

121
00:04:35,180 --> 00:04:37,870
And then because they can be active way faster

122
00:04:37,870 --> 00:04:39,570
then the cooldown period can be decreased,

123
00:04:39,570 --> 00:04:42,633
and you can have a more dynamic scaling up and down

124
00:04:42,633 --> 00:04:44,800
of your ASG.

125
00:04:44,800 --> 00:04:46,030
And of course you need to make sure

126
00:04:46,030 --> 00:04:48,550
to enable something like detailed monitoring for ASG

127
00:04:48,550 --> 00:04:52,620
to get access to more granular metrics every one minute,

128
00:04:52,620 --> 00:04:54,530
and to make sure that you have these metrics

129
00:04:54,530 --> 00:04:56,030
being updated fast enough.

130
00:04:56,030 --> 00:04:57,200
So that's it for this lecture.

131
00:04:57,200 --> 00:04:58,130
I hope you liked it,

132
00:04:58,130 --> 00:05:00,080
and I will see you in the next lecture.