So let's talk about the advanced features that can come up at the exam for DynamoDB, and the first one is DynamoDB Accelerator, or DAX. This is a fully managed, highly available, and seamless in-memory cache for DynamoDB. The idea is that if you have a lot of reads on your DynamoDB table, you can create a DAX cluster to solve read congestion by caching the data, and with DAX you get microsecond latency for cached data, so this is something to look out for in terms of keywords at the exam. It doesn't require you to change any of your application logic, because the DAX cluster is compatible with the existing DynamoDB APIs. So you have your DynamoDB tables and your application, and you would just create a DAX cluster made of a few cache nodes, connect to this DAX cluster, and behind the scenes the DAX cluster is connected to your Amazon DynamoDB table. The cache has a default TTL of five minutes, but you can change this.

So you may ask me, why should I use DAX and not ElastiCache? Well, DAX sits in front of DynamoDB, and it's going to be very helpful for caching individual objects, or the results of query and scan operations. But if you want to store, say, aggregation results, then Amazon ElastiCache is a great way to do it, for example if you want to store a very big computation that you've done on top of Amazon DynamoDB. So they're not drop-in replacements for one another, they're actually complementary, but most of the time the caching solution on top of Amazon DynamoDB is simply going to be DynamoDB Accelerator, DAX.
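Just to make the "no application changes" point concrete, here is a minimal sketch, assuming the amazon-dax-client Python package's low-level client constructor; the cluster endpoint, table name, and key are hypothetical, and the only thing that changes compared to a plain boto3 DynamoDB client is how the client is built.

```python
# Minimal sketch: reading through DAX instead of DynamoDB directly.
# Assumes the amazon-dax-client package (pip install amazon-dax-client);
# the cluster endpoint, table name, and key below are hypothetical.
import botocore.session
from amazondax import AmazonDaxClient

session = botocore.session.get_session()

# The DAX client exposes the same get_item/query/scan calls as the regular
# DynamoDB client, so the rest of the application code stays the same.
dax = AmazonDaxClient(
    session,
    region_name="us-east-1",
    endpoints=["my-dax-cluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111"],
)

response = dax.get_item(
    TableName="Users",
    Key={"UserId": {"S": "user-123"}},  # served from cache when hot
)
print(response.get("Item"))
```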
You can also do stream processing on top of DynamoDB. The idea is that you want a stream of all the modifications that happen on your table, whether it be creates, updates, or deletes, and the use case for that would be, for example, to react to changes on your DynamoDB table in real time, for example to send a welcome email whenever you have a new user in your users table. Or you may want to do real-time usage analytics, insert data into a derivative table, implement cross-region replication, or invoke Lambda on any changes made to your DynamoDB table.

So there are two kinds of stream processing on DynamoDB. You have DynamoDB Streams, with 24 hours of retention and a limited number of consumers, which is great to use with Lambda triggers, or, if you want to read it yourself, with something called the DynamoDB Streams Kinesis Adapter. Or you can choose to send all your changes directly into a Kinesis data stream, and here you can have up to one year of retention, a much higher number of consumers, and a much higher number of ways to process the data, whether it be Lambda functions, Kinesis Data Analytics, Kinesis Data Firehose, Glue streaming ETL, and so on.
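To illustrate the Lambda trigger pattern just mentioned, here is a minimal sketch of a handler reacting to new items and sending a welcome email; the table shape, the Email attribute, and the SES sender address are assumptions for illustration, not part of the lecture.

```python
# Hypothetical Lambda handler attached to a DynamoDB stream on a Users table.
# It reacts only to INSERT events (new users) and sends a welcome email via SES.
import boto3

ses = boto3.client("ses")

def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue  # skip MODIFY and REMOVE events
        new_image = record["dynamodb"]["NewImage"]  # item in DynamoDB JSON format
        email = new_image["Email"]["S"]
        ses.send_email(
            Source="welcome@example.com",            # hypothetical verified sender
            Destination={"ToAddresses": [email]},
            Message={
                "Subject": {"Data": "Welcome!"},
                "Body": {"Text": {"Data": "Thanks for signing up."}},
            },
        )
```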
So let's have a look at an architectural diagram to understand DynamoDB Streams better. Your application does create, update, and delete operations on top of your DynamoDB table, and those changes go into either DynamoDB Streams or a Kinesis data stream. If you choose to use Kinesis data streams, you can then use Kinesis Data Firehose, and from there send your data into Amazon Redshift for analytics purposes, or into Amazon S3 if you want to archive some of the data, or into Amazon OpenSearch to do some indexing and searching on top of it. Or, if you are using DynamoDB Streams, you can have a processing layer, using either the DynamoDB KCL Adapter to run your applications on top of EC2 instances, or Lambda functions, and from there send notifications with SNS, do some filtering and transformations into another DynamoDB table, or use the processing layer to do whatever you want, such as, again, sending the data to Amazon OpenSearch. I didn't represent all the possible architectures. Of course, you can have EC2 instances reading from Kinesis data streams, you can have Kinesis Data Analytics on top of Kinesis data streams, and many, many transformations, but you know enough now to figure out what the right architecture is going to be at the right time.

DynamoDB also has a concept of global tables. A global table is a table that is going to be replicated across multiple regions. So you can have a table in us-east-1 and a table in ap-southeast-2, and there is two-way replication between the tables. That means you can write to the table in either us-east-1 or ap-southeast-2. The idea with global tables is to make DynamoDB accessible with low latency in multiple regions, and it is active-active replication, meaning your applications can read and write to the table in any of those regions. And to enable global tables, you must first enable DynamoDB Streams, because this is the underlying infrastructure used to replicate the table across regions.
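Here is a minimal sketch of what adding a replica region can look like with boto3's update_table call (the newer, streams-based global tables version); the table name is hypothetical, and the regions are just the ones from the example above.

```python
# Minimal sketch: turn a Users table in us-east-1 into a global table by
# adding an ap-southeast-2 replica (table name is hypothetical).
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Streams are the underlying replication mechanism, so enable them first
# (skip this call if the table already has a stream enabled).
dynamodb.update_table(
    TableName="Users",
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)

# Add the replica region (global tables version 2019.11.21 API).
dynamodb.update_table(
    TableName="Users",
    ReplicaUpdates=[{"Create": {"RegionName": "ap-southeast-2"}}],
)
```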
DynamoDB also has a feature named Time to Live, or TTL. The idea is that you want to automatically delete items after an expiry timestamp. So you have your session data table, and you add one last attribute called expiration time, which is your TTL attribute, and it holds a timestamp. You're going to define a TTL on that attribute, and as soon as the current time, as an epoch timestamp, is past the value in your expiration time attribute, the item is automatically expired and then eventually deleted through a deletion process. So the items in your table are going to be deleted after a while.

The use cases for this would be to reduce stored data by keeping only the most current items, or to adhere to regulatory obligations, for example by deleting data after two years, or, another use case that is very common at the exam, web session handling. A user logs into your website and gets a session, and you keep that session in a central place such as DynamoDB for two hours, storing the session data there so that any kind of application can access it. After two hours, if it hasn't been renewed, it will expire and be removed from the table.
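Here is a minimal sketch of that session pattern: enable TTL on a table and write an item that expires two hours from now. The table name, key attributes, and the expiration_time attribute name are assumptions for illustration.

```python
# Minimal sketch: enable TTL on a SessionData table and write a session item
# that DynamoDB will expire, and eventually delete, two hours from now.
# Table and attribute names are hypothetical.
import time
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Tell DynamoDB which attribute holds the expiry epoch timestamp.
dynamodb.update_time_to_live(
    TableName="SessionData",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expiration_time"},
)

# Store a session that expires two hours from now (epoch seconds).
dynamodb.put_item(
    TableName="SessionData",
    Item={
        "SessionId": {"S": "abc-123"},
        "UserId": {"S": "user-456"},
        "expiration_time": {"N": str(int(time.time()) + 2 * 60 * 60)},
    },
)
```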
You can also use DynamoDB for disaster recovery, so what are your backup options? Well, you can have continuous backups with point-in-time recovery, PITR. It's optionally enabled, and it covers the last 35 days. If you enable it, you can do a point-in-time recovery to any time within that backup window, and if you do happen to do a recovery, it will create a new table. If you want longer-term backups, you can use on-demand backups, which are retained until you delete them explicitly, and taking these kinds of backups does not affect the performance or the latency of your DynamoDB table. And if you want better management of your backups, you can use the AWS Backup service, which gives you lifecycle policies for your backups and so on, and also lets you copy your backups across regions for disaster recovery purposes. And again, if you were to do a recovery from one of these backups, it will create a new table.

Now let's talk about the integrations between DynamoDB and Amazon S3. You can export a table into Amazon S3, and to do so you must enable point-in-time recovery. You would export a DynamoDB table into S3, for example, if you wanted to run some queries on it, for example using the Amazon Athena engine. You can export at any point in time during the last 35 days, because you've enabled continuous backups, and doing an export does not affect the read capacity of your table or your performance. So with this you can perform, for example, data analysis on top of DynamoDB by exporting to Amazon S3 first. You can also use this to retain snapshots for auditing, or to do any kind of big transformation, maybe an ETL of the data, before importing it back into a new DynamoDB table. The format of the export can be DynamoDB JSON or the ION format. And similarly, you can import from Amazon S3: you can import data in CSV, DynamoDB JSON, or ION format from S3 into a new DynamoDB table. This does not consume any write capacity, and it will create a new table. If there are any import errors, they will be logged in CloudWatch Logs. A quick sketch of enabling PITR and kicking off an export follows right after the wrap-up below.

So that's it for DynamoDB. I hope you liked it, and I will see you in the next lecture.
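As promised above, here is a minimal sketch of the export flow: enable point-in-time recovery (the prerequisite for exports) and then export the table to S3 in DynamoDB JSON format. The table name, account ID in the ARN, and bucket name are placeholders for illustration.

```python
# Minimal sketch: enable point-in-time recovery (a prerequisite for exports)
# and export the table to S3 in DynamoDB JSON format.
# The table name, account ID, and bucket name are placeholders.
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

dynamodb.update_continuous_backups(
    TableName="Users",
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)

export = dynamodb.export_table_to_point_in_time(
    TableArn="arn:aws:dynamodb:us-east-1:123456789012:table/Users",
    S3Bucket="my-dynamodb-exports",
    ExportFormat="DYNAMODB_JSON",  # "ION" is the other supported format
)
print(export["ExportDescription"]["ExportStatus"])  # usually IN_PROGRESS at first
```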