1
00:00:00,200 --> 00:00:01,580
‫Okay, so just a short lecture

2
00:00:01,580 --> 00:00:04,690
‫on two DynamoDB operations you may get tested on.

3
00:00:04,690 --> 00:00:08,140
‫So the first one is around how to do a table cleanup.

4
00:00:08,140 --> 00:00:10,220
‫So to do so, you have two options.

5
00:00:10,220 --> 00:00:12,660
‫First, you can scan all the items in your table

6
00:00:12,660 --> 00:00:14,240
‫and then delete them one by one,

7
00:00:14,240 --> 00:00:15,410
‫which is very, very slow

8
00:00:15,410 --> 00:00:18,230
‫and can consume a lot of RCU on the scan operation

9
00:00:18,230 --> 00:00:19,970
‫and WCU on the delete operation,

10
00:00:19,970 --> 00:00:21,260
‫so it's expensive.

11
00:00:21,260 --> 00:00:23,600
‫Option two is much quicker,

12
00:00:23,600 --> 00:00:25,340
‫which is to drop the table.

13
00:00:25,340 --> 00:00:28,380
‫So drop it, remove it, and then recreate this table.

14
00:00:28,380 --> 00:00:30,660
‫So it's fast, efficient and cheap.

15
00:00:30,660 --> 00:00:33,030
‫And you just need to make sure you recreate this table

16
00:00:33,030 --> 00:00:36,020
‫with the correct settings just like the one before.
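[Editor's note] The drop-and-recreate cleanup described above could be sketched in Python as follows. This is a minimal sketch, not the lecturer's code. It assumes `client` behaves like a boto3 DynamoDB client, the function name `recreate_table` is hypothetical, and only the key schema and billing settings are carried over. Indexes, streams, TTL and tags are not recreated here.

```python
def recreate_table(client, table_name):
    """Empty a table by deleting it and creating it again with the same keys.

    `client` is assumed to behave like boto3.client("dynamodb").
    """
    desc = client.describe_table(TableName=table_name)["Table"]

    # Keep only the attribute definitions used by the table's own key schema;
    # definitions that belong to secondary indexes are dropped because this
    # sketch does not recreate the indexes.
    key_names = {key["AttributeName"] for key in desc["KeySchema"]}
    params = {
        "TableName": table_name,
        "KeySchema": desc["KeySchema"],
        "AttributeDefinitions": [
            attr for attr in desc["AttributeDefinitions"]
            if attr["AttributeName"] in key_names
        ],
    }

    # Carry over the billing settings of the old table.
    mode = desc.get("BillingModeSummary", {}).get("BillingMode", "PROVISIONED")
    if mode == "PAY_PER_REQUEST":
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        throughput = desc["ProvisionedThroughput"]
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": throughput["ReadCapacityUnits"],
            "WriteCapacityUnits": throughput["WriteCapacityUnits"],
        }

    # Drop the table, wait until it is gone, then create it again.
    client.delete_table(TableName=table_name)
    client.get_waiter("table_not_exists").wait(TableName=table_name)
    client.create_table(**params)
    client.get_waiter("table_exists").wait(TableName=table_name)
    return params
```

With real AWS credentials, this could be called as `recreate_table(boto3.client("dynamodb"), "my-table")`, where `my-table` is a placeholder name.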

17
00:00:36,020 --> 00:00:38,630
‫Now, if you wanted to copy a DynamoDB table

18
00:00:38,630 --> 00:00:41,670
‫across accounts, regions, places,

19
00:00:41,670 --> 00:00:42,503
‫there are three options.

20
00:00:42,503 --> 00:00:44,570
‫The first one is to use AWS Data Pipeline.

21
00:00:44,570 --> 00:00:46,610
‫And this is probably the only time in the exam

22
00:00:46,610 --> 00:00:47,730
‫you will see Data Pipeline,

23
00:00:47,730 --> 00:00:49,260
‫so I'm not spending much time on it.

24
00:00:49,260 --> 00:00:50,540
‫I just want to show you what it does.

25
00:00:50,540 --> 00:00:52,910
‫So in this case, we want to copy a DynamoDB table

26
00:00:52,910 --> 00:00:53,970
‫into another one.

27
00:00:53,970 --> 00:00:55,690
‫So Data Pipeline in the backend

28
00:00:55,690 --> 00:00:58,160
‫is going to launch an Amazon EMR Cluster.

29
00:00:58,160 --> 00:01:00,130
‫EMR will be reading from the DynamoDB Table

30
00:01:00,130 --> 00:01:02,208
‫using a scan operation and writing it back

31
00:01:02,208 --> 00:01:04,850
‫into Amazon S3 to store it.

32
00:01:04,850 --> 00:01:06,550
‫Then, on the second step,

33
00:01:06,550 --> 00:01:09,110
‫it will read back the data from Amazon S3

34
00:01:09,110 --> 00:01:12,810
‫and then insert it back into a new DynamoDB Table, okay?

35
00:01:12,810 --> 00:01:13,917
‫So this is Data Pipeline

36
00:01:13,917 --> 00:01:16,070
‫and this thing will be synchronizing

37
00:01:16,070 --> 00:01:19,560
‫and coordinating all these operations in the backend.

38
00:01:19,560 --> 00:01:21,290
‫Option two, which I like a lot more,

39
00:01:21,290 --> 00:01:23,950
‫is to do a backup of your DynamoDB Table

40
00:01:23,950 --> 00:01:25,860
‫and restore it into a new table,

41
00:01:25,860 --> 00:01:26,870
‫which takes some time,

42
00:01:26,870 --> 00:01:28,650
‫but it's more efficient and doesn't require

43
00:01:28,650 --> 00:01:30,700
‫any other external services.

44
00:01:30,700 --> 00:01:33,350
‫And number three is a little bit more tricky.

45
00:01:33,350 --> 00:01:34,770
‫You do a scan on your own,

46
00:01:34,770 --> 00:01:36,830
‫and then you do a PutItem or BatchWriteItem

47
00:01:36,830 --> 00:01:38,220
‫if you want to be more efficient.

48
00:01:38,220 --> 00:01:39,470
‫You have to write your own code,

49
00:01:39,470 --> 00:01:41,610
‫but you can do some transformations in the meantime.

50
00:01:41,610 --> 00:01:43,100
‫This is not the recommended way,

51
00:01:43,100 --> 00:01:46,330
‫but it's another option you could consider anyway, okay.
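[Editor's note] The scan-and-write copy described above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the lecturer's code. `source_table` and `target_table` are assumed to behave like boto3 `Table` resources, and the names `copy_items` and `transform` are hypothetical. The batch writer is used because it groups the puts into BatchWriteItem calls.

```python
def copy_items(source_table, target_table, transform=lambda item: item):
    """Copy every item from one table to another, page by page.

    Both tables are assumed to behave like boto3 Table resources.
    `transform` is an optional hook to change each item on the way.
    Returns the number of items written.
    """
    copied = 0
    scan_kwargs = {}
    # batch_writer() buffers the puts and sends them in batches.
    with target_table.batch_writer() as batch:
        while True:
            page = source_table.scan(**scan_kwargs)
            for item in page["Items"]:
                batch.put_item(Item=transform(item))
                copied += 1
            # A scan returns one page at a time; LastEvaluatedKey is
            # present only while more pages remain.
            last_key = page.get("LastEvaluatedKey")
            if last_key is None:
                return copied
            scan_kwargs["ExclusiveStartKey"] = last_key
```

With real tables, the arguments could come from `boto3.resource("dynamodb").Table("source-name")` and the same call for the target, where both table names are placeholders. The scan still consumes RCU on the source and the writes consume WCU on the target.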

52
00:01:46,330 --> 00:01:47,430
‫So that's it for this lecture.

53
00:01:47,430 --> 00:01:50,343
‫I hope you liked it and I will see you in the next lecture.