So now let's look at two ways we can use DynamoDB with Amazon S3. And the first one is how to store large objects in DynamoDB. Well, it turns out that, as you know, in your DynamoDB tables you can only store items of up to 400 kilobytes. So obviously, if you want to start storing images, videos, all that kind of stuff, DynamoDB is not the best place for it.

So instead, what we're going to do is have an Amazon S3 bucket that will contain our large objects. So what is the process to upload a large object? Well, say we upload an image into Amazon S3. We're going to get back an object key, and what we're going to do is store this metadata from the application into DynamoDB. So we'll have a product ID, a product name, and then an image URL, which is a pointer directly into Amazon S3. Now, what we've done is effectively store a very small amount of data in our products table in DynamoDB, and we store the large item in Amazon S3.
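As a minimal sketch of this write path with boto3, it could look like the following. The bucket name `products-media`, the table name `Products`, and the attribute names are all hypothetical, chosen just to illustrate the pattern:

```python
BUCKET = "products-media"  # hypothetical S3 bucket for the large objects
TABLE = "Products"         # hypothetical DynamoDB table for the small items


def build_product_item(product_id: str, name: str, image_key: str) -> dict:
    """Build the small DynamoDB item that points at the large S3 object."""
    return {
        "product_id": product_id,
        "product_name": name,
        # A pointer to the object in S3, not the image bytes themselves,
        # so the item stays far below the 400 KB limit.
        "image_url": f"s3://{BUCKET}/{image_key}",
    }


def upload_product(product_id: str, name: str, image_bytes: bytes) -> dict:
    """Store the large image in S3 and only a small pointer item in DynamoDB."""
    import boto3  # deferred so the pure helper above works without the AWS SDK

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table(TABLE)

    key = f"images/{product_id}.jpg"
    s3.put_object(Bucket=BUCKET, Key=key, Body=image_bytes)  # large object -> S3

    item = build_product_item(product_id, name, key)
    table.put_item(Item=item)                                # small metadata -> DynamoDB
    return item
```

On the read side, the client would do the reverse: `get_item` on the table first, then `get_object` with the stored key to fetch the image from S3.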
Now, from the reading perspective, a client who wants to read this data first gets the metadata from DynamoDB, and then gets the image back from Amazon S3 to reconstruct the large object. So we can go on and on and have many different products and use this strategy at scale. And the cool thing about this strategy is that we're using each service for what it's good at. So Amazon S3 is great for storing large objects, okay, and DynamoDB is great for storing small objects that are going to be indexed with specific attributes. So in this example, we have the perfect combination of Amazon S3 and DynamoDB.

Another combination, or synergy, we can have is to use DynamoDB as a way to index S3 object metadata. So the application is going to upload objects into Amazon S3, and Amazon S3 will have notifications set up, for example, to invoke a Lambda function. That Lambda function will store the object's metadata into a DynamoDB table: for example, the object size, the date, who created it, whatever you can think of about these objects. And why do we do this?
Well, because it's much easier for us to build queries on top of a DynamoDB table than on top of an S3 bucket. Again, an S3 bucket is not really meant to be scanned. It's meant to store large objects, and you're supposed to have some sort of database that knows what these objects are, their attributes, and so on. So by creating an application on top of DynamoDB, we can answer some questions such as, hey, we want to find objects by a specific timestamp in our S3 buckets. Or we want to find the total storage used by a customer, or list all the objects by certain attributes, or find all the S3 objects uploaded within a date range, all of this by querying the DynamoDB table. And then we read back the results from DynamoDB and retrieve the necessary objects from our S3 buckets.

Okay, so hopefully these two strategies make sense. They're quite common and they can come up in the exam. I hope you liked it, and I will see you in the next lecture.
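As a sketch of this second pattern, here is roughly what the Lambda indexer and one of those queries could look like. The table name `ObjectsIndex`, its key schema (bucket name as partition key, upload time as sort key), and the attribute names are all assumptions made for illustration:

```python
def build_metadata_item(record: dict) -> dict:
    """Extract indexable metadata from one S3 event notification record."""
    s3_info = record["s3"]
    return {
        "bucket": s3_info["bucket"]["name"],  # partition key (assumed schema)
        "uploaded_at": record["eventTime"],   # sort key (assumed schema)
        "object_key": s3_info["object"]["key"],
        "size_bytes": s3_info["object"]["size"],
    }


def lambda_handler(event: dict, context) -> None:
    """Invoked by the S3 notification; stores each object's metadata in DynamoDB."""
    import boto3  # deferred so the pure helpers run without the AWS SDK

    table = boto3.resource("dynamodb").Table("ObjectsIndex")  # hypothetical table
    for record in event.get("Records", []):
        table.put_item(Item=build_metadata_item(record))


def date_range_query_params(bucket: str, start: str, end: str) -> dict:
    """Query parameters for 'all objects uploaded to a bucket in a date range'.

    With uploaded_at as the sort key, this is a cheap Query on the index table
    instead of listing and filtering the whole S3 bucket.
    """
    return {
        "KeyConditionExpression": "#b = :b AND uploaded_at BETWEEN :s AND :e",
        "ExpressionAttributeNames": {"#b": "bucket"},
        "ExpressionAttributeValues": {":b": bucket, ":s": start, ":e": end},
    }
```

The "total storage used" question from the lecture would work the same way: query the items, then sum their `size_bytes` attribute application-side.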