1 00:00:00,000 --> 00:00:02,050 Quick theory lecture on S3 Select 2 00:00:02,050 --> 00:00:03,100 and Glacier Select. 3 00:00:03,100 --> 00:00:05,640 The idea is we want to retrieve less data, 4 00:00:05,640 --> 00:00:08,710 so subsets of what we're requesting using SQL 5 00:00:08,710 --> 00:00:10,970 by performing server-side filtering, 6 00:00:10,970 --> 00:00:12,930 and so the SQL queries are quite simple. 7 00:00:12,930 --> 00:00:15,280 They can only be used to filter by rows and columns 8 00:00:15,280 --> 00:00:16,980 so they're very simple SQL statements. 9 00:00:16,980 --> 00:00:19,380 You cannot do aggregations or anything like this, 10 00:00:19,380 --> 00:00:23,100 and you will use less network and less CPU cost client-side 11 00:00:23,100 --> 00:00:24,930 because you don't retrieve the full file. 12 00:00:24,930 --> 00:00:28,070 S3 will perform the select, the filtering for you 13 00:00:28,070 --> 00:00:30,140 and only return to you what you need. 14 00:00:30,140 --> 00:00:32,930 So the idea is that before, you have Amazon S3 15 00:00:32,930 --> 00:00:35,360 sending all the data into your application 16 00:00:35,360 --> 00:00:37,110 and then you have to filter it application-side 17 00:00:37,110 --> 00:00:38,400 to find the right rows you want 18 00:00:38,400 --> 00:00:40,250 and only keep the columns you want, 19 00:00:40,250 --> 00:00:44,540 and after, you request the data from S3 using S3 Select 20 00:00:44,540 --> 00:00:46,800 and it only gives you the data you need, 21 00:00:46,800 --> 00:00:48,810 the columns you want and the rows you want 22 00:00:48,810 --> 00:00:51,030 and the result Amazon is telling you is 23 00:00:51,030 --> 00:00:54,530 that you are up to 400% faster and up to 80% cheaper 24 00:00:54,530 --> 00:00:56,600 because you have less network traffic going through 25 00:00:56,600 --> 00:00:59,740 and the filtering happens server-side, okay. 26 00:00:59,740 --> 00:01:02,410 So similarly, let's just do another diagram. 
27 00:01:02,410 --> 00:01:04,489 We have the client asking to get a CSV file 28 00:01:04,489 --> 00:01:07,890 with S3 Select to only get a few columns and a few rows. 29 00:01:07,890 --> 00:01:09,980 Amazon S3 will perform server-side filtering 30 00:01:09,980 --> 00:01:12,370 on that CSV file to find the right columns 31 00:01:12,370 --> 00:01:13,450 and the rows we want 32 00:01:13,450 --> 00:01:16,870 and send the filtered data back to our client, 33 00:01:16,870 --> 00:01:20,370 so obviously, less network, less CPU, and faster, 34 00:01:20,370 --> 00:01:22,050 so this is great. 35 00:01:22,050 --> 00:01:23,710 To summarize from an exam perspective, 36 00:01:23,710 --> 00:01:26,340 any time you see filtering of data server-side 37 00:01:26,340 --> 00:01:31,140 in S3 to get less, think about S3 Select and Glacier Select, 38 00:01:31,140 --> 00:01:32,540 that works on Glacier as well 39 00:01:32,540 --> 00:01:34,530 and then for more complex querying, 40 00:01:34,530 --> 00:01:36,290 that's gonna be serverless on S3, 41 00:01:36,290 --> 00:01:37,330 you'll see in the future lectures 42 00:01:37,330 --> 00:01:39,390 we have something called Amazon Athena. 43 00:01:39,390 --> 00:01:40,223 All right. 44 00:01:40,223 --> 00:01:42,220 That's it. I will see you in the next lecture.
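
To make the CSV diagram described in the lecture concrete, here is a minimal Python sketch of what the client side might look like. It assumes the boto3 S3 client and its `select_object_content` call; the helper names (`build_select_request`, `collect_records`, `select_csv`) and the column names used in the usage note are hypothetical and not taken from the lecture. The client is passed in as a parameter, so the helpers themselves do not need AWS access.

```python
from typing import Any, Dict, Iterable


def build_select_request(bucket: str, key: str, expression: str) -> Dict[str, Any]:
    """Build keyword arguments for an S3 Select call that reads CSV and returns CSV."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Treat the first CSV line as a header so columns can be named in the SQL.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def collect_records(events: Iterable[Dict[str, Any]]) -> str:
    """Join the Records chunks of the response event stream into one string."""
    # Other event types in the stream (for example Stats or End) are skipped.
    chunks = [event["Records"]["Payload"] for event in events if "Records" in event]
    # Join the raw bytes before decoding so a character split across chunks survives.
    return b"".join(chunks).decode("utf-8")


def select_csv(client: Any, bucket: str, key: str, expression: str) -> str:
    """Ask S3 to filter the object server-side and return only the matching CSV data."""
    response = client.select_object_content(
        **build_select_request(bucket, key, expression)
    )
    return collect_records(response["Payload"])
```

With boto3 installed and credentials configured, a call could look like `select_csv(boto3.client("s3"), "my-example-bucket", "data/users.csv", "SELECT s.name, s.city FROM S3Object s WHERE s.country = 'FR'")`. The bucket, key, and columns are placeholders. Only the selected columns of the matching rows travel over the network, which is the saving the lecture describes.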