1 00:00:00,360 --> 00:00:01,859 Hello, you beautiful people. 2 00:00:01,859 --> 00:00:02,950 And welcome back. 3 00:00:02,969 --> 00:00:08,310 Now, in the last video, we put our files into a tarball, and now we want to learn how to compress 4 00:00:08,310 --> 00:00:11,790 that tarball to save space on our hard drive. 5 00:00:11,820 --> 00:00:17,280 Now, compression happens using compression algorithms, and there are two main compression algorithms 6 00:00:17,280 --> 00:00:25,020 in use in the Linux world: the first is the gzip algorithm and the second is bzip2. 7 00:00:25,290 --> 00:00:31,470 And the main difference between them is that gzip is usually faster but has less compression power, and 8 00:00:31,470 --> 00:00:31,710 bzip2, 9 00:00:31,710 --> 00:00:37,620 on the other hand, can generally compress files to a smaller size than gzip, but does require more 10 00:00:37,620 --> 00:00:38,850 computational time. 11 00:00:38,850 --> 00:00:41,670 Both of these compression algorithms are really easy to use. 12 00:00:41,670 --> 00:00:44,250 So let's just start by talking about gzip. 13 00:00:44,700 --> 00:00:51,390 So to compress a tarball using gzip, all you need to do is type gzip and then the archive name. 14 00:00:51,390 --> 00:00:59,730 So, ourarchive.tar, and press enter, and we'll see that the file has been compressed in place 15 00:01:00,900 --> 00:01:04,500 using the gzip algorithm, and we can see that it has: 16 00:01:04,500 --> 00:01:12,120 if we use the ls command here, we can see that the .gz file extension has automatically been added, 17 00:01:12,120 --> 00:01:14,970 and we can use the file command to confirm that. 18 00:01:14,970 --> 00:01:22,680 Yep, this is indeed gzip compressed data, and it was ourarchive.tar, and it tells you when it was last 19 00:01:22,680 --> 00:01:24,240 modified as well. 20 00:01:24,930 --> 00:01:25,830 So that's really cool. 
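As a rough sketch of the gzip step just described: the scratch directory and the generated file contents below are assumptions made for illustration, not the actual files from the video; only the file names mirror it.

```shell
# Hypothetical setup: work in a throwaway directory with three made-up text files.
cd "$(mktemp -d)"
seq 1 2000 > file1.txt
seq 1 2000 > file2.txt
seq 1 2000 > file3.txt

# Step 1: bundle the files into an uncompressed tarball.
tar -cf ourarchive.tar file1.txt file2.txt file3.txt

# Step 2: compress in place. gzip replaces ourarchive.tar with ourarchive.tar.gz.
gzip ourarchive.tar

# The .gz extension has been added automatically.
ls -l

# The file command is not installed on every minimal system, so check for it first.
if command -v file >/dev/null; then
  file ourarchive.tar.gz
fi
```

Note that after the gzip step the plain ourarchive.tar no longer exists; only the compressed copy remains.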
21 00:01:25,830 --> 00:01:27,600 We can see quite a lot of information about that. 22 00:01:27,600 --> 00:01:29,760 Now let's take a look at the file size. 23 00:01:29,760 --> 00:01:34,680 So if I do the ls command with, actually, just the -l option, 24 00:01:35,850 --> 00:01:36,510 we'll see that 25 00:01:36,510 --> 00:01:40,410 file1, file2 and file3 are each 10,000 bytes in size. 26 00:01:40,410 --> 00:01:49,680 But our archive, the compressed archive, is only 23,000 bytes, so it's about 7,000 bytes, or seven kilobytes, 27 00:01:49,680 --> 00:01:52,950 less than what the files would have been originally. 28 00:01:52,950 --> 00:01:57,540 So we've actually managed a compression of about 23%. 29 00:01:57,540 --> 00:01:58,470 So that's not bad. 30 00:01:58,470 --> 00:02:01,860 Now this will vary from file type to file type, depending on what you're actually compressing. 31 00:02:01,860 --> 00:02:05,760 But in this case, we've got a compression of 23% with gzip. 32 00:02:05,820 --> 00:02:09,030 So notice how compression is a two-stage thing. 33 00:02:09,030 --> 00:02:16,050 First we make a tarball, and then we compress that tarball with a compression algorithm, which in this 34 00:02:16,050 --> 00:02:17,190 case was gzip. 35 00:02:17,640 --> 00:02:18,180 Okay, cool. 36 00:02:18,180 --> 00:02:20,280 So how about reversing the gzip? 37 00:02:20,280 --> 00:02:22,200 How can we get the tarball back? 38 00:02:22,320 --> 00:02:23,700 Well, this is very easy. 39 00:02:23,700 --> 00:02:27,030 All we need to do is, let's just clear the screen, 40 00:02:27,030 --> 00:02:30,030 gunzip ourarchive.tar.gz. 41 00:02:30,720 --> 00:02:35,220 Okay, so you just use gunzip on whatever it was. 42 00:02:35,220 --> 00:02:37,830 And when you do that, you see it's now back to a tar file. 43 00:02:37,830 --> 00:02:40,440 And if we look, we can see that it's a tar file. 
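The size comparison and the gunzip round trip can be sketched roughly like this. The file contents are made up, so the percentage printed for them will differ from the 23% seen in the video; the variable names original and compressed are just illustrative.

```shell
# Hypothetical setup: throwaway directory with three made-up text files.
cd "$(mktemp -d)"
seq 1 2000 > file1.txt
seq 1 2000 > file2.txt
seq 1 2000 > file3.txt
tar -cf ourarchive.tar file1.txt file2.txt file3.txt
gzip ourarchive.tar

# Compare the total size of the raw files with the size of the compressed archive.
original=$(cat file1.txt file2.txt file3.txt | wc -c)
compressed=$(wc -c < ourarchive.tar.gz)
echo "saved $(( (original - compressed) * 100 / original ))% of $((original)) bytes"

# The same arithmetic with the round numbers quoted in the video:
# three 10,000-byte files against a 23,000-byte archive.
echo $(( (30000 - 23000) * 100 / 30000 ))   # prints 23

# Reverse the compression: gunzip turns ourarchive.tar.gz back into ourarchive.tar.
gunzip ourarchive.tar.gz
ls -l ourarchive.tar
```

Shell arithmetic is integer-only, so the result is rounded down to a whole percent.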
44 00:02:40,440 --> 00:02:46,290 We're using the file command on ourarchive.tar, and it tells you it's a POSIX tar archive. 45 00:02:46,320 --> 00:02:46,950 Hooray. 46 00:02:46,950 --> 00:02:50,610 Okay, so we've managed to undo the gzip operation. 47 00:02:51,000 --> 00:02:53,160 Okay, so what about bzip2? 48 00:02:53,190 --> 00:02:58,140 Well, bzip2 is a more heavy-duty compression algorithm, and it takes more computational time 49 00:02:58,140 --> 00:03:05,430 than gzip, but usually, not always, but usually, returns more compressed files than you get with gzip. 50 00:03:05,640 --> 00:03:06,840 So it's a bit of a tradeoff. 51 00:03:06,870 --> 00:03:09,960 Now let's compress our tarball with bzip2 and see what we get. 52 00:03:10,170 --> 00:03:11,130 So it's very easy. 53 00:03:11,130 --> 00:03:17,940 All we have to do is bzip2 ourarchive.tar, and when we do that, we now see that it's gone and got 54 00:03:17,940 --> 00:03:25,200 the .bz2 file extension, and if we use the file command on it, we can see that it says it's bzip2 55 00:03:25,200 --> 00:03:26,190 compressed data. 56 00:03:26,190 --> 00:03:28,500 It doesn't tell you what it was before it was compressed. 57 00:03:28,500 --> 00:03:30,720 So we lose a bit of that information. 58 00:03:30,720 --> 00:03:34,680 But it has indeed been compressed, and the file command knows that it has. 59 00:03:34,680 --> 00:03:35,570 So how big is it? 60 00:03:35,580 --> 00:03:41,010 Well, if we take a look with the -l option, we can just have a look right there and we can see that 61 00:03:41,010 --> 00:03:44,280 this is 23,137 bytes. 62 00:03:44,280 --> 00:03:49,680 So it's slightly larger than the gzip version at 23,023 bytes. 63 00:03:49,710 --> 00:03:53,970 Now, bzip2 is usually best saved until you have really large files. 64 00:03:53,970 --> 00:03:58,560 Now, for these relatively small text files, it didn't do as much good to use bzip2. 
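A minimal sketch of comparing the two compressors on the same tarball, assuming bzip2 is installed (it is a separate package on some systems) and using made-up file contents, so the sizes will not match the numbers quoted in the video.

```shell
# Hypothetical setup: throwaway directory with three made-up text files.
cd "$(mktemp -d)"
seq 1 2000 > file1.txt
seq 1 2000 > file2.txt
seq 1 2000 > file3.txt
tar -cf ourarchive.tar file1.txt file2.txt file3.txt

# -c writes the compressed data to standard output, so the tarball stays in place
# and we can keep a gzip copy for comparison.
gzip -c ourarchive.tar > ourarchive.tar.gz

# bzip2 works in place, like gzip: ourarchive.tar becomes ourarchive.tar.bz2.
bzip2 ourarchive.tar

# Compare the two compressed sizes side by side.
ls -l ourarchive.tar.gz ourarchive.tar.bz2

# bunzip2 reverses the bzip2 step and gives the tarball back.
bunzip2 ourarchive.tar.bz2
ls -l ourarchive.tar
```

Which of the two comes out smaller depends on the data, which is why it is worth checking with ls -l rather than assuming.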
65 00:03:58,560 --> 00:04:03,060 But on a large file of compressible data, you'll usually see a bigger difference between the compression 66 00:04:03,060 --> 00:04:05,550 results achieved by bzip2 and gzip. 67 00:04:05,550 --> 00:04:08,610 Now, to undo a bzip2 compression, it's actually very easy. 68 00:04:08,610 --> 00:04:13,500 All we've got to do is bunzip2 and then whatever the bzip2 archive is. 69 00:04:13,500 --> 00:04:16,200 So in our case, that's ourarchive.tar.bz2. 70 00:04:16,200 --> 00:04:20,850 And when we press enter, we see that we've come back to the tar file, and again we can use the file 71 00:04:20,850 --> 00:04:24,870 command to check, and we've got our tar archive back. 72 00:04:25,410 --> 00:04:29,850 Now there are other compression algorithms out there, such as the xz compression algorithm. 73 00:04:29,850 --> 00:04:34,740 And if you want to see a more in-depth comparison between the performance of gzip, bzip2 and 74 00:04:34,740 --> 00:04:39,660 the xz algorithm, I've put a link in the resources section for you to find out. 75 00:04:39,660 --> 00:04:44,520 Now there is something that you should bear in mind, however, which is: how do you create zip files? 76 00:04:44,520 --> 00:04:51,570 Now, zip files are files that are commonly used on Windows or Mac as compressed 77 00:04:51,570 --> 00:04:52,230 archives. 78 00:04:52,230 --> 00:04:54,060 Now they're not, 79 00:04:54,430 --> 00:04:58,350 I don't think they are as compressed as what you'll get with gzip or bzip2. 80 00:04:58,350 --> 00:04:59,340 But they, 81 00:04:59,390 --> 00:05:03,140 they are commonly used on other operating systems. 82 00:05:03,140 --> 00:05:07,520 So if you want to create a zip file, it's actually a one-step process. 83 00:05:07,520 --> 00:05:08,870 You'll just type zip. 84 00:05:08,890 --> 00:05:09,300 Okay? 85 00:05:09,470 --> 00:05:11,180 Then you'll put the file name. 
86 00:05:11,180 --> 00:05:17,300 So let's call it ourthing.zip, and then we'll put file1.txt, file2.txt and 87 00:05:17,300 --> 00:05:18,290 file3.txt. 88 00:05:18,320 --> 00:05:22,820 And when we press enter, we'll see that we now have a zip file in there. 89 00:05:22,820 --> 00:05:27,770 So if I run file on ourthing, sorry, ourthing.zip, 90 00:05:27,800 --> 00:05:33,890 it tells us that it's Zip archive data, and this is what you can use on other operating systems such 91 00:05:33,890 --> 00:05:34,430 as Windows. 92 00:05:34,430 --> 00:05:40,880 And you usually can't with gzipped files or bzip2 compressed files. 93 00:05:41,510 --> 00:05:44,780 They don't tend to work too well on other operating systems by default. 94 00:05:44,780 --> 00:05:49,940 You tend to have to install other software like 7-Zip or WinRAR or whatever it might be to extract 95 00:05:49,940 --> 00:05:50,270 them. 96 00:05:50,270 --> 00:05:56,750 And of course you can just unzip that zip file, and it'll ask you, do you want to extract them all? 97 00:05:56,750 --> 00:05:58,010 And then you can extract them all. 98 00:05:58,220 --> 00:06:02,540 It's just asking us because it was like, oh, you might have to overwrite some stuff, but 99 00:06:02,540 --> 00:06:03,110 that's no problem. 100 00:06:03,110 --> 00:06:04,400 We just said yes, and that worked. 101 00:06:04,400 --> 00:06:08,390 So if you want to make zip files, zip and unzip are the way to go. 102 00:06:08,660 --> 00:06:14,060 So if we clear the screen now: so far we've been looking at compression as a two-step process. 103 00:06:14,300 --> 00:06:19,340 First, we've been creating a tarball using the tar command and then compressing that tarball using 104 00:06:19,340 --> 00:06:23,660 either gzip or bzip2 or, you know, any other compression algorithm. 105 00:06:24,080 --> 00:06:26,840 But is there any way to do it in one step? 
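A rough sketch of the zip and unzip steps described above. It assumes the zip and unzip commands are installed, which is not a given on every Linux system, so the sketch checks first. The file contents and the extracted directory name are made up for illustration; extracting into a separate directory with -d is a choice made here to sidestep the overwrite prompt seen in the video.

```shell
# Hypothetical setup: throwaway directory with three made-up text files.
cd "$(mktemp -d)"
seq 1 2000 > file1.txt
seq 1 2000 > file2.txt
seq 1 2000 > file3.txt

if command -v zip >/dev/null && command -v unzip >/dev/null; then
  # zip archives and compresses in one step: archive name first, then the files.
  zip ourthing.zip file1.txt file2.txt file3.txt

  # List what is inside without extracting anything.
  unzip -l ourthing.zip

  # Extract into a separate directory so nothing in the current one is overwritten.
  mkdir extracted
  unzip ourthing.zip -d extracted
else
  echo "zip and unzip are not installed here; they are usually separate packages"
fi
```

Unlike gzip and bzip2, zip leaves the original files where they are.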
106 00:06:26,930 --> 00:06:31,070 Well, because creating backups is such a common thing to do, the answer is yes. You can actually 107 00:06:31,070 --> 00:06:36,380 create a tarball and compress it using a compression algorithm straight from within the tar command. 108 00:06:36,380 --> 00:06:39,230 So let's just delete our tarball for now, and our zip file. 109 00:06:39,230 --> 00:06:43,940 So ourarchive.tar, we're going to remove that, and ourthing.zip. 110 00:06:43,940 --> 00:06:48,830 So we're going to get rid of every archive that we've got, and then we're going to take it right 111 00:06:48,830 --> 00:06:54,080 from just the raw data here, these raw files, straight to a compressed archive in one go. 112 00:06:54,080 --> 00:06:54,560 Okay. 113 00:06:54,680 --> 00:07:00,050 So to create the tarball itself, how would we do that? 114 00:07:00,050 --> 00:07:06,530 Well, what we do is tar -cvf, so create, verbose, and f to give it the archive file name. 115 00:07:06,800 --> 00:07:14,300 We're going to create ourarchive.tar, there we go, and files 1 to 3. 116 00:07:14,300 --> 00:07:15,980 So this is the way that we normally do it. 117 00:07:16,670 --> 00:07:21,620 But now, to compress the files, we just need to give it one more option. 118 00:07:22,100 --> 00:07:27,080 So to do it with gzip, you just give it the z option. 119 00:07:27,770 --> 00:07:34,280 And if we do that, that will compress using gzip, and we should remember that instead of naming 120 00:07:34,280 --> 00:07:35,390 it ourarchive.tar, 121 00:07:35,570 --> 00:07:41,540 because this is going to be a gzip file, we should add .gz at the end as well, just for convention, 122 00:07:41,540 --> 00:07:47,330 because it won't be added automatically when using this method. When you ran gzip on it before, it was 123 00:07:47,330 --> 00:07:50,300 added automatically, the .gz file extension. 
124 00:07:50,300 --> 00:07:55,250 But now, because we're doing it all in one step, we have to provide the file name, so we need to add 125 00:07:55,250 --> 00:07:55,730 it on there. 126 00:07:55,730 --> 00:07:58,010 Just so that, you know, it's a bit easier to see what's going on. 127 00:07:58,010 --> 00:08:04,640 And when we press enter, we'll see that ourarchive.tar.gz has been created, and 128 00:08:04,640 --> 00:08:10,430 we can check that it is in fact a gzip file using the file command, and it says that it is gzip compressed 129 00:08:10,430 --> 00:08:11,090 data. 130 00:08:11,090 --> 00:08:12,170 So that's super cool. 131 00:08:12,200 --> 00:08:15,260 Okay, now how would you do it using bzip2? 132 00:08:15,260 --> 00:08:18,590 Well, if we clear the screen, bzip2 is exactly the same. 133 00:08:18,590 --> 00:08:21,620 Except instead of z you just have j. 134 00:08:22,370 --> 00:08:27,830 So if you have j, and now remember, we're going to change our extension to .bz2. 135 00:08:27,950 --> 00:08:33,650 And when we press enter, we'll see that ourarchive.tar.bz2 has been created, and we can check that 136 00:08:33,650 --> 00:08:38,870 that is indeed valid by using the file command, telling it to look at the bzip2 archive, and it tells 137 00:08:38,870 --> 00:08:41,570 us that it's bzip2 compressed data. 138 00:08:41,570 --> 00:08:44,420 So that's how you can compress in one go. 139 00:08:44,780 --> 00:08:49,520 How would we go about extracting from those compressed archives in one go? 140 00:08:49,520 --> 00:08:50,720 Well, that's really easy as well. 141 00:08:50,720 --> 00:08:51,890 Okay, let's delete them again. 142 00:08:51,900 --> 00:08:53,480 Actually, no, let's not delete them, because we'll need them. 143 00:08:53,480 --> 00:08:53,780 Right. 144 00:08:53,780 --> 00:08:56,990 So how would we normally extract from a tar file? 
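The one-step creation just described can be sketched as follows. The file contents are made up for illustration, and the j variant assumes bzip2 is installed, since tar hands the compression over to it.

```shell
# Hypothetical setup: throwaway directory with three made-up text files.
cd "$(mktemp -d)"
seq 1 2000 > file1.txt
seq 1 2000 > file2.txt
seq 1 2000 > file3.txt

# c = create, v = verbose, z = compress with gzip, f = the archive file name follows.
# The .gz extension is not added for us, so we spell out the full name.
tar -cvzf ourarchive.tar.gz file1.txt file2.txt file3.txt

# Same command with j in place of z to compress with bzip2 instead.
tar -cvjf ourarchive.tar.bz2 file1.txt file2.txt file3.txt

ls -l
```

The f option goes last in the group because the very next argument is taken as the archive name.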
145 00:08:56,990 --> 00:09:04,460 Well, how we'd normally do it, if you remember, is tar then -xvf, so x for extract, v for verbose 146 00:09:04,460 --> 00:09:08,060 and f to accept a file, and you'd give it the tar file. 147 00:09:09,050 --> 00:09:11,510 But let's try and extract the gzip one first. 148 00:09:11,510 --> 00:09:17,090 Let's delete these original files and let's try and extract the gzip one first. 149 00:09:17,090 --> 00:09:22,670 So to extract the gzip one, all we've got to do is type the z again, because z is for gzip. 150 00:09:22,670 --> 00:09:27,170 Okay, so you had the z when compressing, you have the z when extracting. All you've got to do 151 00:09:27,170 --> 00:09:30,080 is change the c to an x and you're good to go. 152 00:09:30,080 --> 00:09:35,570 So when we press enter, we'll see that now we've extracted them all out again and we've got the data 153 00:09:35,570 --> 00:09:36,050 back. 154 00:09:36,140 --> 00:09:39,020 So if we delete these files here, if we just delete them there. 155 00:09:39,020 --> 00:09:41,420 Now how about extracting using bzip2? 156 00:09:41,450 --> 00:09:44,080 Well, you remember that j is for bzip2. 157 00:09:44,090 --> 00:09:48,890 So instead of having z, let's just put the j, and then we just need to tell it what the archive is, 158 00:09:48,890 --> 00:09:52,070 and remember that it's a bzip2 archive, so we need to actually change the name. 159 00:09:52,460 --> 00:09:58,630 So we are tar extracting, verbose, through bzip2, and then giving it the file, and 160 00:09:58,720 --> 00:09:59,080 there we are. 161 00:09:59,080 --> 00:10:00,340 We've extracted using bzip2. 162 00:10:00,340 --> 00:10:01,420 So it's very, very easy. 163 00:10:01,420 --> 00:10:03,610 As long as you can remember that z is for gzip and 164 00:10:03,640 --> 00:10:05,740 j is for bzip2, 165 00:10:05,770 --> 00:10:09,160 you're sorted, and you can extract and compress all in one go. 
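A minimal sketch of the one-step extraction described above, with made-up file contents. Instead of deleting the originals as in the video, this version extracts into separate directories with -C (the directory names from_gz and from_bz2 are hypothetical), so the results can be compared against the originals. The j variant assumes bzip2 is installed.

```shell
# Hypothetical setup: throwaway directory, three made-up files, two compressed archives.
cd "$(mktemp -d)"
seq 1 2000 > file1.txt
seq 1 2000 > file2.txt
seq 1 2000 > file3.txt
tar -czf ourarchive.tar.gz file1.txt file2.txt file3.txt
tar -cjf ourarchive.tar.bz2 file1.txt file2.txt file3.txt

mkdir from_gz from_bz2

# x = extract, v = verbose, z = decompress with gzip, f = the archive file name follows.
tar -xvzf ourarchive.tar.gz -C from_gz

# Swap z for j to decompress with bzip2.
tar -xvjf ourarchive.tar.bz2 -C from_bz2

# cmp prints nothing when two files are identical.
cmp file1.txt from_gz/file1.txt
```

The only change from the creation commands is c becoming x; the z or j stays the same.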
166 00:10:09,700 --> 00:10:10,280 Hurray. 167 00:10:10,300 --> 00:10:12,760 So we've seen so much in the last two videos. 168 00:10:12,760 --> 00:10:15,760 It's quite hilarious that a section on compression would take so long. 169 00:10:15,760 --> 00:10:16,180 Right. 170 00:10:16,480 --> 00:10:18,220 Anyway, let's have a quick recap. 171 00:10:18,220 --> 00:10:21,280 So you first learned about the concept of a tarball. 172 00:10:21,280 --> 00:10:26,860 Now tar files, or tarballs, are just containers for you to store the files that you want to compress. 173 00:10:27,340 --> 00:10:30,610 Now you can create tarballs using the tar command. 174 00:10:30,610 --> 00:10:33,820 And tarballs do not really do any compression on their own. 175 00:10:34,060 --> 00:10:35,770 That's the job of a compression algorithm. 176 00:10:35,770 --> 00:10:42,040 The tarball just contains the files, and you saw how to add files into tarballs and how to extract 177 00:10:42,040 --> 00:10:44,500 files from tarballs using the tar command. 178 00:10:44,500 --> 00:10:48,700 And remember that there's a cheat sheet in the resources section for this video that will list out the 179 00:10:48,700 --> 00:10:51,370 different options and different ways of using the tar command. 180 00:10:51,400 --> 00:10:57,400 Now, once you have the tarball, in order to compress it, you need to use a compression algorithm, so 181 00:10:57,400 --> 00:11:00,610 tarballs can be compressed using a variety of different compression algorithms. 182 00:11:00,610 --> 00:11:07,330 And we took a look at gzip and bzip2, which are common options on Linux, but xz is 183 00:11:07,330 --> 00:11:07,810 another option. 
184 00:11:07,810 --> 00:11:10,960 We didn't discuss it here, but if you want to read more about the comparison between the different 185 00:11:10,960 --> 00:11:15,280 algorithms, I've also put a link to an interesting blog post in the resources section for this video 186 00:11:15,460 --> 00:11:16,630 for you to have a look at as well. 187 00:11:17,650 --> 00:11:24,220 But with regard to gzip and bzip2, gzip tends to be faster but give less compression, whereas 188 00:11:24,220 --> 00:11:27,700 bzip2 tends to give more compression but take a bit more computation time. 189 00:11:27,910 --> 00:11:29,290 So that's just something to bear in mind. 190 00:11:29,290 --> 00:11:33,850 But by all means check out the link in the resources section for a more in-depth discussion. 191 00:11:34,450 --> 00:11:41,470 You also saw how you can create zip files for compatibility with things like Windows or Mac using the 192 00:11:41,470 --> 00:11:42,820 zip and unzip commands. 193 00:11:42,820 --> 00:11:50,140 So that's something you've got on record as well, and you saw how to not just do 194 00:11:50,500 --> 00:11:52,870 making the tarball and then compressing in two steps. 195 00:11:52,870 --> 00:11:58,390 You saw how to do archiving and compression in just one step using a variety of shortcuts, and all that 196 00:11:58,390 --> 00:12:02,830 stuff is in the cheat sheet that you can find in the resources section for this video. 197 00:12:02,830 --> 00:12:06,730 If you ever need a refresher, it's all very simple, and once you look at the cheat sheet, 198 00:12:06,730 --> 00:12:08,110 you'll probably be able to see the pattern. 199 00:12:08,800 --> 00:12:11,950 But it's really quite simple stuff once you've done it once or twice. 
200 00:12:11,950 --> 00:12:18,250 Okay, so all in all, you can now create your own archives and backups using the 201 00:12:18,250 --> 00:12:23,650 Linux command line, which is super awesome, because this section is all about learning how to use files. 202 00:12:23,650 --> 00:12:26,770 Now when you've created your files, you've edited them, you've copied and pasted them, you've moved 203 00:12:26,770 --> 00:12:28,360 them, you've done whatever you want with them, 204 00:12:28,360 --> 00:12:32,830 you can also create your own backups and save them for later in a space-efficient way. 205 00:12:32,830 --> 00:12:33,910 So well done, you. 206 00:12:33,910 --> 00:12:39,910 Now in the next video, we're going to be closing our discussion for 207 00:12:39,910 --> 00:12:42,610 this section on files and the file system. 208 00:12:42,610 --> 00:12:46,180 And I want to say that you've really come so very far in this section. 209 00:12:46,180 --> 00:12:50,350 In the next video, I want to take the time to have a recap of what we've learned in this section and 210 00:12:50,350 --> 00:12:52,060 also to celebrate your progress. 211 00:12:52,060 --> 00:12:56,980 So well done, you. Now, for a summary of this section of the course, as well as a well-deserved set of 212 00:12:56,980 --> 00:12:57,700 praises, 213 00:12:57,700 --> 00:12:59,770 I'll see you in the next video.