0 1 00:00:13,810 --> 00:00:14,520 Hello everyone. 1 2 00:00:14,530 --> 00:00:21,130 Welcome to another video of the expert malware analysis and reverse engineering course. In this video we will dive further 2 3 00:00:21,490 --> 00:00:25,880 into some of the more detailed sections of a PE file. 3 4 00:00:26,260 --> 00:00:32,260 So if you'll recall from our previous video, we looked at this high-level simplified diagram of a PE 4 5 00:00:32,310 --> 00:00:39,880 file, what the headers and sections look like. So in the headers, if you see, we have four different parts: 5 6 00:00:39,880 --> 00:00:44,270 the DOS header, the PE header, the optional header and the section table. 6 7 00:00:44,630 --> 00:00:51,180 So in this particular video, we are going to look closer into the optional header and the section 7 8 00:00:51,190 --> 00:00:51,760 header. 8 9 00:00:51,850 --> 00:00:59,110 We have already understood pretty much what the DOS and the PE headers look like. To understand the rest, 9 10 00:00:59,200 --> 00:01:05,080 I want to use a very interesting diagram that is present on Wikipedia. 10 11 00:01:05,080 --> 00:01:13,560 Here is the link to the diagram. I have also provided that link in the description/resource section of this video. 11 12 00:01:13,720 --> 00:01:19,930 If you look at this diagram in the browser, it is probably one of the best diagrams that you can find 12 13 00:01:19,930 --> 00:01:22,790 about portable executable files. 13 14 00:01:22,810 --> 00:01:25,250 So let me expand it a little bit. 14 15 00:01:26,140 --> 00:01:35,030 So if you look at this diagram, it pretty much starts with the DOS header, which begins with the standard signature of an exe 15 16 00:01:35,030 --> 00:01:41,030 file, which is 5A4D ("MZ"), and then you have the DOS stub, which contains the message "This program cannot be run 16 17 00:01:41,030 --> 00:01:41,980 in DOS mode." 17 18 00:01:42,110 --> 00:01:45,960 Pretty much the standard stuff which we covered in the previous video.
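A quick aside on that 5A4D value, as a minimal sketch using only Python's standard struct module: the file actually starts with the two bytes 4D 5A, which are the ASCII letters "MZ", and hex viewers that show them as one little-endian 16-bit word display 5A4D. The variable names here are just for illustration.

```python
import struct

# The first two bytes of the DOS header are the ASCII characters "MZ".
dos_magic_bytes = bytes([0x4D, 0x5A])

# Read as one little-endian 16-bit word, the same two bytes give 0x5A4D.
(dos_magic_word,) = struct.unpack("<H", dos_magic_bytes)

print(dos_magic_bytes.decode("ascii"))  # the signature as text
print(hex(dos_magic_word))              # the signature as a 16-bit word
```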
18 19 00:01:46,130 --> 00:01:53,010 Then we have the signature of the PE file which is 50 45 00 00. 19 20 00:01:53,090 --> 00:01:56,640 Then you have the machine information, the number of sections and so on. 20 21 00:01:56,650 --> 00:02:00,460 So this was something which we had already covered in the previous section. 21 22 00:02:00,530 --> 00:02:07,410 Now going forward we want to spend some time on the other parts of the file format as well, so you 22 23 00:02:07,410 --> 00:02:09,440 can see here the yellow part. 23 24 00:02:09,450 --> 00:02:12,050 These are basically the standard COFF fields. 24 25 00:02:12,330 --> 00:02:19,530 So again, if you remember our discussion from the previous video, we mentioned that the Windows executable 25 26 00:02:19,530 --> 00:02:26,330 file formats are built on top of the COFF file format, which is the Common Object File Format. 26 27 00:02:26,400 --> 00:02:29,460 So this is what this diagram actually shows. 27 28 00:02:29,580 --> 00:02:31,050 You have the COFF format 28 29 00:02:31,050 --> 00:02:37,510 and on top of it you have the PE file format layered to create an executable format for the Windows 29 30 00:02:37,580 --> 00:02:37,900 environment. 30 31 00:02:38,520 --> 00:02:45,030 So this is how a standard file format is used to create proprietary formats for different 31 32 00:02:45,030 --> 00:02:53,060 kinds of operating systems. If you look at the standard COFF fields, you have the magic, which basically 32 33 00:02:53,060 --> 00:03:02,150 tells whether it's a 32-bit or 64-bit executable. Then you have the size of the code section. 33 34 00:03:02,150 --> 00:03:09,800 Then you have the size of initialized data, the amount of space that the executable's initialized data will use, and 34 35 00:03:10,250 --> 00:03:14,180 the uninitialized data size. Then you have the address of entry point. 35 36 00:03:14,300 --> 00:03:20,420 So address of entry point is the address where the Windows loader will begin execution.
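To make those standard fields a little more concrete, here is a rough sketch of how they could be decoded with Python's struct module. It assumes the field order from the PE/COFF layout (magic, linker versions, size of code, size of initialized data, size of uninitialized data, address of entry point, base of code). The function name and the sample values are made up for illustration; they do not come from a real file.

```python
import struct

# Standard (COFF) fields at the start of the optional header, little-endian:
# Magic (2 bytes), MajorLinkerVersion (1), MinorLinkerVersion (1), then five
# 4-byte fields: SizeOfCode, SizeOfInitializedData, SizeOfUninitializedData,
# AddressOfEntryPoint, BaseOfCode.
STANDARD_FIELDS = struct.Struct("<HBBIIIII")

# The two magic values that distinguish 32-bit from 64-bit images.
MAGIC_NAMES = {0x10B: "PE32 (32-bit)", 0x20B: "PE32+ (64-bit)"}


def parse_standard_fields(data: bytes) -> dict:
    """Decode the standard fields from the first 24 bytes of an optional header."""
    (magic, major_linker, minor_linker, size_of_code, size_of_init_data,
     size_of_uninit_data, entry_point_rva, base_of_code) = STANDARD_FIELDS.unpack_from(data)
    return {
        "format": MAGIC_NAMES.get(magic, "unknown"),
        "linker_version": (major_linker, minor_linker),
        "size_of_code": size_of_code,
        "size_of_initialized_data": size_of_init_data,
        "size_of_uninitialized_data": size_of_uninit_data,
        "address_of_entry_point": entry_point_rva,
        "base_of_code": base_of_code,
    }


# Hypothetical sample values, packed the same way they would sit in a file.
sample = STANDARD_FIELDS.pack(0x10B, 14, 0, 0x1000, 0x800, 0, 0x1234, 0x1000)
fields = parse_standard_fields(sample)
for name, value in fields.items():
    print(name, value)
```

Tools like the ones used later in this video do essentially this kind of fixed-layout decoding, just for every header in the file.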
36 37 00:03:20,420 --> 00:03:26,160 This is basically a virtual address, relative to the image base, which tells the Windows loader: 37 38 00:03:26,180 --> 00:03:27,980 this is the location from where 38 39 00:03:28,080 --> 00:03:34,470 this particular executable has to be run. Consider it the entry point. 39 40 00:03:34,470 --> 00:03:39,890 When you load an executable in a debugger, this is typically the location where the debugger will come and 40 41 00:03:40,160 --> 00:03:41,270 halt. 41 42 00:03:41,300 --> 00:03:47,210 It might sound a little bit crazy right now, but as we move forward with our PE file analysis, all 42 43 00:03:47,210 --> 00:03:55,440 these fields will make a lot of sense. We then have base of code and base of data, image base, section 43 44 00:03:55,440 --> 00:04:00,330 alignment, and a lot of other fields like the operating system version. 44 45 00:04:00,330 --> 00:04:08,310 So this basically tells which operating system versions are supported for this PE file. We have the image version 45 46 00:04:08,310 --> 00:04:11,470 as well, then the subsystem version. 46 47 00:04:11,490 --> 00:04:16,890 So if you see here, all these fields are very specific to the Windows environment. 47 48 00:04:16,890 --> 00:04:23,040 So these fields can be specifically populated if you want to restrict the exe to run on certain 48 49 00:04:23,040 --> 00:04:26,100 versions of Windows and things like that. 49 50 00:04:26,130 --> 00:04:33,170 So if you come down to the data directories, you have some interesting fields again. The export table 50 51 00:04:33,200 --> 00:04:34,920 is basically 51 52 00:04:35,030 --> 00:04:37,970 a table of exported functions. 52 53 00:04:38,100 --> 00:04:41,660 Then you have the import table, which is a table of imported functions. 53 54 00:04:41,660 --> 00:04:48,710 Then the resources that are used by the exe, and then you have the exception table.
54 55 00:04:48,710 --> 00:04:54,410 So if you have exception handlers which are particular to the executable, they are described 55 56 00:04:54,410 --> 00:04:56,180 in this particular area. 56 57 00:04:56,180 --> 00:04:58,130 Then you have certificate related information. 57 58 00:04:58,130 --> 00:05:03,950 So whenever you sign a binary you can basically extract the certificate related information from this 58 59 00:05:03,950 --> 00:05:06,050 particular part of the PE file. 59 60 00:05:06,170 --> 00:05:10,990 You have debug related information, architecture data, config data. 60 61 00:05:11,150 --> 00:05:17,540 Then you have the import address table. The import address table (IAT) is something which stores the runtime addresses 61 62 00:05:17,600 --> 00:05:18,910 of imported functions. 62 63 00:05:19,220 --> 00:05:26,690 So if you want to use any Windows-specific function, then its address has to 63 64 00:05:26,690 --> 00:05:28,000 be present in the IAT, 64 65 00:05:28,010 --> 00:05:34,820 the import address table. We will be working on the import address table in quite some detail as we move into the other 65 66 00:05:35,150 --> 00:05:38,570 sections on malware analysis of these files. 66 67 00:05:40,010 --> 00:05:41,640 So that's about the data directories. 67 68 00:05:41,640 --> 00:05:45,680 And at the end you have the section table. 68 69 00:05:45,680 --> 00:05:53,070 So the section table contains information related to the various sections available in the image 69 70 00:05:53,070 --> 00:05:54,970 of an executable file. 70 71 00:05:55,260 --> 00:06:00,870 The sections in the image are sorted by their relative virtual addresses rather than alphabetically. 71 72 00:06:00,870 --> 00:06:07,380 So here you can get information about the virtual size, the virtual address and other important details 72 73 00:06:07,500 --> 00:06:10,670 of each section of the PE file.
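As a small sketch of what one entry in that section table looks like, assuming the usual 40-byte section header layout (an 8-byte name, virtual size, virtual address, raw size, raw pointer, four relocation and line-number fields, then characteristics), this is one way to decode it in Python. The helper name and the sample ".text" values are hypothetical.

```python
import struct

# One section table entry: 8-byte name, VirtualSize, VirtualAddress,
# SizeOfRawData, PointerToRawData, PointerToRelocations, PointerToLinenumbers,
# NumberOfRelocations, NumberOfLinenumbers, Characteristics (40 bytes total).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def parse_section_header(data: bytes, offset: int = 0) -> dict:
    """Decode a single section header starting at the given offset."""
    (name, virtual_size, virtual_address, size_of_raw_data, pointer_to_raw_data,
     _reloc_ptr, _lineno_ptr, _reloc_count, _lineno_count,
     characteristics) = SECTION_HEADER.unpack_from(data, offset)
    return {
        # Names shorter than 8 bytes are padded with NUL bytes.
        "name": name.rstrip(b"\x00").decode("ascii", errors="replace"),
        "virtual_size": virtual_size,
        "virtual_address": virtual_address,  # an RVA
        "size_of_raw_data": size_of_raw_data,
        "pointer_to_raw_data": pointer_to_raw_data,  # a file offset
        "characteristics": characteristics,
    }


# Hypothetical entry for a code section: readable, executable, contains code.
sample = SECTION_HEADER.pack(b".text", 0x0F00, 0x1000, 0x1000, 0x400,
                             0, 0, 0, 0, 0x60000020)
section = parse_section_header(sample)
print(section["name"], hex(section["virtual_address"]), hex(section["virtual_size"]))
```

Reading the whole table is then just a loop over this function, stepping 40 bytes at a time for the number of sections given in the file header.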
73 74 00:06:10,770 --> 00:06:18,900 So there are a lot of different fields inside the header of a PE file, or inside the PE structure. 74 75 00:06:19,770 --> 00:06:25,950 You basically have to spend time on each of them to really understand what they mean. 75 76 00:06:30,550 --> 00:06:38,110 So coming back to our PPT, let's now jump to a demo. If you remember the previous example, we used 76 77 00:06:38,120 --> 00:06:46,500 CFF Explorer to basically look at the hex representation of a PE file, where we looked at the 77 78 00:06:46,520 --> 00:06:52,610 different parts: the MZ header, the DOS stub and the PE signature. 78 79 00:06:52,730 --> 00:06:58,100 So let us look at the other information which CFF Explorer provides us. 79 80 00:06:58,310 --> 00:07:07,370 So if you start looking here, you have information like the DOS header, which is something we already looked 80 81 00:07:07,370 --> 00:07:14,320 at. Then we have the NT headers, the file header and then the optional header. In the optional header you'll 81 82 00:07:14,370 --> 00:07:19,920 see all those fields which we just saw in the diagram on Wikipedia. So you can see here you 82 83 00:07:19,920 --> 00:07:27,660 have the magic bytes, the major linker version, the minor linker version, size of code, address of entry point, 83 84 00:07:27,690 --> 00:07:36,180 base of code, image base. So all these fields are the same ones which we looked at 84 85 00:07:36,300 --> 00:07:38,940 in the previous image. 85 86 00:07:38,940 --> 00:07:47,100 So if you see here, what CFF Explorer did was it basically parsed all those parts of the PE file 86 87 00:07:47,460 --> 00:07:51,480 and extracted the offsets and values and listed them over here. 87 88 00:07:51,720 --> 00:08:00,750 So this is how these security tools understand the nature of a PE file or the properties of a PE file.
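To give a feel for how a tool might find those offsets, here is a minimal sketch in Python. It is not how CFF Explorer is actually implemented; it only assumes the documented layout: the offset of the PE signature is stored at 0x3C in the DOS header, the 20-byte file header follows the 4-byte signature, the optional header follows that, and the section table starts right after the optional header. The function name and the synthetic sample buffer are made up for illustration.

```python
import struct

# File header: Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
# NumberOfSymbols, SizeOfOptionalHeader, Characteristics (20 bytes total).
FILE_HEADER = struct.Struct("<HHIIIHH")


def locate_headers(data: bytes) -> dict:
    """Return the file offsets of the main PE header structures."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte field at 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    file_header_offset = pe_offset + 4
    (machine, number_of_sections, _timestamp, _symbol_ptr, _symbol_count,
     size_of_optional_header, _characteristics) = FILE_HEADER.unpack_from(
        data, file_header_offset)
    optional_header_offset = file_header_offset + FILE_HEADER.size
    return {
        "pe_signature_offset": pe_offset,
        "file_header_offset": file_header_offset,
        "optional_header_offset": optional_header_offset,
        "section_table_offset": optional_header_offset + size_of_optional_header,
        "machine": machine,
        "number_of_sections": number_of_sections,
    }


# Synthetic buffer (not a runnable executable): a 64-byte DOS header whose last
# field points at 0x80, padding in place of the DOS stub, the PE signature,
# a file header, and a zero-filled optional header of 0xE0 bytes.
dos_header = b"MZ" + b"\x00" * 58 + struct.pack("<I", 0x80)
padding = b"\x00" * (0x80 - len(dos_header))
file_header = FILE_HEADER.pack(0x014C, 3, 0, 0, 0, 0xE0, 0x0102)
sample = dos_header + padding + b"PE\x00\x00" + file_header + b"\x00" * 0xE0

offsets = locate_headers(sample)
for name, value in offsets.items():
    print(name, hex(value))
```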
88 89 00:08:00,990 --> 00:08:07,620 So that was my whole motive: to help you understand the PE structure so that you can really make more 89 90 00:08:07,620 --> 00:08:11,010 sense of all these values that have been listed here. 90 91 00:08:11,280 --> 00:08:13,710 So you have the data directories. 91 92 00:08:13,770 --> 00:08:19,800 So if I bring you back to this example, the data directory is this particular part. 92 93 00:08:19,800 --> 00:08:22,460 So those are the values that you will see here. 93 94 00:08:22,470 --> 00:08:30,630 The relative virtual addresses: you have the export directory and import directory addresses. So everything 94 95 00:08:30,630 --> 00:08:33,950 that is marked RVA is basically a relative virtual address. 95 96 00:08:33,960 --> 00:08:38,410 We will see in more detail in our next video what exactly an RVA means. 96 97 00:08:39,270 --> 00:08:43,860 So then you have the section headers. 97 98 00:08:44,000 --> 00:08:50,100 So if you come to the section headers here, you will see all the sections of the PE file, which contain 98 99 00:08:50,460 --> 00:08:53,620 the other data which is important for the PE file. 99 100 00:08:53,780 --> 00:09:00,240 So what happens is, in the PE file you have this header structure, and after the structure you have a bunch of 100 101 00:09:00,240 --> 00:09:08,870 sections which contain the other relevant contents of the PE file. For example, the .text section is normally 101 102 00:09:08,870 --> 00:09:14,030 the first section and contains the executable code for the application. All the executable code 102 103 00:09:14,030 --> 00:09:20,900 of the PE file is present in the .text section. The .data section contains the initialized data of an 103 104 00:09:20,900 --> 00:09:23,030 application, such as strings and so on. 104 105 00:09:23,030 --> 00:09:30,890 So whatever value you're initializing in the PE file is stored in the .data section.
The .rdata section 105 106 00:09:31,790 --> 00:09:38,060 is basically the read-only data section, which is commonly where the import tables are located in the 106 107 00:09:38,630 --> 00:09:46,520 PE file. And at the end, you have the .rsrc section, which is the common name for the resource 107 108 00:09:46,520 --> 00:09:52,610 container section, which contains things like images or other such additional resources that you are 108 109 00:09:52,700 --> 00:10:01,910 using inside the PE file. So I want to look at another example, 010 Editor, which is very similar 109 110 00:10:01,910 --> 00:10:03,410 to this particular example. 110 111 00:10:03,410 --> 00:10:11,750 So if you load up a PE file in 010 Editor: it's a pretty neat hex editor and it has a bunch of templates to 111 112 00:10:11,750 --> 00:10:18,110 parse different file formats. It has executable file format templates as well. 112 113 00:10:18,350 --> 00:10:24,830 So if you load an executable file and run the template, it basically analyzes 113 114 00:10:24,920 --> 00:10:31,190 all these sections and populates all the section values over here. Very similar to how CFF Explorer 114 115 00:10:31,190 --> 00:10:31,750 does it. 115 116 00:10:31,910 --> 00:10:36,880 I just wanted to show you another tool so that you have one more example that you can refer to later. 116 117 00:10:37,140 --> 00:10:46,970 So if you see here, it has the MZ signature, then it has the size, then if you keep moving down you have 117 118 00:10:46,970 --> 00:10:53,570 the DOS header over here, then if you keep moving down you have these NT headers. 118 119 00:10:53,570 --> 00:10:56,660 So this particular section, this is where our optional header lies. 119 120 00:10:56,660 --> 00:11:02,140 So if you expand it, you will see that you have the optional header.
If you expand the optional header you will 120 121 00:11:02,150 --> 00:11:07,670 start seeing all those same values which we just looked at in CFF Explorer and the Wikipedia 121 122 00:11:07,700 --> 00:11:08,230 diagram. 122 123 00:11:08,580 --> 00:11:15,230 So yeah, the major linker version, minor linker version, the size of code, base of code and things like that. 123 124 00:11:15,590 --> 00:11:22,070 So if you understand what the PE file format looks like, you can pretty much make sense of all this data. 124 125 00:11:22,070 --> 00:11:25,200 So this was all about the PE file's structure. 125 126 00:11:25,400 --> 00:11:32,180 There will be just one more video on PE files, and that will be our last video, and it will basically 126 127 00:11:32,210 --> 00:11:38,330 give you a good idea of what happens when you execute a file in a Windows environment. 127 128 00:11:38,330 --> 00:11:39,260 Thanks for watching.