Before I start the blog (I actually already started), I want to say that I'm a lousy coder: my code is a mess and only I can read it, but the important thing is that it works :D.
Recently I was working on a project involving PE files, where I had to parse a PE file to extract some information. Since I have previous experience with PE files, it was supposed to be a smooth ride. But I hit a road bump when I reached the resource section. I don't know why Microsoft had to make it so messy. I searched the internet for information about the structure, but what I found was too vague, hard to understand, or in most cases incomplete (or I guess I don't know how to google stuff).
But somehow I was able to parse the resource section and extract all the resources. So I decided to blog about it, in the hope that it helps some confused person like me. Here are my findings about the resource section and how to parse it easily in any language. It may help you if you are trying to learn the file structure or write a program that deals with resources in a PE file, so... let's go.
We'll go step by step through parsing the resource section.
Step 1. First of all, we need to find where the resources are located in the file. It may seem easier to just look for the .rsrc section, but section names are only a convention: some files have no section with that name, and a packed or malware-infected file may have it renamed, so you won't find anything. To avoid this, we'll go the legit way, i.e. looking at the Data Directories in the PE file. These give an easy way to locate areas of the PE file that are important in one way or another. Each Data Directory entry has the following structure:
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
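Since VirtualAddress in this structure is a relative virtual address (RVA), reading the resources from a file on disk means mapping that RVA to a raw file offset through the section table. Here is a minimal sketch of that conversion. SectionInfo and rva_to_file_offset are hypothetical names made up for this example; the three fields mirror the ones with the same names in IMAGE_SECTION_HEADER, and the sketch assumes you have already read the section headers yourself.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, trimmed-down view of one section header: only the three
   fields needed for the conversion (names follow IMAGE_SECTION_HEADER). */
typedef struct {
    uint32_t VirtualAddress;   /* RVA where the section starts in memory */
    uint32_t SizeOfRawData;    /* bytes the section occupies in the file */
    uint32_t PointerToRawData; /* file offset where the section starts   */
} SectionInfo;

/* Returns the file offset for `rva`, or -1 if no section contains it. */
static int64_t rva_to_file_offset(const SectionInfo *sections, size_t count,
                                  uint32_t rva)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t start = sections[i].VirtualAddress;
        /* The RVA belongs to this section if it falls inside its raw data. */
        if (rva >= start && rva - start < sections[i].SizeOfRawData)
            return (int64_t)sections[i].PointerToRawData + (rva - start);
    }
    return -1;
}
```

If the file is already mapped in memory by the loader, this step is not needed: there you just add the RVA to the image base.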
We're concerned with the VirtualAddress field to locate the data directory we want. Remember that VirtualAddress holds a relative virtual address (RVA), so you may need to convert it into a virtual address or a raw file offset. The Data Directory we are looking for here is IMAGE_DIRECTORY_ENTRY_RESOURCE (defined as 2, i.e. DataDirectory[2]). Once we have the address, we go to the next step, which was a real pain in the back for me to understand.

Step 2. Now that we have the address of the resources in the PE file, we can parse them. As easy as it may sound, it was not that easy for me. To do this, we have to understand how resources are laid out in the PE file. The resource section is built from two primary structures, which together form a tree-like structure. The structures are:
typedef struct _IMAGE_RESOURCE_DIRECTORY {
    DWORD Characteristics;
    DWORD TimeDateStamp;
    WORD  MajorVersion;
    WORD  MinorVersion;
    WORD  NumberOfNamedEntries;
    WORD  NumberOfIdEntries;
    // IMAGE_RESOURCE_DIRECTORY_ENTRY DirectoryEntries[];
} IMAGE_RESOURCE_DIRECTORY, *PIMAGE_RESOURCE_DIRECTORY;
This structure is immediately followed by an array of another structure, one element per entry:
typedef struct _IMAGE_RESOURCE_DIRECTORY_ENTRY {
    union {
        struct {
            DWORD NameOffset:31;
            DWORD NameIsString:1;
        } DUMMYSTRUCTNAME;
        DWORD Name;
        WORD  Id;
    } DUMMYUNIONNAME;
    union {
        DWORD OffsetToData;
        struct {
            DWORD OffsetToDirectory:31;
            DWORD DataIsDirectory:1;
        } DUMMYSTRUCTNAME2;
    } DUMMYUNIONNAME2;
} IMAGE_RESOURCE_DIRECTORY_ENTRY, *PIMAGE_RESOURCE_DIRECTORY_ENTRY;
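The unions and bit-fields above boil down to two 32-bit values whose top bit is a flag and whose lower 31 bits are an offset (or, for the first value, a 16-bit ID when the flag is clear). If you are not using C bit-fields, or you are parsing in another language, plain masks do the same job. The sketch below shows one way to do it; DecodedEntry and decode_entry are hypothetical names invented for this example, not part of winnt.h.

```c
#include <stdint.h>

/* Decoded view of one IMAGE_RESOURCE_DIRECTORY_ENTRY (hypothetical helper type). */
typedef struct {
    int      name_is_string; /* 1: name_or_id is an offset to a name string   */
    uint32_t name_or_id;     /* string offset (31 bits) or numeric ID (16 bits) */
    int      is_directory;   /* 1: offset points to another directory          */
    uint32_t offset;         /* relative to the start of the resource section  */
} DecodedEntry;

/* `name` and `offset_to_data` are the two DWORDs of the entry, as read
   from the file in little-endian order. */
static DecodedEntry decode_entry(uint32_t name, uint32_t offset_to_data)
{
    DecodedEntry e;
    e.name_is_string = (int)((name >> 31) & 1);            /* NameIsString    */
    e.name_or_id     = e.name_is_string ? (name & 0x7FFFFFFFu)
                                        : (name & 0xFFFFu); /* NameOffset / Id */
    e.is_directory   = (int)((offset_to_data >> 31) & 1);  /* DataIsDirectory */
    e.offset         = offset_to_data & 0x7FFFFFFFu;       /* OffsetToDirectory or OffsetToData */
    return e;
}
```

For example, an entry whose second DWORD is 0x80000030 points to a sub-directory located 0x30 bytes from the start of the resource section.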
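Remember, the entries immediately follow the directory structure. There is no link between the two, they are just adjacent to each other (I don't know why they are not linked).

So how many entries does a resource directory have? It's simple: we get it from _IMAGE_RESOURCE_DIRECTORY as NumberOfNamedEntries + NumberOfIdEntries. The named entries come first, followed by the ID entries. Note that this is the number of entries directly under that one directory, not the total number of resources in the PE file; to count those you have to walk the whole tree down to the leaves.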
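Because the entries are simply adjacent to the directory header, finding them is plain pointer arithmetic: the header is 16 bytes and each entry is 8 bytes. A small sketch of that, reading the two counters byte by byte so it does not depend on the host's endianness or struct packing. The names read_u16, entry_count and entry_at are my own for this example, and the caller is assumed to have already checked that the buffer is large enough.

```c
#include <stdint.h>
#include <stddef.h>

/* Read a little-endian 16-bit value byte by byte so this works on any host. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

enum { RES_DIR_HEADER_SIZE = 16, RES_DIR_ENTRY_SIZE = 8 };

/* `dir` points at an IMAGE_RESOURCE_DIRECTORY; returns how many entries follow it. */
static unsigned entry_count(const uint8_t *dir)
{
    unsigned named = read_u16(dir + 12); /* NumberOfNamedEntries */
    unsigned ids   = read_u16(dir + 14); /* NumberOfIdEntries    */
    return named + ids;
}

/* Pointer to the i-th entry: the array starts right after the 16-byte header. */
static const uint8_t *entry_at(const uint8_t *dir, unsigned i)
{
    return dir + RES_DIR_HEADER_SIZE + (size_t)i * RES_DIR_ENTRY_SIZE;
}
```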
Now some concentration (a cup of 0xcoffee) is required here.
At the root directory of the tree, each entry stands for one category (type) of resource. The resources in PE files fall into various categories. Here are some:
- Cursor
- Bitmap
- Icon
- Menu
- Dialog
- String
- Font directory
- Font
- Accelerator
- RCData
- Message table
- Version
- Dialog include
- Plug and Play
- VXD
- Animated Cursor
- Animated Icon
- HTML
- Manifest
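When a root-level entry uses a numeric ID instead of a name, that ID tells you which of the categories above it is. The sketch below maps the ID to a readable label, using the numeric values of the standard RT_* constants from the Windows headers for the categories in the list. resource_type_name is a hypothetical helper name, and any ID not in the table (including custom types) simply returns NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Maps a root-level entry ID to a readable name; NULL for IDs not listed. */
static const char *resource_type_name(uint16_t id)
{
    static const struct { uint16_t id; const char *name; } known[] = {
        { 1, "Cursor"},        { 2, "Bitmap"},          { 3, "Icon"},
        { 4, "Menu"},          { 5, "Dialog"},          { 6, "String"},
        { 7, "Font directory"},{ 8, "Font"},            { 9, "Accelerator"},
        {10, "RCData"},        {11, "Message table"},   {16, "Version"},
        {17, "Dialog include"},{19, "Plug and Play"},   {20, "VXD"},
        {21, "Animated Cursor"},{22, "Animated Icon"},  {23, "HTML"},
        {24, "Manifest"},
    };
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
        if (known[i].id == id)
            return known[i].name;
    return NULL; /* unknown or custom type */
}
```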
A leaf node has the following structure:
typedef struct _IMAGE_RESOURCE_DATA_ENTRY {
    DWORD OffsetToData;
    DWORD Size;
    DWORD CodePage;
    DWORD Reserved;
} IMAGE_RESOURCE_DATA_ENTRY, *PIMAGE_RESOURCE_DATA_ENTRY;
In pseudocode:

function parse_resource_section(directory){
    for each of the (NumberOfNamedEntries + NumberOfIdEntries) entries of directory {
        if DataIsDirectory is 1 {
            // leaf node not reached yet, go one level deeper
            call parse_resource_section(directory at OffsetToDirectory)
        } else {
            // we reached the leaf node, map the address to _IMAGE_RESOURCE_DATA_ENTRY
        }
    }
}

This thing is a bit complicated. I actually had to look back at my code to understand it again. But I think this is almost it. Keep in mind that OffsetToDirectory and the entry's OffsetToData are relative to the start of the resource section, while OffsetToData inside _IMAGE_RESOURCE_DATA_ENTRY is an RVA. Once you have the address of the data of a particular resource, you can figure out what to do with it. For example, in my case, I wanted to check whether there is any executable file among the resources.
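To make the recursion concrete, here is a rough C sketch of the same walk over a buffer that holds the raw bytes of the resource section. This is not my original code, just one possible way to write it. The names walk_resources, leaf_fn, LeafStats and record_leaf are made up for this example, and the depth limit of 8 is an arbitrary safety assumption to stop malformed files from recursing forever.

```c
#include <stdint.h>
#include <stddef.h>

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Called once per leaf with two fields of IMAGE_RESOURCE_DATA_ENTRY. */
typedef void (*leaf_fn)(uint32_t data_rva, uint32_t size, void *ctx);

/* `res`/`res_size`: the resource section bytes; `dir_off`: offset of the
   directory to visit (0 for the root). Returns 0 on success, -1 if malformed. */
static int walk_resources(const uint8_t *res, size_t res_size, uint32_t dir_off,
                          int depth, leaf_fn on_leaf, void *ctx)
{
    if (depth > 8 || dir_off > res_size || res_size - dir_off < 16)
        return -1;                      /* too deep, or header out of bounds */
    const uint8_t *dir = res + dir_off;
    size_t count = (size_t)rd16(dir + 12) + rd16(dir + 14);
    if ((res_size - dir_off - 16) / 8 < count)
        return -1;                      /* entry array out of bounds */

    for (size_t i = 0; i < count; i++) {
        uint32_t target = rd32(dir + 16 + i * 8 + 4);  /* second DWORD of entry */
        uint32_t off = target & 0x7FFFFFFFu;
        if (target & 0x80000000u) {     /* DataIsDirectory: go one level down */
            if (walk_resources(res, res_size, off, depth + 1, on_leaf, ctx) != 0)
                return -1;
        } else {                        /* leaf: IMAGE_RESOURCE_DATA_ENTRY */
            if (off > res_size || res_size - off < 16)
                return -1;
            on_leaf(rd32(res + off), rd32(res + off + 4), ctx);
        }
    }
    return 0;
}

/* Example callback: counts leaves and remembers the last one seen. */
typedef struct { unsigned leaves; uint32_t last_rva, last_size; } LeafStats;
static void record_leaf(uint32_t data_rva, uint32_t size, void *ctx)
{
    LeafStats *s = ctx;
    s->leaves++;
    s->last_rva = data_rva;
    s->last_size = size;
}
```

You would call it as walk_resources(buffer, buffer_size, 0, 0, record_leaf, &stats). The data_rva handed to the callback is an RVA, so it still has to be converted to a file offset before you can read the resource bytes from disk.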
Please leave your feedback or comments. I might be wrong about something in this post, so any corrections or suggestions are welcome.