Overview
The resources.arsc file is a resource index file generated during APK packaging. You can see it when you unzip an APK or inspect the APK with Android Studio's APK analyzer.
Understanding the structure of resources.arsc gives insight into two techniques used in APK size optimization: resource deduplication and resource file name obfuscation.
What the arsc file does
To access a file in Java, you need to provide the file name, for example:
new File("./res/drawable-xxhdpi/img.png");
In Android, however, a resource file can be obtained through its resource ID:
getDrawable(R.drawable.img);
Here an ID is enough to obtain the content of the resource file, with no need to type the file path by hand. Behind the scenes, this works by reading the arsc file.
Values such as R.drawable.xxx, R.layout.xxx, and R.string.xxx (stored in R.jar or R.java) are called resource indexes. A resource index can be used to look up the actual resource path or value in the arsc file.
For example, getDrawable(R.drawable.img) is compiled to getDrawable(2131099964). Converting the ID to hexadecimal gives:
2131099964 = 0x7f06013c
The resource index is therefore 0x7f06013c.
The resource index has a fixed format: 0xPPTTEEEE
PackageId (2 hex digits, 8 bits) + TypeId (2 hex digits, 8 bits) + EntryId (4 hex digits, 16 bits)
PP: the package ID, which acts as the package namespace. The value range is [0x01, 0x7f]. For third-party applications the value is 0x7f.
TT: the resource type, such as anim, layout, mipmap, string, or style.
EEEE: the index of the resource item in the offset array of its type.
So, in 0x7f06013c, PackageId = 0x7f, TypeId = 0x06, EntryId = 0x013c.
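The 0xPPTTEEEE split described above can be sketched with a few shifts and masks. This is only an illustration: the class name ResourceIdDemo and its methods are hypothetical and are not part of the Android SDK.

```java
// Hypothetical helper for illustration; not part of the Android SDK.
final class ResourceIdDemo {
    // PP: the top 8 bits of the resource index.
    static int packageId(int resId) { return (resId >>> 24) & 0xFF; }

    // TT: the next 8 bits.
    static int typeId(int resId) { return (resId >>> 16) & 0xFF; }

    // EEEE: the low 16 bits.
    static int entryId(int resId) { return resId & 0xFFFF; }

    public static void main(String[] args) {
        int resId = 2131099964; // the value R.drawable.img compiled to in the example above
        // prints "0x7f06013c -> PP=0x7f TT=0x06 EEEE=0x013c"
        System.out.printf("0x%08x -> PP=0x%02x TT=0x%02x EEEE=0x%04x%n",
                resId, packageId(resId), typeId(resId), entryId(resId));
    }
}
```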
At its simplest, we can think of the arsc file as a file containing arrays of pairs, with one Pair[] for each resource type (TypeId). There can actually be more than one array per type, but assume one for ease of understanding. Looking up 0x7f06013c in the arsc file then means finding the Pair[] that corresponds to TypeId = 0x06 and reading its element Pair[0x013c]. The value of that element is the resource path, for example ./res/drawable-xxhdpi/img.png, and the program can then access the corresponding resource file.
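A minimal sketch of this simplified mental model, assuming one Pair[] per TypeId. The class PairTableDemo and its put and lookup methods are hypothetical names used only for illustration; the real file stores binary chunks, not Java objects.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified mental model only: one Pair[] per resource type, indexed by EntryId.
final class PairTableDemo {
    static final class Pair {
        final String name;   // resource item name, e.g. "img"
        final String value;  // resource value or path
        Pair(String name, String value) { this.name = name; this.value = value; }
    }

    // TypeId -> Pair[] indexed by EntryId (0x0000..0xFFFF)
    private final Map<Integer, Pair[]> pairsByType = new HashMap<>();

    void put(int resId, String name, String value) {
        int typeId = (resId >>> 16) & 0xFF;
        int entryId = resId & 0xFFFF;
        Pair[] pairs = pairsByType.computeIfAbsent(typeId, t -> new Pair[0x10000]);
        pairs[entryId] = new Pair(name, value);
    }

    // Returns the stored value, or null when the type or entry is missing.
    String lookup(int resId) {
        Pair[] pairs = pairsByType.get((resId >>> 16) & 0xFF);
        if (pairs == null) return null;
        Pair pair = pairs[resId & 0xFFFF];
        return pair == null ? null : pair.value;
    }

    public static void main(String[] args) {
        PairTableDemo table = new PairTableDemo();
        table.put(0x7f06013c, "img", "./res/drawable-xxhdpi/img.png");
        System.out.println(table.lookup(0x7f06013c));
    }
}
```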
Of course, the actual arsc file structure is somewhat more complex. Let's start analyzing it.
chunk
Before introducing the resources.arsc (arsc) file, we need to explain the term chunk, which is used repeatedly below and in other articles on this topic.
A chunk is simply a block of data, and a chunk may itself contain other chunks. A tree can be divided into three chunks: its crown, its trunk, and its root. You can also view the whole tree as one chunk, in which case that chunk is the tree.
Arsc file structure
resources.arsc is a binary file whose internal structure is defined in ResourceTypes.h. Before reading that header file, it helps to look at a widely circulated diagram of the arsc file structure.
The image as a whole describes the relationship between the chunks in the arsc file (note the left and right sides of the image):
- The whole arsc file is a chunk of type RES_TABLE_TYPE;
- RES_TABLE_TYPE can be divided into three parts: the file header and two child chunks (RES_STRING_POOL_TYPE and RES_TABLE_PACKAGE_TYPE);
- RES_TABLE_PACKAGE_TYPE contains: a header, the resource type string constant pool, the resource item name string constant pool, and multiple child chunks (RES_TABLE_TYPE_SPEC_TYPE and RES_TABLE_TYPE_TYPE);
- Every chunk, whatever its type, has a header structure.
The structure of an ARSC file can be roughly expressed as the following pseudocode:
//---------------------------------------------------------------------------
// The arsc file is a chunk of type RES_TABLE_TYPE
RES_TABLE_TYPE {
    table_header              // File header
    RES_STRING_POOL_TYPE      // String constant pool chunk
    RES_TABLE_PACKAGE_TYPE    // Content chunk
}
//---------------------------------------------------------------------------
// String constant pool chunk
RES_STRING_POOL_TYPE {
    pool_header               // String constant pool header
    string[]                  // Constant pool
}
//---------------------------------------------------------------------------
// Content chunk
RES_TABLE_PACKAGE_TYPE {
    package_header            // Header of this chunk
    RES_STRING_POOL_TYPE      // Resource type string constant pool, content: [anim, attr, bool, color, dimen, drawable, id, integer, interpolator, layout, mipmap, string, style]
    RES_STRING_POOL_TYPE      // Resource item name string constant pool
    // Resource type chunks: each type in the resource type string constant pool
    // above has one resource type chunk. Take drawable as an example.
    // drawable resource type chunk
    RES_TABLE_TYPE_SPEC_TYPE {
        spec_header           // Spec header
        // drawable-mdpi
        RES_TABLE_TYPE_TYPE
        // drawable-hdpi
        RES_TABLE_TYPE_TYPE
        ...
    }
    // attr resource type chunk
    RES_TABLE_TYPE_SPEC_TYPE {
        RES_TABLE_TYPE_TYPE
        RES_TABLE_TYPE_TYPE {
            type_header       // Type header
            // Resource item pool: resource name : resource value
            ResName:ResValue
            ResName:ResValue
            ResName:ResValue
            ResName:ResTableMapEntry->[Res_value1, Res_value2]
            ResName:ResTableMapEntry->[Res_value1, Res_value2, Res_value3]
        }
        ...
    }
    ...
}
//---------------------------------------------------------------------------
The chunk header structure
Every chunk begins with a header, and every header begins with a ResChunk_header. ResourceTypes.h defines ResChunk_header as follows:
/**
 * Header that appears at the front of every data chunk in a resource.
 */
struct ResChunk_header
{
// Type identifier for this chunk. The meaning of this value depends
// on the containing chunk.
uint16_t type;
// Size of the chunk header (in bytes). Adding this value to
// the address of the chunk allows you to find its associated data
// (if any).
uint16_t headerSize;
// Total size of this chunk (in bytes). This is the chunkSize plus
// the size of any data associated with the chunk. Adding this value
// to the chunk allows you to completely skip its contents (including
// any child chunks). If this value is the same as chunkSize, there is
// no data associated with the chunk.
uint32_t size;
};
uint16_t is a 16-bit unsigned integer (2 bytes); uint32_t is a 32-bit unsigned integer (4 bytes).
Structure analysis
- type: the type of the chunk. Some of the defined values are:
enum {
    RES_NULL_TYPE = 0x0000,
    RES_STRING_POOL_TYPE = 0x0001,
    RES_TABLE_TYPE = 0x0002,
    // Chunk types in RES_TABLE_TYPE
    RES_TABLE_PACKAGE_TYPE = 0x0200,
    RES_TABLE_TYPE_TYPE = 0x0201,
    RES_TABLE_TYPE_SPEC_TYPE = 0x0202,
    RES_TABLE_LIBRARY_TYPE = 0x0203
};
- headerSize: the size of the chunk header
- size: the total size of the chunk (header plus data)
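Reading the three ResChunk_header fields from raw bytes can be sketched as follows. The class ChunkHeaderDemo is a hypothetical name, the input is a synthetic 8-byte array rather than a real file, and the little-endian byte order is an assumption stated here rather than something this article specifies.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical reader for illustration; field names mirror ResChunk_header.
final class ChunkHeaderDemo {
    final int type;        // uint16_t type
    final int headerSize;  // uint16_t headerSize
    final long size;       // uint32_t size, widened to long so it stays unsigned

    ChunkHeaderDemo(int type, int headerSize, long size) {
        this.type = type;
        this.headerSize = headerSize;
        this.size = size;
    }

    // Reads the 8-byte ResChunk_header at the buffer's current position.
    static ChunkHeaderDemo read(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN); // assumption: the file is little-endian
        int type = buf.getShort() & 0xFFFF;
        int headerSize = buf.getShort() & 0xFFFF;
        long size = buf.getInt() & 0xFFFFFFFFL;
        return new ChunkHeaderDemo(type, headerSize, size);
    }

    public static void main(String[] args) {
        // Synthetic bytes: type = 0x0002 (RES_TABLE_TYPE), headerSize = 12, size = 12.
        byte[] bytes = {0x02, 0x00, 0x0C, 0x00, 0x0C, 0x00, 0x00, 0x00};
        ChunkHeaderDemo h = read(ByteBuffer.wrap(bytes));
        System.out.printf("type=0x%04x headerSize=%d size=%d%n", h.type, h.headerSize, h.size);
    }
}
```

Because size covers the header, the data, and any child chunks, a parser can skip a chunk it does not understand by advancing size bytes from the chunk's start.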
ResTable_header
First, the file header is a ResTable_header structure:
struct ResTable_header
{
struct ResChunk_header header;
// The number of ResTable_package structures.
uint32_t packageCount;
};
Structural analysis:
- header: a ResChunk_header whose type is RES_TABLE_TYPE
- packageCount: the number of ResTable_package chunks in the arsc file, usually 1.
So the file header consists of the 8-byte ResChunk_header followed by the 4-byte packageCount, 12 bytes in total.
StringPool
Next comes the string pool chunk. The string constant pool holds the contents of all string resources in the APK. This chunk consists of five parts, as shown in the figure:
- ResStringPool_header: the string constant pool header
- String Offset Array: the array of string offsets. Each element records the offset of one string relative to the start of the string content area. Each offset is 4 bytes, so the size of this region is (4 x stringCount) bytes
- Style Offset Array: the array of string style offsets
- String Content: the string content area. Each string in the pool ends with a terminator
- Style Content: the string style content area
Our main concerns are: ResStringPool_header, String Offset Array, and String Content
First examine the string constant pool header, which is a ResStringPool_header structure:
struct ResStringPool_header
{
struct ResChunk_header header;
// Number of strings in this pool (number of uint32_t indices that follow
// in the data).
uint32_t stringCount;
// Number of style span arrays in the pool (number of uint32_t indices
// follow the string indices).
uint32_t styleCount;
// Flags.
enum {
// If set, the string index is sorted by the string values (based
// on strcmp16()).
SORTED_FLAG = 1<<0,
// String pool is encoded in UTF-8
UTF8_FLAG = 1<<8
};
uint32_t flags;
// Index from header of the string data.
uint32_t stringsStart;
// Index from header of the style data.
uint32_t stylesStart;
};
Structural analysis:
- header: a ResChunk_header whose type is RES_STRING_POOL_TYPE
- stringCount: the number of strings in the constant pool
- styleCount: the number of string styles in the constant pool
- flags: 0, SORTED_FLAG, UTF8_FLAG, or a combination of the two, describing properties of the strings. If the SORTED_FLAG bit is 1, the strings are sorted; if the UTF8_FLAG bit is 1, the strings are encoded in UTF-8, otherwise they are encoded in UTF-16
- stringsStart: the offset of the string content from the start of the constant pool header
- stylesStart: the offset of the string style content from the start of the constant pool header
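The header fields above are enough to locate any string in the pool. The sketch below computes the absolute position of a string's data from stringsStart and the String Offset Array. StringPoolDemo and sample() are hypothetical names, the buffer is synthetic, little-endian order is assumed, and decoding the string bytes themselves (length prefix and terminator) is deliberately left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; byte offsets follow the ResStringPool_header field order.
final class StringPoolDemo {
    static final int UTF8_FLAG = 1 << 8;

    // flags is the third uint32_t after the 8-byte ResChunk_header (byte offset 16).
    static boolean isUtf8(ByteBuffer buf, int chunkStart) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        return (buf.getInt(chunkStart + 16) & UTF8_FLAG) != 0;
    }

    // Absolute position of the data of string `index` inside the buffer.
    static int stringPosition(ByteBuffer buf, int chunkStart, int index) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        int headerSize = buf.getShort(chunkStart + 2) & 0xFFFF;
        int stringCount = buf.getInt(chunkStart + 8);
        int stringsStart = buf.getInt(chunkStart + 20);
        if (index < 0 || index >= stringCount) {
            throw new IndexOutOfBoundsException("string index " + index);
        }
        // The String Offset Array starts right after the header, 4 bytes per string.
        int offset = buf.getInt(chunkStart + headerSize + 4 * index);
        return chunkStart + stringsStart + offset;
    }

    // Builds a synthetic pool with two strings at content offsets 0 and 5.
    static ByteBuffer sample() {
        ByteBuffer buf = ByteBuffer.allocate(48).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 0x0001); // type = RES_STRING_POOL_TYPE
        buf.putShort((short) 28);     // headerSize
        buf.putInt(48);               // size
        buf.putInt(2);                // stringCount
        buf.putInt(0);                // styleCount
        buf.putInt(UTF8_FLAG);        // flags
        buf.putInt(36);               // stringsStart = header (28) + offsets (2 * 4)
        buf.putInt(0);                // stylesStart
        buf.putInt(0);                // offset of string 0
        buf.putInt(5);                // offset of string 1
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = sample();
        System.out.println("utf8=" + isUtf8(buf, 0)
                + " string1 at " + stringPosition(buf, 0, 1));
    }
}
```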
Package
Finally, examining the Package, the chunk starts with a ResTable_package structure:
/**
 * A collection of resource data types within a package. Followed by
 * one or more ResTable_type and ResTable_typeSpec structures containing the
 * entry values for each resource type.
 */
struct ResTable_package
{
struct ResChunk_header header;
// If this is a base package, its ID. Package IDs start
// at 1 (corresponding to the value of the package bits in a
// resource identifier). 0 means this is not a base package.
uint32_t id;
// Actual name of this package, \0-terminated.
uint16_t name[128];
// Offset to a ResStringPool_header defining the resource
// type symbol table. If zero, this package is inheriting from
// another base package (overriding specific values in it).
uint32_t typeStrings;
// Last index into typeStrings that is for public use by others.
uint32_t lastPublicType;
// Offset to a ResStringPool_header defining the resource
// key symbol table. If zero, this package is inheriting from
// another base package (overriding specific values in it).
uint32_t keyStrings;
// Last index into keyStrings that is for public use by others.
uint32_t lastPublicKey;
uint32_t typeIdOffset;
};
Structural analysis:
- header: a ResChunk_header whose type is RES_TABLE_PACKAGE_TYPE
- id: the package ID, equal to the Package Id. For an ordinary application package the Package Id is 0x7F; for the system resource package it is 0x01.
- name: the package name
- typeStrings: the offset of the resource type string pool from the package header
- lastPublicType: the last index into the resource type string pool that is for public use (typically set to the number of strings in that pool)
- keyStrings: the offset of the resource item name string pool from the package header
- lastPublicKey: the last index into the resource item name string pool that is for public use (typically set to the number of strings in that pool)
- typeIdOffset: not analyzed here; its value is 0
The typeStrings and keyStrings fields above refer to the resource type string constant pool and the resource item name string constant pool. Both are ResStringPool chunks, and they come immediately after ResTable_package: first the Type String Pool, then the Key String Pool. The figure shows the position of ResTable_package relative to these two string constant pools.
Together with the string constant pool described earlier, the whole arsc file contains three string constant pools: the string resource constant pool, the resource type string constant pool, and the resource item name string constant pool.
For example:
<string name="tip">hello world</string>
This declares a resource whose type is string, whose name is tip, and whose value is hello world.
- hello world is the string resource and is stored in the string resource constant pool;
- string is the resource type and is stored in the resource type string constant pool;
- tip is the resource item name and is stored in the resource item name string constant pool.
If the resource is R.drawable.img, the resource type is drawable, the resource item name is img, and the file path corresponding to the resource is stored in the string resource constant pool.
ResTable_typeSpec and ResTable_type
Earlier we described the arsc file as a file composed of multiple Pair[] arrays, with one Pair[] for each resource type (anim, attr, drawable, string, and so on). In the real file, that Pair[] corresponds to ResTable_typeSpec and ResTable_type.
In the arsc file, each resource type has one ResTable_typeSpec, which describes the configuration differences of its resource items. Each ResTable_typeSpec consists of a header and one or more ResTable_type chunks, and the number of ResTable_type chunks is determined by the number of adaptation types. For example, the adaptation types drawable, drawable-mdpi, and drawable-hdpi each have one ResTable_type. Each ResTable_type consists of a header and an array of resource items, which is the Pair[] mentioned above.
For example, the ResTable_typeSpec and ResTable_type of drawable can be expressed as follows:
// drawable
RES_TABLE_TYPE_SPEC_TYPE {
    // drawable-mdpi
    RES_TABLE_TYPE_TYPE
    // drawable-hdpi
    RES_TABLE_TYPE_TYPE {
        ResChunk_header       // Type header
        // Array of resource items: resource name -> resource value
        ResName->ResValue
        ResName->ResValue
        ResName->ResValue
        // ResName->ResTableMapEntry
        // ResName->ResTableMapEntry
    }
    ...
}
So how are ResTable_typeSpec and ResTable_type actually laid out in the arsc file?
First, look at ResTable_typeSpec:
struct ResTable_typeSpec
{
struct ResChunk_header header;
// The type identifier this chunk is holding. Type IDs start
// at 1 (corresponding to the value of the type bits in a
// resource identifier). 0 is invalid.
uint8_t id;
// Must be 0.
uint8_t res0;
// Must be 0.
uint16_t res1;
// Number of uint32_t entry configuration masks that follow.
uint32_t entryCount;
enum : uint32_t {
// Additional flag indicating an entry is public.
SPEC_PUBLIC = 0x40000000u,
// Additional flag indicating an entry is overlayable at runtime.
// Added in Android-P.
SPEC_OVERLAYABLE = 0x80000000u,
};
};
Structural analysis:
- header: the chunk header, whose type is RES_TABLE_TYPE_SPEC_TYPE
- id: the resource type id. With this id you can look up the resource type name in the resource type string constant pool. This id is the TT in 0xPPTTEEEE
- res0, res1: reserved fields. The value is 0
- entryCount: the number of resource items of this type. Resource items that share the same name across configurations are counted once
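As a small illustration of how the TT byte maps to a type name: type IDs start at 1, so the name sits at index TT - 1 in the resource type string pool. The sketch below is hedged accordingly. TypeNameDemo is a hypothetical class, the pool contents are copied from the sample list in the pseudocode earlier in this article, and typeIdOffset is assumed to be 0.

```java
// Hypothetical helper; assumes typeIdOffset == 0 and the sample type pool below.
final class TypeNameDemo {
    // Sample resource type string pool, taken from the pseudocode in this article.
    static final String[] TYPE_STRINGS = {
        "anim", "attr", "bool", "color", "dimen", "drawable", "id",
        "integer", "interpolator", "layout", "mipmap", "string", "style"
    };

    // Type IDs start at 1, so TT maps to index TT - 1 in the type string pool.
    static String typeName(int resId, String[] typeStrings) {
        int typeId = (resId >>> 16) & 0xFF;
        if (typeId == 0 || typeId > typeStrings.length) {
            throw new IllegalArgumentException("invalid type id " + typeId);
        }
        return typeStrings[typeId - 1];
    }

    public static void main(String[] args) {
        // prints "drawable"
        System.out.println(typeName(0x7f06013c, TYPE_STRINGS));
    }
}
```

With this sample pool, TypeId 0x06 resolves to drawable, which agrees with the R.drawable.img example used throughout.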
With the resource type covered, let's look at ResTable_type, which describes one adaptation type and holds its specific resource items.
Again, start with the header:
struct ResTable_type
{
struct ResChunk_header header;
enum {
NO_ENTRY = 0xFFFFFFFF
};
// The type identifier this chunk is holding. Type IDs start
// at 1 (corresponding to the value of the type bits in a
// resource identifier). 0 is invalid.
uint8_t id;
enum {
// If set, the entry is sparse, and encodes both the entry ID and offset into each entry,
// and a binary search is used to find the key. Only available on platforms >= O.
// Mark any types that use this with a v26 qualifier to prevent runtime issues on older
// platforms.
FLAG_SPARSE = 0x01,
};
uint8_t flags;
// Must be 0.
uint16_t reserved;
// Number of uint32_t entry indices that follow.
uint32_t entryCount;
// Offset from header where ResTable_entry data starts.
uint32_t entriesStart;
// Configuration this collection of entries is designed for. This must always be last.
ResTable_config config;
};
Structural analysis:
- header: a ResChunk_header whose type is RES_TABLE_TYPE_TYPE
- id: the resource type id, the same TT value as in ResTable_typeSpec
- reserved: reserved field. The value is 0
- entryCount: the number of resource items of this type. Resource items that share the same name across configurations are counted once.
- entriesStart: the offset of the resource item data block relative to the header of this chunk.
- config: a ResTable_config describing the configuration information, that is, which adaptation type this ResTable_type is for.
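The real counterpart of the Pair[EEEE] lookup can be sketched from these header fields: the header is followed by entryCount uint32_t offsets, and an offset of NO_ENTRY means the resource has no value in this configuration. The code below is an illustration under assumptions: TypeChunkDemo and sample() are hypothetical names, the buffer is synthetic with a shortened 4-byte config, little-endian order is assumed, and only the non-sparse layout (FLAG_SPARSE not set) is handled.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; handles only the non-sparse ResTable_type layout.
final class TypeChunkDemo {
    static final int NO_ENTRY = 0xFFFFFFFF;
    static final int FLAG_SPARSE = 0x01;

    // Absolute position of the ResTable_entry for entryId, or -1 if absent.
    static int entryPosition(ByteBuffer buf, int chunkStart, int entryId) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        int headerSize = buf.getShort(chunkStart + 2) & 0xFFFF;
        int flags = buf.get(chunkStart + 9) & 0xFF;
        int entryCount = buf.getInt(chunkStart + 12);
        int entriesStart = buf.getInt(chunkStart + 16);
        if ((flags & FLAG_SPARSE) != 0) {
            throw new UnsupportedOperationException("sparse types are not handled in this sketch");
        }
        if (entryId < 0 || entryId >= entryCount) return -1;
        // The offset array follows the header, one uint32_t per entry.
        int offset = buf.getInt(chunkStart + headerSize + 4 * entryId);
        return offset == NO_ENTRY ? -1 : chunkStart + entriesStart + offset;
    }

    // Synthetic chunk: 3 entries, the middle one missing in this configuration.
    static ByteBuffer sample() {
        ByteBuffer buf = ByteBuffer.allocate(68).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 0x0201); // type = RES_TABLE_TYPE_TYPE
        buf.putShort((short) 24);     // headerSize (shortened config, synthetic)
        buf.putInt(68);               // size
        buf.put((byte) 6);            // id (TT)
        buf.put((byte) 0);            // flags
        buf.putShort((short) 0);      // reserved
        buf.putInt(3);                // entryCount
        buf.putInt(36);               // entriesStart = header (24) + offsets (3 * 4)
        buf.putInt(4);                // config.size only (synthetic)
        buf.putInt(0);                // offset of entry 0
        buf.putInt(NO_ENTRY);         // entry 1 has no value here
        buf.putInt(16);               // offset of entry 2
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = sample();
        for (int i = 0; i < 3; i++) {
            System.out.println("entry " + i + " -> " + entryPosition(buf, 0, i));
        }
    }
}
```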
What follows the header is the resource item pool (an array of resource items), which stores the specific resources.
Resource items in a resource item pool can be stored in the following modes:
- Ordinary resources: ResTable_entry + Res_value
- Bag resources: ResTable_map_entry (which extends ResTable_entry) + ResTable_map * n
ResTable_entry points to the resource item name and indicates whether the resource is a bag resource. Res_value and ResTable_map hold the specific resource values. The figure shows the storage layouts of the two kinds of resources.
Finally, take a look at the internal structure of ResTable_entry, Res_value, and ResTable_map_entry.
First, ResTable_entry:
struct ResTable_entry
{
// Number of bytes in this structure.
uint16_t size;
enum {
// If set, this is a complex entry, holding a set of name/value
// mappings. It is followed by an array of ResTable_map structures.
FLAG_COMPLEX = 0x0001,
// If set, this resource has been declared public, so libraries
// are allowed to reference it.
FLAG_PUBLIC = 0x0002,
// If set, this is a weak resource and may be overridden by strong
// resources of the same name/type. This is only useful during
// linking with other resource tables.
FLAG_WEAK = 0x0004
};
uint16_t flags;
// Reference into ResTable_package::keyStrings identifying this entry.
struct ResStringPool_ref key;
};
Structural analysis:
- size: the size of this ResTable_entry structure in bytes.
- flags: the flag bits of the resource item. If the FLAG_COMPLEX bit is set, the resource is a bag resource item and ResTable_entry is followed by an array of ResTable_map structures that holds its contents; otherwise ResTable_entry is followed by a single Res_value. If the resource item can be referenced by other packages, the FLAG_PUBLIC bit is 1.
- key: the index of the resource item name in the resource item name string pool.
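A minimal sketch of reading a ResTable_entry and testing its flag bits. EntryDemo and sample() are hypothetical names, the 8-byte buffer is synthetic, and little-endian order is assumed. Note that flags are tested bit by bit, since several flags can be set at once.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; offsets follow the ResTable_entry field order.
final class EntryDemo {
    static final int FLAG_COMPLEX = 0x0001;
    static final int FLAG_PUBLIC = 0x0002;

    // uint16_t flags sits after the 2-byte size field.
    static int flags(ByteBuffer buf, int entryPos) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        return buf.getShort(entryPos + 2) & 0xFFFF;
    }

    // A bag resource has the FLAG_COMPLEX bit set.
    static boolean isBag(ByteBuffer buf, int entryPos) {
        return (flags(buf, entryPos) & FLAG_COMPLEX) != 0;
    }

    static boolean isPublic(ByteBuffer buf, int entryPos) {
        return (flags(buf, entryPos) & FLAG_PUBLIC) != 0;
    }

    // key: index into the resource item name string pool.
    static int keyIndex(ByteBuffer buf, int entryPos) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        return buf.getInt(entryPos + 4);
    }

    // Synthetic ordinary (non-bag) public entry whose name is key string 316.
    static ByteBuffer sample() {
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 8);           // size of ResTable_entry
        buf.putShort((short) FLAG_PUBLIC); // flags
        buf.putInt(316);                   // key
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = sample();
        System.out.println("bag=" + isBag(buf, 0) + " public=" + isPublic(buf, 0)
                + " key=" + keyIndex(buf, 0));
    }
}
```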
ResTable_entry gives us the name of the resource item. Next, look at the resource value, Res_value:
struct Res_value
{
// Number of bytes in this structure.
uint16_t size;
// Always set to 0.
uint8_t res0;
uint8_t dataType;
// The data for this item, as interpreted according to dataType.
typedef uint32_t data_type;
data_type data;
};
Structural analysis:
- size: the size of Res_value
- res0: reserved field. The value is 0
- dataType: the type of the current data. This is an enumeration (string, dimension, and so on) defined in ResourceTypes.h
- data: the data, interpreted according to dataType. If the type is string, the value is an index into the string resource pool
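How data is interpreted through dataType can be sketched as follows. ValueDemo, resolve, and sample() are hypothetical names, the string pool is a plain Java array standing in for the string resource pool, and little-endian order is assumed. The constant TYPE_STRING = 0x03 comes from the Res_value data type enumeration in ResourceTypes.h; other data types are simply printed as hexadecimal in this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; offsets follow the Res_value field order.
final class ValueDemo {
    static final int TYPE_STRING = 0x03; // data is an index into the string resource pool

    // uint8_t dataType sits after size (2 bytes) and res0 (1 byte).
    static int dataType(ByteBuffer buf, int pos) {
        return buf.get(pos + 3) & 0xFF;
    }

    static int data(ByteBuffer buf, int pos) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        return buf.getInt(pos + 4);
    }

    // Resolves string values through the pool; other types are shown as raw hex.
    static String resolve(ByteBuffer buf, int pos, String[] stringPool) {
        int value = data(buf, pos);
        if (dataType(buf, pos) == TYPE_STRING) {
            return stringPool[value];
        }
        return String.format("0x%08x", value);
    }

    // Synthetic Res_value of the given type and data.
    static ByteBuffer sample(int type, int data) {
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 8); // size of Res_value
        buf.put((byte) 0);       // res0
        buf.put((byte) type);    // dataType
        buf.putInt(data);        // data
        return buf;
    }

    public static void main(String[] args) {
        String[] pool = {"hello world"};
        System.out.println(resolve(sample(TYPE_STRING, 0), 0, pool));
    }
}
```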
Finally, look at how a bag resource is stored, using ResTable_map_entry, ResTable_map, and ResTable_ref. The storage structure of a bag resource clearly does not fit the Pair model described above; that model was introduced only to help explain how non-bag resources are stored.
struct ResTable_map_entry : public ResTable_entry
{
// Resource identifier of the parent mapping, or 0 if there is none.
// The resource ID of the parent ResTable_map_entry, which is equal to 0 if there is no parent ResTable_map_entry
ResTable_ref parent;
// Number of name/value pairs that follow for FLAG_COMPLEX.
// The number of bag entries
uint32_t count;
};
struct ResTable_map
{
// The resource item ID of bag
ResTable_ref name;
// This mapping's value.
// The value of the resource item in bag
Res_value value;
};
struct ResTable_ref
{
uint32_t ident;
};
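Walking a bag resource can be sketched from the three structures above: a ResTable_map_entry (the 8-byte ResTable_entry plus parent and count) followed by count ResTable_map items, each a 4-byte name reference plus a Res_value. BagDemo and sample() are hypothetical names, the buffer and the attribute IDs in it are synthetic, and little-endian order is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; returns {name ident, dataType, data} for each ResTable_map.
final class BagDemo {
    static final int FLAG_COMPLEX = 0x0001;

    static int[][] readBag(ByteBuffer buf, int entryPos) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        int size = buf.getShort(entryPos) & 0xFFFF;      // size of ResTable_map_entry
        int flags = buf.getShort(entryPos + 2) & 0xFFFF;
        if ((flags & FLAG_COMPLEX) == 0) {
            throw new IllegalArgumentException("not a bag resource");
        }
        int count = buf.getInt(entryPos + 12);           // after key (4) and parent (4)
        int[][] maps = new int[count][];
        int pos = entryPos + size;                       // first ResTable_map
        for (int i = 0; i < count; i++) {
            int name = buf.getInt(pos);                  // ResTable_ref name
            int valueSize = buf.getShort(pos + 4) & 0xFFFF;
            int dataType = buf.get(pos + 7) & 0xFF;
            int data = buf.getInt(pos + 8);
            maps[i] = new int[] {name, dataType, data};
            pos += 4 + valueSize;                        // name + Res_value
        }
        return maps;
    }

    // Synthetic bag with two name/value mappings and no parent.
    static ByteBuffer sample() {
        ByteBuffer buf = ByteBuffer.allocate(40).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 16).putShort((short) FLAG_COMPLEX); // size, flags
        buf.putInt(7).putInt(0).putInt(2);                       // key, parent, count
        buf.putInt(0x01010001).putShort((short) 8).put((byte) 0).put((byte) 0x10).putInt(100);
        buf.putInt(0x01010002).putShort((short) 8).put((byte) 0).put((byte) 0x03).putInt(5);
        return buf;
    }

    public static void main(String[] args) {
        for (int[] m : readBag(sample(), 0)) {
            System.out.printf("name=0x%08x type=0x%02x data=%d%n", m[0], m[1], m[2]);
        }
    }
}
```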
This completes the analysis of the resources.arsc file.
Conclusion
The overall structure of the arsc file is not very complex. android-chunk-utils is a tool written in Java for parsing arsc files, and reading it helps in understanding the arsc file structure.
Recommended reading
1. ResourceTypes.h
2. Analysis of the compilation and packaging process of Android application resources
3. Parsing Android resources.arsc by hand
4. Android reverse engineering notes: ARSC file format parsing