Overview


The resources.arsc file is a resource index file generated while an APK is packaged. You can see it when you unzip the APK or analyze the APK with Android Studio.

Understanding the structure of resources.arsc gives insight into two techniques used to reduce APK size: resource deduplication and resource file name obfuscation.

What the arsc file does


To access a file in Java, you need to provide the file name, for example:

    new File("./res/drawable-xxhdpi/img.png");

In Android, however, a resource file can be obtained through its resource ID:

    getDrawable(R.drawable.img);

Here an ID is enough to obtain the content of the resource file, with no need to type the file path by hand. Behind the scenes, this works by reading the arsc file.

Values such as R.drawable.xxx, R.layout.xxx, and R.string.xxx (stored in R.jar or R.java) are called resource indexes. A resource index can be used to look up the actual resource path or value in the arsc file.

For example, getDrawable(R.drawable.img) is compiled to getDrawable(2131099964). Converting the ID to hexadecimal gives:

2131099964 = 0x7f06013c

The resource index is 0x7f06013c.

The resource index has a fixed format: 0xPPTTEEEE

PackageId (2 hex digits, 1 byte) + TypeId (2 hex digits, 1 byte) + EntryId (4 hex digits, 2 bytes)

  • PP: the package ID, which acts as the package namespace. The value range is [0x01, 0x7f]. For third-party applications the value is 0x7f.

  • TT: the resource type, such as anim, layout, mipmap, string, or style.

  • EEEE: the index of the resource item within the entry array of its type.

So, in 0x7f06013c, PackageId = 0x7f, TypeId = 0x06, EntryId = 0x013c.
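As a quick check of the 0xPPTTEEEE layout, the three fields can be pulled out of an ID with shifts and masks. This is only a minimal sketch: the class and method names are made up for illustration and are not part of the Android framework.

```java
/** Splits a resource ID of the form 0xPPTTEEEE into its three fields. */
final class ResourceIdParts {
    static int packageId(int resId) {
        return (resId >>> 24) & 0xFF;   // PP: top byte
    }

    static int typeId(int resId) {
        return (resId >>> 16) & 0xFF;   // TT: second byte
    }

    static int entryId(int resId) {
        return resId & 0xFFFF;          // EEEE: low two bytes
    }

    public static void main(String[] args) {
        int resId = 2131099964; // compiled value of R.drawable.img in the example above
        System.out.printf("0x%08x -> PP=0x%02x TT=0x%02x EEEE=0x%04x%n",
                resId, packageId(resId), typeId(resId), entryId(resId));
    }
}
```

Running main prints `0x7f06013c -> PP=0x7f TT=0x06 EEEE=0x013c`, which matches the breakdown above.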

At its simplest, we can think of the arsc file as a collection of Pair arrays, with one Pair[] for each resource type (TypeId). In reality there can be more than one per type, but one is enough for the sake of understanding. Looking up 0x7f06013c in the arsc file then means finding the Pair[] that corresponds to TypeId = 0x06 and reading its element Pair[0x013c]. That pair maps the resource item to its value, here the path res/drawable-xxhdpi/img.png, so the program can then access the corresponding resource file.
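The simplified Pair[] picture can be written down as a toy lookup table. The sketch below is purely a mental model and not the real file format; the class name, the {name, value} array layout, and the sample data are all assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the "one Pair[] per TypeId" picture; not the real arsc layout. */
final class ToyResourceTable {
    // TypeId -> array of {name, value} pairs, indexed by EntryId.
    private final Map<Integer, String[][]> pairsByType = new HashMap<>();

    void putType(int typeId, String[][] pairs) {
        pairsByType.put(typeId, pairs);
    }

    /** Returns the {name, value} pair for a resource ID, or null if it is absent. */
    String[] lookup(int resId) {
        int typeId = (resId >>> 16) & 0xFF;
        int entryId = resId & 0xFFFF;
        String[][] pairs = pairsByType.get(typeId);
        if (pairs == null || entryId >= pairs.length) {
            return null;
        }
        return pairs[entryId];
    }

    public static void main(String[] args) {
        // Hypothetical drawable table with a single populated slot.
        String[][] drawables = new String[0x013d][];
        drawables[0x013c] = new String[] {"img", "res/drawable-xxhdpi/img.png"};

        ToyResourceTable table = new ToyResourceTable();
        table.putType(0x06, drawables);
        System.out.println(table.lookup(0x7f06013c)[1]);
    }
}
```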

Of course, the actual arsc file structure is a little more complex. Let's analyze it.

chunk


Before introducing the resources.arsc (arsc) file, the term chunk needs explaining, because it appears repeatedly in this and other articles on the subject.

A chunk is a block of data that may itself contain smaller chunks. A tree can be divided into three chunks: its crown, its trunk, and its roots. You can also view the whole tree as a single chunk.

Arsc file structure


resources.arsc is a binary file whose internal structure is defined in ResourceTypes.h. If you would rather not start with that header, look first at the widely circulated diagram of the arsc file structure.

The diagram as a whole describes the relationship between the chunks in the arsc file (note both its left and right sides):

  1. The whole arsc file is a chunk of type RES_TABLE_TYPE;
  2. RES_TABLE_TYPE can be divided into three parts: the file header and two child chunks (RES_STRING_POOL_TYPE and RES_TABLE_PACKAGE_TYPE);
  3. RES_TABLE_PACKAGE_TYPE contains a header, the resource type string constant pool, the resource item name string constant pool, and multiple child chunks (RES_TABLE_TYPE_SPEC_TYPE and RES_TABLE_TYPE_TYPE);
  4. Every type of chunk has a header structure.

The structure of an ARSC file can be roughly expressed as the following pseudocode:

//---------------------------------------------------------------------------
// The arsc file is a chunk of type RES_TABLE_TYPE
RES_TABLE_TYPE {
    table_header            // File header
    RES_STRING_POOL_TYPE    // String constant pool chunk
    RES_TABLE_PACKAGE_TYPE  // Content chunk
}
//---------------------------------------------------------------------------
// String constant pool chunk
RES_STRING_POOL_TYPE {
    pool_header  // String constant pool header
    string[]     // Constant pool
}
//---------------------------------------------------------------------------
// Content chunk
RES_TABLE_PACKAGE_TYPE {
    package_header        // Header of this chunk
    RES_STRING_POOL_TYPE  // Resource type string constant pool, content: [anim, attr, bool, color, dimen, drawable, id, integer, interpolator, layout, mipmap, string, style]
    RES_STRING_POOL_TYPE  // Resource item name string constant pool
    // Resource type chunks: each type in the resource type string constant pool
    // above has one resource type chunk. Take drawable as an example.
    // drawable resource type chunk
    RES_TABLE_TYPE_SPEC_TYPE {
        spec_header  // spec header
        // drawable-mdpi
        RES_TABLE_TYPE_TYPE
        // drawable-hdpi
        RES_TABLE_TYPE_TYPE
        ...
    }
    // attr resource type chunk
    RES_TABLE_TYPE_SPEC_TYPE {
        RES_TABLE_TYPE_TYPE
        RES_TABLE_TYPE_TYPE {
            type_header  // type header
            // Resource item pool: resource name -> resource value
            ResName:ResValue
            ResName:ResValue
            ResName:ResValue
            ResName:ResTableMapEntry->[Res_value1, Res_value2]
            ResName:ResTableMapEntry->[Res_value1, Res_value2, Res_value3]
        }
        ...
    }
    ...
}
//---------------------------------------------------------------------------

The chunk header structure

Every chunk begins with a header structure, and every header structure in turn begins with a ResChunk_header. ResourceTypes.h defines ResChunk_header as follows:

/**
 * Header that appears at the front of every data chunk in a resource.
 */
struct ResChunk_header
{
	
    // Type identifier for this chunk. The meaning of this value depends
    // on the containing chunk.
    uint16_t type;

    // Size of the chunk header (in bytes). Adding this value to
    // the address of the chunk allows you to find its associated data
    // (if any).
    uint16_t headerSize;

    // Total size of this chunk (in bytes). This is the chunkSize plus
    // the size of any data associated with the chunk. Adding this value
    // to the chunk allows you to completely skip its contents (including
    // any child chunks). If this value is the same as chunkSize, there is
    // no data associated with the chunk.
    uint32_t size;
};

uint16_t: 16-bit unsigned integer (2 bytes); uint32_t: 32-bit unsigned integer (4 bytes)

Structural analysis:

  • type: the type of the chunk. Some of the defined values are:
enum {
   RES_NULL_TYPE               = 0x0000,
   RES_STRING_POOL_TYPE        = 0x0001,
   RES_TABLE_TYPE              = 0x0002,

   // Chunk types in RES_TABLE_TYPE
   RES_TABLE_PACKAGE_TYPE      = 0x0200,
   RES_TABLE_TYPE_TYPE         = 0x0201,
   RES_TABLE_TYPE_SPEC_TYPE    = 0x0202,
   RES_TABLE_LIBRARY_TYPE      = 0x0203
};
  • headerSize: the size of the chunk header, in bytes
  • size: the total size of the chunk, in bytes, including the header and any child chunks
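To make the 8-byte ResChunk_header concrete, here is a minimal sketch that reads its three fields from a buffer with java.nio.ByteBuffer. The class name ChunkHeaderSketch is hypothetical, and the code assumes the values are stored little-endian, as they are in arsc files produced by the Android build tools.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Reads the 8-byte ResChunk_header found at the start of every chunk. */
final class ChunkHeaderSketch {
    final int type;        // uint16_t type
    final int headerSize;  // uint16_t headerSize
    final long size;       // uint32_t size, widened so it stays unsigned

    private ChunkHeaderSketch(int type, int headerSize, long size) {
        this.type = type;
        this.headerSize = headerSize;
        this.size = size;
    }

    static ChunkHeaderSketch read(ByteBuffer buf, int offset) {
        // Work on a little-endian view so the caller's buffer state is untouched.
        ByteBuffer le = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int type = le.getShort(offset) & 0xFFFF;
        int headerSize = le.getShort(offset + 2) & 0xFFFF;
        long size = le.getInt(offset + 4) & 0xFFFFFFFFL;
        return new ChunkHeaderSketch(type, headerSize, size);
    }
}
```

For a real file, the buffer could come from `ByteBuffer.wrap(Files.readAllBytes(path))`, and `read(buf, 0)` should then report type 0x0002 (RES_TABLE_TYPE).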

ResTable_header

First, the file header is a ResTable_header structure:

struct ResTable_header
{
    struct ResChunk_header header;

    // The number of ResTable_package structures.
    uint32_t packageCount;
};

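Structural analysis:

  • header: of type ResChunk_header, with type equal to RES_TABLE_TYPE
  • packageCount: the number of ResTable_package structures in the arsc file, usually 1.

So the file header is 12 bytes long: the 8-byte ResChunk_header followed by the 4-byte packageCount.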
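The header comment on size says that adding it to a chunk's position skips the whole chunk. The sketch below uses that to list the direct children of the table chunk and to read packageCount. TableWalkSketch is a hypothetical name, the data is assumed to be little-endian, and sizes above 2 GB are not handled.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Walks the direct child chunks of the top-level RES_TABLE_TYPE chunk. */
final class TableWalkSketch {
    private static ByteBuffer le(ByteBuffer buf) {
        return buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
    }

    /** packageCount sits right after the 8-byte ResChunk_header. */
    static long packageCount(ByteBuffer arsc) {
        return le(arsc).getInt(8) & 0xFFFFFFFFL;
    }

    /** Returns the type code of every direct child chunk, in file order. */
    static List<Integer> childTypes(ByteBuffer arsc) {
        ByteBuffer b = le(arsc);
        int headerSize = b.getShort(2) & 0xFFFF;
        int tableSize = Math.min(b.getInt(4), b.limit());
        List<Integer> types = new ArrayList<>();
        int offset = headerSize;              // children start after the table header
        while (offset + 8 <= tableSize) {
            int childSize = b.getInt(offset + 4);
            if (childSize < 8) {
                break;                        // malformed chunk; stop rather than loop forever
            }
            types.add(b.getShort(offset) & 0xFFFF);
            offset += childSize;              // skip the child and everything inside it
        }
        return types;
    }
}
```

On a typical single-package file this should yield 0x0001 (the string pool) followed by 0x0200 (the package).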

StringPool

Next comes the string resource pool chunk. The string constant pool holds the contents of all string resources in the APK. This chunk consists of five parts:

  • ResStringPool_header: the header of the string constant pool
  • String Offset Array: the array of string offsets. Each element records the offset at which one string starts within the string content area. Each offset is 4 bytes, so the size of this region is (4 × stringCount) bytes
  • Style Offset Array: the array of string style offsets
  • String Content: the content area of the string constant pool. Each string in the pool ends with a terminator
  • Style Content: the content area for string styles

The parts that matter most here are ResStringPool_header, String Offset Array, and String Content.

First examine the string constant pool header, which is a ResStringPool_header structure:

struct ResStringPool_header
{
    struct ResChunk_header header;

    // Number of strings in this pool (number of uint32_t indices that follow
    // in the data).
    uint32_t stringCount;

    // Number of style span arrays in the pool (number of uint32_t indices
    // follow the string indices).
    uint32_t styleCount;

    // Flags.
    enum {
        // If set, the string index is sorted by the string values (based
        // on strcmp16()).
        SORTED_FLAG = 1<<0,

        // String pool is encoded in UTF-8
        UTF8_FLAG = 1<<8
    };
    uint32_t flags;

    // Index from header of the string data.
    uint32_t stringsStart;

    // Index from header of the style data.
    uint32_t stylesStart;
};

Structural analysis:

  • header: of type ResChunk_header, with type equal to RES_STRING_POOL_TYPE
  • stringCount: the number of strings in the constant pool
  • styleCount: the number of string styles in the constant pool
  • flags: 0, SORTED_FLAG, UTF8_FLAG, or a combination of the two, describing properties of the strings in the pool. If the SORTED_FLAG bit is 1 the strings are sorted; if the UTF8_FLAG bit is 1 the strings are encoded in UTF-8, otherwise in UTF-16
  • stringsStart: the offset of the string content area from the start of the constant pool header
  • stylesStart: the offset of the string style content area from the start of the constant pool header
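Putting stringCount, flags, stringsStart, and the offset array together, a single string can be located and decoded as in the sketch below. StringPoolSketch is a hypothetical name. One detail goes beyond what this article covers: each stored string is preceded by a short length prefix (a character count and then a byte count for UTF-8, a 16-bit unit count for UTF-16), which the code handles according to my understanding of AOSP's decoding logic. Treat it as an assumption-laden sketch that uses plain UTF-8 decoding, not a reference implementation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Decodes one string from a string pool chunk that starts at poolOffset. */
final class StringPoolSketch {
    static final int UTF8_FLAG = 1 << 8;

    static String stringAt(ByteBuffer buf, int poolOffset, int index) {
        ByteBuffer b = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int headerSize = b.getShort(poolOffset + 2) & 0xFFFF;
        int stringCount = b.getInt(poolOffset + 8);
        int flags = b.getInt(poolOffset + 16);
        int stringsStart = b.getInt(poolOffset + 20);
        if (index < 0 || index >= stringCount) {
            throw new IndexOutOfBoundsException("string index " + index);
        }
        // The offset array follows the header; each entry is relative to stringsStart.
        int pos = poolOffset + stringsStart + b.getInt(poolOffset + headerSize + 4 * index);

        if ((flags & UTF8_FLAG) != 0) {
            // UTF-8: character count, then byte count, each stored in 1 or 2 bytes.
            pos += (b.get(pos) & 0x80) != 0 ? 2 : 1;   // skip the character count
            int first = b.get(pos++) & 0xFF;
            int byteLen = first;
            if ((first & 0x80) != 0) {
                byteLen = ((first & 0x7F) << 8) | (b.get(pos++) & 0xFF);
            }
            return new String(copy(b, pos, byteLen), StandardCharsets.UTF_8);
        }
        // UTF-16: length in 16-bit units, stored in 1 or 2 uint16 values.
        int first = b.getShort(pos) & 0xFFFF;
        pos += 2;
        int charLen = first;
        if ((first & 0x8000) != 0) {
            charLen = ((first & 0x7FFF) << 16) | (b.getShort(pos) & 0xFFFF);
            pos += 2;
        }
        return new String(copy(b, pos, charLen * 2), StandardCharsets.UTF_16LE);
    }

    private static byte[] copy(ByteBuffer b, int pos, int len) {
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            out[i] = b.get(pos + i);
        }
        return out;
    }
}
```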

Package

Finally, examine the package chunk, which starts with a ResTable_package structure:

/**
 * A collection of resource data types within a package.  Followed by
 * one or more ResTable_type and ResTable_typeSpec structures containing the
 * entry values for each resource type.
 */
struct ResTable_package
{
    struct ResChunk_header header;
    
    // If this is a base package, its ID. Package IDs start
    // at 1 (corresponding to the value of the package bits in a
    // resource identifier). 0 means this is not a base package.
    uint32_t id;

    // Actual name of this package, \0-terminated.
    uint16_t name[128];

    // Offset to a ResStringPool_header defining the resource
    // type symbol table. If zero, this package is inheriting from
    // another base package (overriding specific values in it).
    uint32_t typeStrings;

    // Last index into typeStrings that is for public use by others.
    uint32_t lastPublicType;

    // Offset to a ResStringPool_header defining the resource
    // key symbol table. If zero, this package is inheriting from
    // another base package (overriding specific values in it).
    uint32_t keyStrings;

    // Last index into keyStrings that is for public use by others.
    uint32_t lastPublicKey;

    uint32_t typeIdOffset;
};

Structural analysis:

  • header: of type ResChunk_header, with type equal to RES_TABLE_PACKAGE_TYPE
  • id: the package ID. For an ordinary application package it is 0x7f; for the system resource package it is 0x01.
  • name: the package name
  • typeStrings: the offset of the resource type string pool from the start of this chunk's header
  • lastPublicType: per the header comment, the last index into the resource type string pool that is public; in practice it usually equals the size of that pool
  • keyStrings: the offset of the resource item name string pool from the start of this chunk's header
  • lastPublicKey: per the header comment, the last index into the resource item name string pool that is public; in practice it usually equals the size of that pool
  • typeIdOffset: not documented in the header; the value is normally 0
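The fixed-size fields of ResTable_package can be read at known offsets: id at byte 8, the 128-character UTF-16 name at byte 12, typeStrings at byte 268, and keyStrings at byte 276. Those offsets are computed from the struct layout shown above. PackageHeaderSketch is a hypothetical name and little-endian data is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Reads the main fields of a ResTable_package that starts at pkgOffset. */
final class PackageHeaderSketch {
    final int id;
    final String name;
    final int typeStrings;  // offset of the type string pool, relative to pkgOffset
    final int keyStrings;   // offset of the key string pool, relative to pkgOffset

    private PackageHeaderSketch(int id, String name, int typeStrings, int keyStrings) {
        this.id = id;
        this.name = name;
        this.typeStrings = typeStrings;
        this.keyStrings = keyStrings;
    }

    static PackageHeaderSketch read(ByteBuffer buf, int pkgOffset) {
        ByteBuffer b = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int id = b.getInt(pkgOffset + 8);

        // name is uint16_t[128], terminated by a zero character.
        StringBuilder name = new StringBuilder();
        for (int i = 0; i < 128; i++) {
            char c = b.getChar(pkgOffset + 12 + 2 * i);
            if (c == 0) {
                break;
            }
            name.append(c);
        }

        int typeStrings = b.getInt(pkgOffset + 268);  // 8 + 4 + 256
        int keyStrings = b.getInt(pkgOffset + 276);   // after lastPublicType
        return new PackageHeaderSketch(id, name.toString(), typeStrings, keyStrings);
    }
}
```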

The typeStrings and keyStrings fields above refer to the resource type string constant pool and the resource item name string constant pool. Both are of type ResStringPool and sit immediately after ResTable_package: first the Type String Pool, then the Key String Pool. Counting the string constant pool described earlier, the whole arsc file contains three string constant pools: the string resource constant pool, the resource type string constant pool, and the resource item name string constant pool.

For example:

<string name="tip">hello world</string>

This declares a resource whose type is string, whose name is tip, and whose value is hello world.

  • hello world is the string resource and is stored in the string resource constant pool;
  • string is the resource type and is stored in the resource type string constant pool;
  • tip is the resource item name and is stored in the resource item name string constant pool.

For the resource R.drawable.img, the resource type is drawable, the resource item name is img, and the file path of the resource is stored in the string resource constant pool.

ResTable_typeSpec and ResTable_type

Earlier we described the arsc file as a collection of Pair arrays, with one Pair[] for each resource type (anim, attr, drawable, string, and so on). In the real file, these Pair arrays are the ResTable_typeSpec and ResTable_type chunks.

In the arsc file there is one ResTable_typeSpec for each resource type, which describes how the resource items of that type differ across configurations. Each ResTable_typeSpec consists of a header and one or more ResTable_type chunks, and the number of ResTable_type chunks is determined by the number of adaptation types (configurations). For example, drawable, drawable-mdpi, and drawable-hdpi each have their own ResTable_type. Each ResTable_type consists of a header and an array of resource items, which is the Pair[] mentioned above.

For example, the ResTable_typeSpec and ResTable_type of drawable can be expressed as follows:

// drawable
RES_TABLE_TYPE_SPEC_TYPE {
	// drawable-mdpi
	RES_TABLE_TYPE_TYPE
	// drawable-hdpi
	RES_TABLE_TYPE_TYPE {
		ResChunk_header  // type header
		// Array of resource items: resource name -> resource value
		ResName->ResValue
		ResName->ResValue
		ResName->ResValue
		// ResName->ResTableMapEntry
		// ResName->ResTableMapEntry
		...
	}
	...
}

How are ResTable_typeSpec and ResTable_type actually laid out in the arsc file?

First look at the ResTable_typeSpec type:

struct ResTable_typeSpec
{
    struct ResChunk_header header;

    // The type identifier this chunk is holding. Type IDs start
    // at 1 (corresponding to the value of the type bits in a
    // resource identifier). 0 is invalid.
    uint8_t id;
    
    // Must be 0.
    uint8_t res0;
    // Must be 0.
    uint16_t res1;
    
    // Number of uint32_t entry configuration masks that follow.
    uint32_t entryCount;

    enum : uint32_t {
        // Additional flag indicating an entry is public.
        SPEC_PUBLIC = 0x40000000u,

        // Additional flag indicating an entry is overlayable at runtime.
        // Added in Android-P.
        SPEC_OVERLAYABLE = 0x80000000u,
    };
};

Structural analysis:

  • header: the chunk header, with type equal to RES_TABLE_TYPE_SPEC_TYPE
  • id: the ID of the resource type. This ID can be used to look up the type name in the resource type string constant pool (type IDs start at 1, so the pool index is id - 1), and it is the TT in 0xPPTTEEEE
  • res0, res1: reserved fields. The value is 0
  • entryCount: the number of resource items of this type. Items that share a name across configurations are counted once
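According to the struct comment, entryCount uint32_t configuration masks follow the ResTable_typeSpec header, one per resource item. The sketch below reads the type ID, the entry count, and one mask, and tests the SPEC_PUBLIC bit. TypeSpecSketch is a hypothetical name and little-endian data is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Reads fields of a ResTable_typeSpec chunk that starts at specOffset. */
final class TypeSpecSketch {
    static final int SPEC_PUBLIC = 0x40000000;

    private static ByteBuffer le(ByteBuffer buf) {
        return buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
    }

    /** The TT part of 0xPPTTEEEE. */
    static int typeId(ByteBuffer buf, int specOffset) {
        return le(buf).get(specOffset + 8) & 0xFF;
    }

    static int entryCount(ByteBuffer buf, int specOffset) {
        return le(buf).getInt(specOffset + 12);
    }

    /** Returns the configuration mask stored for one entry of this type. */
    static int configMask(ByteBuffer buf, int specOffset, int entryId) {
        ByteBuffer b = le(buf);
        int headerSize = b.getShort(specOffset + 2) & 0xFFFF;
        if (entryId < 0 || entryId >= b.getInt(specOffset + 12)) {
            throw new IndexOutOfBoundsException("entry " + entryId);
        }
        // The mask array starts right after the header, 4 bytes per entry.
        return b.getInt(specOffset + headerSize + 4 * entryId);
    }

    static boolean isPublic(int configMask) {
        return (configMask & SPEC_PUBLIC) != 0;
    }
}
```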

With the resource type covered, let's look at ResTable_type, which holds one adaptation type (configuration) and its resource items.

Again, start with the header:

struct ResTable_type
{
    struct ResChunk_header header;

    enum {
        NO_ENTRY = 0xFFFFFFFF
    };
    
    // The type identifier this chunk is holding. Type IDs start
    // at 1 (corresponding to the value of the type bits in a
    // resource identifier). 0 is invalid.
    uint8_t id;
    
    enum {
        // If set, the entry is sparse, and encodes both the entry ID and offset into each entry,
        // and a binary search is used to find the key. Only available on platforms >= O.
        // Mark any types that use this with a v26 qualifier to prevent runtime issues on older
        // platforms.
        FLAG_SPARSE = 0x01
    };
    uint8_t flags;

    // Must be 0.
    uint16_t reserved;
    
    // Number of uint32_t entry indices that follow.
    uint32_t entryCount;

    // Offset from header where ResTable_entry data starts.
    uint32_t entriesStart;

    // Configuration this collection of entries is designed for. This must always be last.
    ResTable_config config;
};

Structural analysis:

  • header: of type ResChunk_header, with type equal to RES_TABLE_TYPE_TYPE
  • id: the ID of the resource type, the same TT value as in ResTable_typeSpec
  • reserved: reserved field. The value is 0
  • entryCount: the number of resource items of this type, which is also the number of entry offsets that follow the header
  • entriesStart: the offset of the resource item data block from the start of this chunk's header
  • config: a ResTable_config structure that describes which adaptation type (configuration) this ResTable_type is for
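The fields above are enough to find where a given EntryId lives inside one ResTable_type: read its slot in the offset array that follows the header, then add entriesStart. The sketch below does that for the ordinary (non-sparse) layout only and returns -1 for NO_ENTRY. TypeChunkSketch is a hypothetical name and little-endian data is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Locates a ResTable_entry inside a ResTable_type chunk that starts at typeOffset. */
final class TypeChunkSketch {
    static final int NO_ENTRY = 0xFFFFFFFF;
    static final int FLAG_SPARSE = 0x01;

    /** Returns the absolute position of the entry, or -1 if this configuration has none. */
    static int entryPosition(ByteBuffer buf, int typeOffset, int entryId) {
        ByteBuffer b = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int headerSize = b.getShort(typeOffset + 2) & 0xFFFF;
        int flags = b.get(typeOffset + 9) & 0xFF;
        int entryCount = b.getInt(typeOffset + 12);
        int entriesStart = b.getInt(typeOffset + 16);

        if ((flags & FLAG_SPARSE) != 0) {
            throw new UnsupportedOperationException("sparse types are not handled in this sketch");
        }
        if (entryId < 0 || entryId >= entryCount) {
            return -1;
        }
        // One uint32_t offset per entry follows the header (which includes ResTable_config).
        int offset = b.getInt(typeOffset + headerSize + 4 * entryId);
        if (offset == NO_ENTRY) {
            return -1;
        }
        return typeOffset + entriesStart + offset;
    }
}
```

An entry that is missing here may still exist in another ResTable_type of the same resource type, for example when img.png is only provided in drawable-xxhdpi.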

Next, let's see how the resource item pool (an array of resource items) stores specific resources.

Resource items in the pool are stored in one of two forms:

  • Common resources: ResTable_entry + Res_value
  • Bag resources: ResTable_map_entry (which extends ResTable_entry) + ResTable_map * n

ResTable_entry points to the resource item name and flags whether the resource is a bag resource. Res_value and ResTable_map hold the actual resource values.

Finally, take a look at the internal structure of ResTable_entry, Res_value, and ResTable_map_entry.

Start with ResTable_entry:

struct ResTable_entry
{
    // Number of bytes in this structure.
    uint16_t size;

    enum {
        // If set, this is a complex entry, holding a set of name/value
        // mappings.  It is followed by an array of ResTable_map structures.
        FLAG_COMPLEX = 0x0001,
        // If set, this resource has been declared public, so libraries
        // are allowed to reference it.
        FLAG_PUBLIC = 0x0002,
        // If set, this is a weak resource and may be overriden by strong
        // resources of the same name/type. This is only useful during
        // linking with other resource tables.
        FLAG_WEAK = 0x0004
    };
    uint16_t flags;
    
    // Reference into ResTable_package::keyStrings identifying this entry.
    struct ResStringPool_ref key;
};

Structural analysis:

  • size: the size of the resource item header, in bytes.
  • flags: the flag bits of the resource item. If the FLAG_COMPLEX bit is set, the resource is a bag resource item and ResTable_entry is followed by an array of ResTable_map structures that hold its contents; otherwise ResTable_entry is followed by a single Res_value. If the resource item can be referenced by other packages, the FLAG_PUBLIC bit is 1.
  • key: the index of the resource item name in the resource item name string pool.
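The flag test described above is a simple bit check. The sketch below reads a ResTable_entry and reports whether it is a bag resource and where the data that follows it begins (the position right after size bytes). EntrySketch is a hypothetical name and little-endian data is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Reads a ResTable_entry located at entryOffset. */
final class EntrySketch {
    static final int FLAG_COMPLEX = 0x0001;
    static final int FLAG_PUBLIC = 0x0002;

    final int size;           // uint16_t size of this header structure
    final int flags;          // uint16_t flags
    final int keyIndex;       // index into the resource item name string pool
    final int payloadOffset;  // where the Res_value or ResTable_map array begins

    private EntrySketch(int size, int flags, int keyIndex, int payloadOffset) {
        this.size = size;
        this.flags = flags;
        this.keyIndex = keyIndex;
        this.payloadOffset = payloadOffset;
    }

    static EntrySketch read(ByteBuffer buf, int entryOffset) {
        ByteBuffer b = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int size = b.getShort(entryOffset) & 0xFFFF;
        int flags = b.getShort(entryOffset + 2) & 0xFFFF;
        int keyIndex = b.getInt(entryOffset + 4);
        return new EntrySketch(size, flags, keyIndex, entryOffset + size);
    }

    /** True for bag resources, which are followed by ResTable_map structures. */
    boolean isBag() {
        return (flags & FLAG_COMPLEX) != 0;
    }

    boolean isPublic() {
        return (flags & FLAG_PUBLIC) != 0;
    }
}
```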

ResTable_entry gives us the name of the resource item. Next, look at the resource value, Res_value:

struct Res_value
{
    // Number of bytes in this structure.
    uint16_t size;
    
    // Always set to 0.
    uint8_t res0;
    
    uint8_t dataType;
    // The data for this item, as interpreted according to dataType.
    typedef uint32_t data_type;
    
    data_type data;
};


Structural analysis:

  • size: the size of Res_value, in bytes
  • res0: reserved field. The value is 0
  • dataType: the type of the data. The possible values (string, dimension, and so on) are an enumeration defined in ResourceTypes.h
  • data: the data itself, interpreted according to dataType. If the type is string, the value is an index into the string resource pool
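How data is read depends entirely on dataType. The sketch below handles a few common cases. The numeric type codes (0x01 reference, 0x03 string, 0x10 decimal integer, 0x12 boolean) are taken from the enumeration in ResourceTypes.h as I recall it, so verify them against the header before relying on them. ResValueSketch is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Produces a human-readable description of a Res_value located at valueOffset. */
final class ResValueSketch {
    static final int TYPE_REFERENCE = 0x01;   // data is another resource ID
    static final int TYPE_STRING = 0x03;      // data is an index into the string resource pool
    static final int TYPE_INT_DEC = 0x10;     // data is a plain integer
    static final int TYPE_INT_BOOLEAN = 0x12; // data is 0 (false) or non-zero (true)

    static String describe(ByteBuffer buf, int valueOffset) {
        ByteBuffer b = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        // Layout: uint16 size, uint8 res0, uint8 dataType, uint32 data.
        int dataType = b.get(valueOffset + 3) & 0xFF;
        int data = b.getInt(valueOffset + 4);
        switch (dataType) {
            case TYPE_REFERENCE:
                return String.format("@0x%08x", data);
            case TYPE_STRING:
                return "string pool index " + data;
            case TYPE_INT_DEC:
                return Integer.toString(data);
            case TYPE_INT_BOOLEAN:
                return Boolean.toString(data != 0);
            default:
                return String.format("type 0x%02x, data 0x%08x", dataType, data);
        }
    }
}
```

For R.drawable.img, the value would be of the string type, and the index would lead to the file path stored in the string resource pool.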

Finally, look at how bag resources are stored, using ResTable_map_entry, ResTable_map, and ResTable_ref. Clearly, the storage structure of a bag resource does not fit the Pair model described earlier; the Pair model was introduced only to help explain how non-bag resources are stored.

struct ResTable_map_entry : public ResTable_entry
{
    // Resource identifier of the parent mapping, or 0 if there is none.
    // The resource ID of the parent ResTable_map_entry, which is equal to 0 if there is no parent ResTable_map_entry
    ResTable_ref parent;
    // Number of name/value pairs that follow for FLAG_COMPLEX.
    // The number of bag entries
    uint32_t count;
};

struct ResTable_map
{
	// The resource item ID of bag
    ResTable_ref name;
    
    // This mapping's value.
    // The value of the resource item in bag
    Res_value value;
};

struct ResTable_ref
{
    uint32_t ident;
};
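Based on the three structures above, a bag resource can be read by taking count from ResTable_map_entry and then stepping through count ResTable_map records of 12 bytes each (a 4-byte name reference plus an 8-byte Res_value). The sketch below returns each item as a {name, dataType, data} triple. BagSketch is a hypothetical name, and little-endian data with the usual 8-byte Res_value is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Reads the items of a bag resource whose ResTable_map_entry starts at entryOffset. */
final class BagSketch {
    static final int FLAG_COMPLEX = 0x0001;
    static final int MAP_SIZE = 12; // ResTable_ref name (4 bytes) + Res_value (8 bytes)

    /** Returns one {name ident, dataType, data} triple per ResTable_map. */
    static List<int[]> readItems(ByteBuffer buf, int entryOffset) {
        ByteBuffer b = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int entrySize = b.getShort(entryOffset) & 0xFFFF; // size of ResTable_map_entry
        int flags = b.getShort(entryOffset + 2) & 0xFFFF;
        if ((flags & FLAG_COMPLEX) == 0) {
            throw new IllegalArgumentException("not a bag entry");
        }
        // ResTable_entry (8 bytes), then parent (4 bytes), then count (4 bytes).
        int count = b.getInt(entryOffset + 12);

        List<int[]> items = new ArrayList<>();
        int pos = entryOffset + entrySize;
        for (int i = 0; i < count; i++) {
            int name = b.getInt(pos);                // ResTable_ref ident
            int dataType = b.get(pos + 7) & 0xFF;    // Res_value.dataType
            int data = b.getInt(pos + 8);            // Res_value.data
            items.add(new int[] {name, dataType, data});
            pos += MAP_SIZE;
        }
        return items;
    }
}
```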

This completes the analysis of the resources.arsc file.

Closing

The overall structure of the arsc file is not very complex. android-chunk-utils is a tool written in Java for parsing arsc files, and reading it helps in understanding the arsc file structure.

Recommended reading

1. ResourceTypes.h

2. Analysis of the compilation and packaging process of Android application resources

3. Analyzing resources.arsc by hand on Android

4. Android reverse engineering notes: arsc file format parsing