The author is a senior Python engineer at Ele.me who has dabbled in file systems. He is said to write C++ well, though how well is unknown.

  • Introduction
  • 1.MBR
    • 1.1 Composition of MBR
    • 1.2 Partition entry structure in MBR
    • 1.3 MBR restrictions
  • 2.GPT
    • 2.1 Composition of GPT
    • 2.2 Protective MBR
    • 2.3 EFI part
      • 2.3.1 EFI information area data structure
      • 2.3.2 Partition table
      • 2.3.3 Backing up the partition table and the GPT header
    • 2.4 GPT advantages
  • Tools

Introduction

MBR and GPT are two disk partition table formats. The partitions described by the table hold the file systems we use every day, such as NTFS, FAT32, and EXT. Although MBR is no longer the common choice, GPT still places an MBR at the very start of its layout for compatibility with tools that only understand the older format. This article briefly introduces MBR first and then the structure of GPT.

This article includes some structure definitions, which the author extracted and adapted from 010 Editor templates. They are convenient in code: once the data buffer pointer is cast to a pointer to the structure, fields can be read through it directly, with no need to compute byte offsets by hand. (Use 1-byte structure alignment to avoid unnecessary trouble.)

1.MBR

1.1 Composition of MBR

The first sector of the hard disk holds the MBR, which consists of the master boot code and its data area, the partition table (containing four partition entries), and the end signature.

The list below describes each part:

  1. The master boot code and data area occupy the first 446 bytes. The BIOS reads and executes this code; if it is damaged and cannot be read from the disk, the system cannot boot.
  2. The partition table holds four 16-byte partition entries (64 bytes in total), covered in more detail later.
  3. The end signature is the last 2 bytes of the MBR (55 AA on disk); if it is wrong, the MBR is treated as invalid.

In the structure below, bootinst is the boot code and data area, partitions holds the four partition entries (the 16-byte mbr_table type is described in section 1.2), and signature is the end signature.

typedef struct {
    char    bootinst[446];   /* space to hold actual boot code */
    mbr_table partitions[4];
    unsigned short  signature;/* set to 0xAA55 to indicate PC MBR format */
} mbr_head;

Disk Data Screenshot
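The buffer-to-structure approach mentioned in the introduction can be sketched as follows. This is only a minimal sketch: it assumes a little-endian host and a compiler that supports `#pragma pack` (GCC, Clang, and MSVC do). The names `mbr_entry_raw`, `mbr_sector_raw`, and `mbr_is_valid` are hypothetical and not part of the 010 Editor template. Each partition entry is kept as a raw 16-byte array so the sketch does not depend on the mbr_table definition.

```c
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)            /* 1-byte alignment: no padding between members */
typedef struct {
    uint8_t raw[16];             /* one partition entry, left undecoded here */
} mbr_entry_raw;

typedef struct {
    uint8_t       bootinst[446]; /* boot code and data area */
    mbr_entry_raw partitions[4]; /* partition table */
    uint16_t      signature;     /* reads as 0xAA55 on a little-endian host */
} mbr_sector_raw;
#pragma pack(pop)

/* Returns 1 if the 512-byte buffer ends with the MBR signature, else 0. */
int mbr_is_valid(const uint8_t sector[512])
{
    mbr_sector_raw mbr;

    if (sizeof(mbr) != 512)      /* packing did not take effect */
        return 0;
    /* memcpy instead of a pointer cast avoids alignment and aliasing issues */
    memcpy(&mbr, sector, sizeof(mbr));
    return mbr.signature == 0xAA55;
}
```

Copying into a local structure costs 512 bytes of stack but stays well-defined even when the buffer is not suitably aligned, which a direct pointer cast does not guarantee.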

1.2 Partition entry structure in MBR

The following table explains the 16 bytes of a single partition entry in the MBR:

Bytes            Content and meaning
1                Boot flag. 80H means the partition is active; 00H means it is inactive.
2, 3, 4          Starting head, sector, and cylinder of the partition. Head: byte 2. Sector: low 6 bits of byte 3. Cylinder: high 2 bits of byte 3 plus the 8 bits of byte 4.
5                Partition type. 00H: unused (not specified); 06H: FAT16 basic partition; 0BH: FAT32 basic partition; 05H: extended partition; 07H: NTFS partition; 0FH: extended partition (LBA mode); 83H: Linux partition; and so on.
6, 7, 8          Ending head, sector, and cylinder. Head: byte 6. Sector: low 6 bits of byte 7. Cylinder: high 2 bits of byte 7 plus the 8 bits of byte 8.
9, 10, 11, 12    First sector of the partition.
13, 14, 15, 16   Total number of sectors in the partition.
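The head/sector/cylinder packing in the table can be decoded directly from the three raw bytes. The following is a minimal sketch; the `chs_addr` type and the `chs_decode` function are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

typedef struct {
    unsigned head;      /* 0..255 */
    unsigned sector;    /* 6-bit value; sector numbers start at 1 */
    unsigned cylinder;  /* 0..1023 */
} chs_addr;

/* Decodes a 3-byte CHS field of a partition entry (bytes 2-4 or bytes 6-8). */
chs_addr chs_decode(const uint8_t b[3])
{
    chs_addr a;

    a.head     = b[0];
    a.sector   = b[1] & 0x3F;                            /* low 6 bits */
    a.cylinder = ((unsigned)(b[1] & 0xC0) << 2) | b[2];  /* high 2 bits + 8 bits */
    return a;
}
```

For example, the bytes FE FF FF that LBA-mode disks commonly store decode to head 254, sector 63, cylinder 1023.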

The number after the colon in the following structure is a width in bits. An unsigned short occupies 2 bytes, 16 bits in total:

  1. unsigned short begsect : 6; takes the first 6 bits

  2. unsigned short begcyl : 10; takes the remaining 10 bits

Proofreader's note

type [name] : bitsize tells the compiler to use only bitsize bits to store the member. Most compilers pack two adjacent members into one storage unit when both are bit-fields, have the same type, and their combined width does not exceed the width of that type. Here begsect and begcyl are both unsigned short bit-fields and their widths add up to 16 bits, exactly the width of an unsigned short, so the two are packed into a single unsigned short that takes up 2 bytes.
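The packing behavior described in the note can be illustrated with a small sketch. Bit-field layout is implementation-defined, so the structure below only shows what common compilers (GCC, Clang, MSVC on little-endian targets) do, while the two helper functions give a layout-independent equivalent. All names here (`packed_pair`, `low6_of`, `high10_of`) are hypothetical.

```c
#include <stdint.h>

/* Two bit-fields that share one 16-bit storage unit. On the common
   compilers named above the first field lands in the low bits. */
typedef struct {
    unsigned short low6   : 6;
    unsigned short high10 : 10;
} packed_pair;

/* Portable equivalents that do not rely on bit-field layout. */
unsigned low6_of(uint16_t word)
{
    return word & 0x3F;      /* bits 0..5 */
}

unsigned high10_of(uint16_t word)
{
    return word >> 6;        /* bits 6..15 */
}
```

When the exact on-disk layout matters, the shift-and-mask form is the safer choice because it behaves the same on every compiler.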

typedef struct {
    unsigned char  bootid;         /* bootable? 0=no, 128=yes */
    unsigned char beghead ;    /* beginning head number */
    unsigned short begsect : 6;    /* beginning sector number */
    unsigned short begcyl  : 10;   /* 10 bit nmbr */
    unsigned char  systid;         /* Operating System type indicator code */
    unsigned char endhead ;    /* ending head number */
    unsigned short endsect : 6;    /* ending sector number */
    unsigned short endcyl  : 10;   /* also a 10 bit nmbr */
    unsigned int relsect;          /* first sector relative to start of disk */
    unsigned int numsect;          /* number of sectors in partition */
} mbr_table;

Disk Data Screenshot

1.3 MBR restrictions

The MBR stores four partition entries, called primary partitions. Partition boundaries are expressed in cylinder/head/sector (CHS) notation. In the preceding structure, beghead, begsect, and begcyl give the head, sector, and cylinder where the partition begins, and endhead, endsect, and endcyl give the head, sector, and cylinder where it ends. Each sector is 512 bytes and the total number of addressable sectors is 2^10 × 2^8 × 2^6 (i.e., 2^24), so CHS can describe at most 8 GB (2^24 × 512 = 2^33 bytes) of disk space.

Proofreader's note

Readers with MBR partitioning experience may wonder why only 8 GB can be represented, when partitions of several hundred GB are routinely created during installation. The reason is that modern BIOSes use LBA mode for disks larger than 8 GB: the CHS fields are usually set to 0xFEFFFF and ignored, and the 4-byte start sector at offsets 0x08 to 0x0B of the entry is used instead. In other words, relsect (32 bits) gives the start sector number, and relsect + numsect marks the end of the partition. Because these addresses are 32 bits wide, the usable disk size cannot exceed 2 TB.
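The two limits follow from simple arithmetic on the field widths. The sketch below assumes 10-bit cylinder, 8-bit head, and 6-bit sector fields and counts every bit pattern as usable; the function names are hypothetical.

```c
#include <stdint.h>

/* Largest capacity addressable through the CHS fields of a partition entry. */
uint64_t chs_limit_bytes(uint32_t sector_size)
{
    uint64_t sectors = (1ULL << 10) * (1ULL << 8) * (1ULL << 6);  /* 2^24 */

    return sectors * sector_size;
}

/* Largest capacity addressable through a 32-bit sector number. */
uint64_t lba32_limit_bytes(uint32_t sector_size)
{
    return (1ULL << 32) * sector_size;
}
```

With 512-byte sectors these give 2^33 bytes (8 GB) and 2^41 bytes (2 TB). The practical CHS limit is slightly lower, since sector numbers start at 1 and only 63 of the 64 sector values are usable.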

Besides the 2 TB limit, MBR has other drawbacks. The main one is the limit of four primary partitions. A common workaround is to set aside one primary partition as a placeholder (called an extended partition) that holds any number of additional partitions (called logical partitions).

MBR also has data integrity problems. It is a single data structure with no backup, so it is vulnerable to operator error and disk failure. In addition, because logical partitions are defined in a linked-list structure, if the record of one logical partition is corrupted, the logical partitions after it become inaccessible.


2.GPT

2.1 Composition of GPT

Let’s first look at the overall structure of GPT to get a general impression.

GPT structure

GPT structured data

2.2 Protective MBR

The protective MBR (see LBA 0: Protective MBR in the GPT structure diagram) sits at the head of the GPT disk. Its data is as follows.

Only the first partition entry of the protective MBR has a value. The other partition entries are empty.

  1. The partition type (systid) of the first partition entry is 0xEE.

  2. The start sector (relsect) is 1, i.e. LBA1, where the GPT header is located. The sector count is the numsect field from the previous section; for disks smaller than 2 TB it covers the entire disk. Because the field has only 4 bytes and can express at most 2 TB, it is fixed at FFFFFFFFH on larger disks.

The first partition entry of the protective MBR thus describes the entire GPT disk as a single partition. This keeps disk tools that do not recognize GPT from treating the disk as empty and formatting it, which is why this sector is called the “protective MBR”. EFI does not actually use this partition table at all.
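A check for a protective MBR can be sketched as follows. It is a deliberately strict sketch under stated assumptions: the first entry has type 0xEE and starts at LBA1 (the value given in the UEFI specification), and the other three entries are all zero, so hybrid MBRs that carry extra entries are rejected. The names `rd_le32` and `is_protective_mbr` are hypothetical.

```c
#include <stdint.h>

/* Reads a 32-bit little-endian value from a byte buffer. */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if sector 0 looks like a protective MBR, else 0. */
int is_protective_mbr(const uint8_t sector[512])
{
    const uint8_t *entry = sector + 446;    /* first partition entry */
    int i;

    if (sector[510] != 0x55 || sector[511] != 0xAA)
        return 0;                           /* no MBR signature */
    if (entry[4] != 0xEE)
        return 0;                           /* partition type is not GPT protective */
    if (rd_le32(entry + 8) != 1)
        return 0;                           /* does not start at LBA1 */
    for (i = 16; i < 64; i++)               /* entries 2 to 4 must be empty */
        if (entry[i] != 0)
            return 0;
    return 1;
}
```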

2.3 EFI part

The EFI part can be divided into four areas: the EFI information area (GPT header), the partition table, the GPT partition area, and the backup area.

  1. EFI information area (GPT header): starts at LBA1 and usually occupies only that single sector. It defines the location and size of the partition table. The GPT header also contains checksums of the header itself and of the partition table so that errors can be detected promptly.

  2. Partition table: contains the partition entries. Its location is defined by the GPT header and it generally occupies sectors LBA2 to LBA33. Each partition entry consists of a start address, an end address, a type value, a name, attribute flags, and a GUID. Once the partition table is created, each 128-bit GUID uniquely identifies its partition.

  3. GPT partition area: the largest area, made up of the sectors allocated to partitions. Its start and end addresses are defined by the GPT header.

  4. Backup area: located at the tail of the disk, it holds backups of the GPT header and the partition table and occupies the last 33 sectors. The very last sector backs up the EFI information area (GPT header) from LBA1, and the other 32 sectors back up the partition table from LBA2 to LBA33.

The EFI information area (GPT header), partition table, and backup area are discussed separately below.
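The positions of the four areas can be derived from the disk size. The sketch below is illustrative only: it assumes the conventional arrangement described above (header at LBA1, table right after it, backups at the tail) and a disk large enough to hold both copies; a real disk should be read through its GPT header instead. The `gpt_layout` type and `gpt_default_layout` function are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    uint64_t primary_header;   /* LBA of the GPT header */
    uint64_t primary_table;    /* first LBA of the partition table */
    uint64_t first_usable;     /* first LBA available to partitions */
    uint64_t last_usable;      /* last LBA available to partitions */
    uint64_t backup_table;     /* first LBA of the backup partition table */
    uint64_t backup_header;    /* LBA of the backup GPT header */
} gpt_layout;

gpt_layout gpt_default_layout(uint64_t total_sectors, uint32_t entries,
                              uint32_t entry_size, uint32_t sector_size)
{
    gpt_layout l;
    /* sectors needed for the partition table, rounded up */
    uint64_t table_sectors =
        ((uint64_t)entries * entry_size + sector_size - 1) / sector_size;

    l.primary_header = 1;
    l.primary_table  = 2;
    l.first_usable   = 2 + table_sectors;
    l.backup_header  = total_sectors - 1;
    l.backup_table   = l.backup_header - table_sectors;
    l.last_usable    = l.backup_table - 1;
    return l;
}
```

With 128 entries of 128 bytes and 512-byte sectors the table takes 32 sectors, which gives LBA34 as the first usable sector and reserves the last 33 sectors for the backup area.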

2.3.1 EFI information area data structure

Offset (hex)   Bytes   Description (integers are little-endian)
00 ~ 07        8       GPT header signature "45 46 49 20 50 41 52 54" (ASCII: EFI PART)
08 ~ 0B        4       Version number, currently 1.0, stored as "00 00 01 00"
0C ~ 0F        4       Size of the GPT header in bytes, typically "5C 00 00 00" (0x5C), i.e. 92 bytes
10 ~ 13        4       CRC32 checksum of the GPT header (computed with this field treated as zero)
14 ~ 17        4       Reserved, must be "00 00 00 00"
18 ~ 1F        8       Sector number of the EFI information area (GPT header) itself, usually "01 00 00 00 00 00 00 00", i.e. LBA1
20 ~ 27        8       Sector number of the backup EFI information area (GPT header), i.e. the end sector of the EFI part; usually the last sector of the disk
28 ~ 2F        8       Start sector of the GPT partition area, usually "22 00 00 00 00 00 00 00" (0x22), i.e. LBA34
30 ~ 37        8       End sector of the GPT partition area, usually the 34th sector from the end of the disk
38 ~ 47        16      Disk GUID (globally unique identifier, synonymous with UUID)
48 ~ 4F        8       Start sector of the partition table, usually "02 00 00 00 00 00 00 00" (0x02), i.e. LBA2
50 ~ 53        4       Total number of partition entries, usually "80 00 00 00" (0x80), i.e. 128 entries
54 ~ 57        4       Size of each partition entry in bytes, usually "80 00 00 00" (0x80), i.e. 128 bytes
58 ~ 5B        4       CRC32 checksum of the partition table
5C ~           *       Reserved, normally filled with zeros to the end of the sector

The number of partition entries is set by the header fields above, so it is configurable, although it is generally 128. The MBR, by contrast, has only four partition entries.

typedef struct {
    BYTE   SIGNATURE[8];
    DWORD  Revision;
    DWORD  Headersize;
    DWORD  CRC32OfHeader;
    DWORD  Reserved;
    UINT64 CurrentLBA;
    UINT64 BackupLBA; //location of the other header copy
    UINT64 FirstUsableLBA; //primary partition table last LBA+1
    UINT64 LastUsableLBA; //secondary partition table first LBA-1
    BYTE DiskGUID[16];
    UINT64 PartitionEntries;
    DWORD  NumOfPartitions;
    DWORD  SizeOfPartitionEntry;
    DWORD  CRC32ofPartitionArray;
    BYTE reserved[420];
} gpt_head_table;

GPT header data
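The signature and header checksum from the table can be verified with a short routine. This is a sketch, not a full validator: it reads fields by byte offset, assumes a 512-byte sector, and uses a plain bitwise CRC-32 (reflected polynomial 0xEDB88320, the same variant zlib uses). The names `crc32_calc`, `rd_le32`, and `gpt_header_ok` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Reads a 32-bit little-endian value from a byte buffer. */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the sector holds a GPT header with a valid signature,
   a plausible header size, and a matching header CRC. */
int gpt_header_ok(const uint8_t sector[512])
{
    uint8_t copy[512];
    uint32_t size, stored;

    if (memcmp(sector, "EFI PART", 8) != 0)
        return 0;
    size = rd_le32(sector + 0x0C);
    if (size < 92 || size > 512)
        return 0;
    stored = rd_le32(sector + 0x10);
    memcpy(copy, sector, size);
    memset(copy + 0x10, 0, 4);      /* the CRC field counts as zero */
    return crc32_calc(copy, size) == stored;
}
```

The checksum covers only the number of bytes given by the header size field (normally 92), not the reserved remainder of the sector.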

2.3.2 Partition table

Partition entry structure

Offset (hex)   Bytes   Description (integers are little-endian)
00 ~ 0F        16      Partition type, as a GUID
10 ~ 1F        16      Unique identifier of the partition, as a GUID
20 ~ 27        8       Start sector of the partition, as an LBA value
28 ~ 2F        8       End sector of the partition (inclusive), as an LBA value; usually odd
30 ~ 37        8       Attribute flags of the partition
38 ~ 7F        72      Human-readable partition name in UTF-16LE, up to 36 code units long

PartitionStartLBA (the start sector) and PartitionEndLBA (the end sector) are both 64 bits wide, up from the 24-bit CHS fields of the MBR, and CRC checksums have been added.
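Reading the start sector, end sector, and name from a raw 128-byte entry can be sketched as below. The offsets come from the table above; the helper names (`rd_le64`, `gpt_entry_sectors`, `gpt_entry_name_ascii`) are hypothetical, and the name conversion is a simplification that keeps only printable ASCII.

```c
#include <stdint.h>
#include <stddef.h>

/* Reads a 64-bit little-endian value from a byte buffer. */
static uint64_t rd_le64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Number of sectors in the partition; the end LBA is inclusive. */
uint64_t gpt_entry_sectors(const uint8_t entry[128])
{
    uint64_t start = rd_le64(entry + 0x20);
    uint64_t end   = rd_le64(entry + 0x28);

    return end >= start ? end - start + 1 : 0;
}

/* Copies the UTF-16LE name into an ASCII buffer. Code units outside
   printable ASCII become '?'. out must hold at least 37 bytes. */
void gpt_entry_name_ascii(const uint8_t entry[128], char out[37])
{
    size_t i;

    for (i = 0; i < 36; i++) {
        unsigned unit = entry[0x38 + 2 * i] |
                        ((unsigned)entry[0x39 + 2 * i] << 8);
        if (unit == 0)
            break;                   /* name is NUL-padded */
        out[i] = (unit >= 0x20 && unit < 0x7F) ? (char)unit : '?';
    }
    out[i] = '\0';
}
```

Because the end sector is inclusive, a partition that starts at LBA 2048 and ends at LBA 4095 spans 2048 sectors.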

typedef struct {
    BYTE PartitionTypeGUID[16];
    BYTE PartitionGUID[16];
    UINT64 PartitionStartLBA;
    UINT64 PartitionEndLBA;
    UINT64 PartitionProperty;
    wchar_t PartitionName[36]; //UTF-16LE, 72 bytes on disk (assumes a 2-byte wchar_t)
} gpt_paptition_table;

The following table lists common PartitionTypeGUID values.

OS        GUID                                   Meaning
None      00000000-0000-0000-0000-000000000000   Unused entry
None      024DEE41-33E7-11D3-9D69-0008C781F39F   MBR partition table
None      C12A7328-F81F-11D2-BA4B-00A0C93EC93B   EFI System Partition (ESP)
None      21686148-6449-6E6F-744E-656564454649   BIOS boot partition; the corresponding ASCII string is "Hah!IdontNeedEFI"
None      D3BFE2DE-3DAF-11DF-BA40-E3A556D89593   Intel Fast Flash (iFFS) partition (for Intel Rapid Start technology)
Windows   E3C9E316-0B5C-4DB8-817D-F92DF00215AE   Microsoft Reserved Partition
Windows   EBD0A0A2-B9E5-4433-87C0-68B6B72699C7   Basic data partition
Windows   DE94BBA4-06D1-4D40-A16A-BFD50179D6AC   Windows Recovery Environment
Linux     0FC63DAF-8483-4772-8E79-3D69D8477DE4   Data partition. Linux used to share the GUID of the Windows basic data partition. This newer GUID was introduced by the developers of GPT fdisk and GNU Parted, based on the traditional Linux partition code "8300".
Linux     44479540-F297-41B2-9AF7-D131D5F0458A   x86 root partition (/). Defined by systemd; allows automatic mounting without fstab.
Linux     4F68BCE3-E8CD-4DB1-96E7-FBCAF984B709   x86-64 root partition (/). Defined by systemd; allows automatic mounting without fstab.
Linux     3B8F8425-20E0-4F3B-907F-1A25A76F98E8   Server data partition (/srv). Defined by systemd; allows automatic mounting without fstab.
Linux     933AC7E1-2EB4-4F13-B844-0E14E2AEF915   Home partition (/home). Defined by systemd; allows automatic mounting without fstab.
Linux     0657FD6D-A4AB-43C4-84E5-0933C84B4F4F   Swap partition. Not defined by systemd, but can also be mounted automatically without fstab.
Linux     A19D880F-05FC-4D3B-A006-743F0F84911E   RAID partition
Linux     E6D6D379-F507-44C2-A23C-238F2A3DF928   Logical Volume Manager (LVM) partition
Linux     8DA63339-0007-60C0-C436-083AC8230908   Reserved
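When comparing on-disk bytes with the textual GUIDs in this table, note that the first three GUID fields are stored little-endian while the last two are stored in the order written. The following sketch formats the 16 raw bytes as text; `guid_to_string` is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats 16 on-disk GUID bytes as canonical text. The first three
   fields are byte-swapped (little-endian on disk), the last two are
   copied in order. out must hold at least 37 bytes. */
void guid_to_string(const uint8_t g[16], char out[37])
{
    snprintf(out, 37,
             "%02X%02X%02X%02X-%02X%02X-%02X%02X-%02X%02X-"
             "%02X%02X%02X%02X%02X%02X",
             g[3], g[2], g[1], g[0], g[5], g[4], g[7], g[6],
             g[8], g[9], g[10], g[11], g[12], g[13], g[14], g[15]);
}
```

For instance, an EFI System Partition entry begins with the bytes 28 73 2A C1 1F F8 D2 11 BA 4B 00 A0 C9 3E C9 3B, which this function renders as C12A7328-F81F-11D2-BA4B-00A0C93EC93B.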

The 64-bit attribute field is further subdivided: the low bits (0 to 47) hold attributes that are independent of the partition type, and the high 16 bits (48 to 63) hold attributes specific to the partition type. Microsoft currently uses the following attributes:

Bit   Meaning
0     System partition (disk partitioning tools must leave this partition as it is and not modify it)
1     EFI hidden partition (invisible to EFI)
2     Legacy BIOS bootable partition flag
60    Read-only
62    Hidden
63    Do not mount automatically, i.e. do not assign a drive letter automatically
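Testing these bits on the 64-bit attribute field is straightforward. The macro and function names below are hypothetical labels for the bits listed in the table, not identifiers from any particular library.

```c
#include <stdint.h>

#define GPT_ATTR_SYSTEM        (1ULL << 0)   /* system partition */
#define GPT_ATTR_EFI_HIDDEN    (1ULL << 1)   /* invisible to EFI */
#define GPT_ATTR_BIOS_BOOTABLE (1ULL << 2)   /* legacy BIOS bootable */
#define GPT_ATTR_READ_ONLY     (1ULL << 60)
#define GPT_ATTR_HIDDEN        (1ULL << 62)
#define GPT_ATTR_NO_AUTOMOUNT  (1ULL << 63)

/* Returns 1 if every bit in mask is set in the attribute field. */
int gpt_attr_has(uint64_t attributes, uint64_t mask)
{
    return (attributes & mask) == mask;
}
```

The `ULL` suffix matters here: shifting a plain `int` by 60 or more bits would be undefined behavior.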

Partition entry data

2.3.3 Backing up the partition table and the GPT header

The backup partition table and backup GPT header are stored at the end of the disk. They have the same structure as the primary partition table and GPT header and serve as their backup.

2.4 GPT advantages

GPT removes the 2 TB barrier by widening the start and end sector fields of each partition entry to 64 bits, and it raises the number of partitions well beyond four. Unlike MBR, which is a single data structure, GPT keeps a backup partition table and CRC checksums, which effectively guard against structure corruption.

Tools

Tool used in this article: 010 Editor, a hexadecimal editor





