Block device driver management code:

| File name | Location | Function |
| --- | --- | --- |
| blk.h | linux-0.12/kernel/blk_drv | Block device header file |
| ll_rw_blk.c | linux-0.12/kernel/blk_drv | Interface through which other programs access block devices |
| hd.c | linux-0.12/kernel/blk_drv | Hard disk driver |
| ramdisk.c | linux-0.12/kernel/blk_drv | RAM (virtual) disk driver |
| floppy.c | linux-0.12/kernel/blk_drv | Floppy disk driver |

Major device numbers in the Linux 0.11 kernel

| Major number | Type | Description | Request function |
| --- | --- | --- | --- |
| 0 | none | none | NULL |
| 1 | block/character | ram: virtual disk and memory devices | do_rd_request() |
| 2 | block | fd: floppy drive | do_fd_request() |
| 3 | block | hd: hard disk | do_hd_request() |
| 4 | character | ttyx devices | NULL |
| 5 | character | tty device | NULL |
| 6 | character | lp: printer | NULL |

The Linux 0.11 kernel mainly supports hard disks, floppy disks, and virtual disks.

9.1 General Functions

Reads and writes of data on hard disk and floppy disk devices are carried out through interrupt handlers. The kernel reads and writes data in logical blocks (1 KB), while the block device controller transfers data in sectors (512 bytes). During processing, a queue of read/write request items is used to buffer, in order, operations on several logical blocks at a time.
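As a rough illustration of the block and sector sizes above, the sketch below converts a logical block number into the range of sectors that holds it. The names `block_to_sectors` and `struct sector_range` are made up for this example and do not appear in the kernel.

```c
#include <assert.h>
#include <stddef.h>

enum {
    SECTOR_SIZE = 512,                              /* bytes per sector */
    BLOCK_SIZE = 1024,                              /* bytes per logical block */
    SECTORS_PER_BLOCK = BLOCK_SIZE / SECTOR_SIZE    /* = 2 */
};

struct sector_range {        /* hypothetical helper type */
    unsigned long first;     /* first sector of the block */
    unsigned long count;     /* number of sectors in the block */
};

/* Fill *out with the sectors that make up logical block `block`. */
static void block_to_sectors(unsigned long block, struct sector_range *out)
{
    assert(out != NULL);
    out->first = block * SECTORS_PER_BLOCK;
    out->count = SECTORS_PER_BLOCK;
}
```

For block 5, for instance, this yields sectors 10 and 11.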

When a process needs to read a logical block from the hard disk, it first asks the buffer management code for it. The buffer manager checks whether the block is already in the buffer cache. If it is, the corresponding buffer head pointer is returned to the process directly. If it is not, the buffer manager calls the low-level block read/write function ll_rw_block(), which sends a read request to the block device driver.

9.1.1 Block Device Managed data structures — Block device requests and request queues

The kernel manages various block devices using a block device table, with each block device having an entry in the block device table.

struct blk_dev_struct {
	void (*request_fn)(void);	// The function pointer to the request item operation
	struct request * current_request;	// Current request item pointer
};
extern struct blk_dev_struct blk_dev[NR_BLK_DEV];	// Block device table (array) NR_BLK_DEV=7
  • The first field is a function pointer to the routine that handles request items for the corresponding block device, for example do_hd_request() for the hard disk driver.
  • The second field is a pointer to the request item that the block device is currently processing.

When the kernel issues a read/write request or another operation on a block device, ll_rw_block() builds a request item from the command given in its parameters and the device number stored in the buffer head. The request item is then inserted into the request queue of that device using the elevator scheduling algorithm, and the device's request function do_XX_request() carries out the actual operation.

struct request {
	int dev;		// Device number in use (-1 indicates idle)
	int cmd;		// Command (read or write)
	int errors;		// The number of errors generated during the operation
	unsigned long sector;	// Start sector (1 block =2 sectors)
	unsigned long nr_sectors;	// Number of read/write sectors
	char * buffer;	// Data buffer
	struct task_struct * waiting;	// Where the task waits for the operation to complete
	struct buffer_head * bh;	// Buffer header pointer
	struct request * next;	// points to the next request
};
extern struct request request[NR_REQUEST];	// Array of requests (NR_REQUEST=32)

The current request pointer of each block device, together with the linked list of that device's request items inside the request array, forms the request queue of the device. The request items are chained through their next pointers. There are only 32 request items in total, and all block devices share them.

The purpose of combining an array with linked lists:

  1. The array makes searching for a free request item efficient.
  2. The linked list supports the insertions required by the elevator algorithm.

The search for a free item on behalf of a write request is limited to the first 2/3 of the request array; the remaining third is reserved for read requests.
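The 2/3 rule can be sketched as follows. This is a simplified model, not the kernel code: `struct slot` stands in for `struct request`, and `find_free_slot` is a hypothetical helper that returns an index instead of sleeping when nothing is free.

```c
#include <assert.h>
#include <stddef.h>

#define NR_REQUEST 32

struct slot {      /* simplified stand-in for struct request */
    int dev;       /* a negative value marks the slot as free */
};

/*
 * Return the index of a free slot, or -1 if none is available.
 * Reads may use the whole array; writes only the first 2/3 of it.
 */
static int find_free_slot(const struct slot *table, int is_write)
{
    int i = is_write ? (NR_REQUEST * 2) / 3 : NR_REQUEST;

    assert(table != NULL);
    while (--i >= 0)           /* search from back to front */
        if (table[i].dev < 0)
            return i;
    return -1;
}
```

With 32 slots, a write can only take indexes 0 to 20, so a burst of writes can never occupy the last 11 slots that reads may still use.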

9.1.2 Block Device Access Scheduling Process

Before commands are sent to the disk controller, the system sorts the pending read/write requests by sector (this is the job of the I/O scheduler), so that requests are served in sector order rather than in the order in which they were received.

Elevator algorithm: the head keeps moving in one direction, either inward toward the center of the disk or outward toward the edge, serving the requests it passes along the way.

Write disk operation

When writing data, the CPU sends the write command through hd_out() and then waits for DRQ, the data-request flag in the controller status register, to be set, which signals that data may be written. The CPU then sends one sector of data to the controller. When the controller has written the data to the drive (or an error has occurred), it raises an interrupt, and the preset C function write_intr() runs to check whether more data remains to be written. If so, the next sector is sent to the controller, and the cycle repeats on the next interrupt.

Once all data has been written to the drive, the C function performs the post-write processing: it wakes up the process waiting for the data of this request, wakes up any process waiting for a free request item, releases the current request item and removes it from the linked list, and unlocks the associated buffer. Finally it calls the request function again to start the next read/write disk request.

Read disk operation

When reading data, the CPU sends the command parameters to the controller. The controller reads the requested data from the drive into its own buffer and then raises an interrupt. The preset C function read_intr() copies the data from the controller buffer into the system buffer, and the CPU then goes back to waiting for the next interrupt.

In more detail, the preset C function first copies one sector from the controller buffer into the system buffer, advances the current position in the system buffer, and decrements the number of sectors still to be read. If data remains to be read, it continues to wait for the next interrupt from the controller.

For the virtual disk, the read and write operations of the current request are carried out entirely inside do_rd_request(), because no synchronization with an external device is needed.
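The per-interrupt bookkeeping of the read path can be modelled in a few lines. The sketch below assumes a made-up `struct xfer` holding only the fields that change, and a hypothetical `on_sector_ready()` that plays the role of the interrupt-time C function; a plain `memcpy` replaces the port read from the controller.

```c
#include <assert.h>
#include <string.h>

#define SECTOR_SIZE 512

struct xfer {                    /* hypothetical subset of a request item */
    unsigned char *buffer;       /* current position in the system buffer */
    unsigned long sector;        /* next sector to transfer */
    unsigned long nr_sectors;    /* sectors still to be read */
};

/*
 * Handle one simulated "sector ready" interrupt.
 * Returns 1 when the whole request is complete, 0 if more interrupts are expected.
 */
static int on_sector_ready(struct xfer *x, const unsigned char *controller_data)
{
    assert(x->nr_sectors > 0);
    memcpy(x->buffer, controller_data, SECTOR_SIZE);  /* controller buffer -> system buffer */
    x->buffer += SECTOR_SIZE;                         /* advance the position */
    x->sector++;
    return --x->nr_sectors == 0;                      /* done when nothing is left */
}
```

A 1 KB block therefore takes two such calls: the first returns 0 and the second returns 1, at which point the real driver would finish the request and start the next one.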

Scheduling process between systems:

9.2 Hard disk Initialization Program setup.s

linux/boot/setup.S

setup.S is an operating system loader. It uses BIOS interrupts to read machine system data and saves the data starting at address 0x90000. The memory addresses 0x90080 and 0x90090 hold the parameter tables of the two hard disks; if there is no second hard disk, the second table is cleared to zero.
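The addresses above follow from real-mode segment:offset arithmetic (segment shifted left by 4 bits, plus offset). A minimal sketch, where `real_mode_linear` and `check_table_addresses` are names invented for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Linear address of a real-mode segment:offset pair. */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* The two table locations mentioned in the text, with INITSEG = 0x9000. */
static void check_table_addresses(void)
{
    assert(real_mode_linear(0x9000, 0x0080) == 0x90080);  /* first disk table */
    assert(real_mode_linear(0x9000, 0x0090) == 0x90090);  /* second disk table */
}
```

Each table is 16 bytes long, which is why the second one starts exactly 0x10 bytes after the first.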

INITSEG = DEF_INITSEG	! 0x9000
! Get the parameter table of the first hard disk
	mov	ax,#0x0000
	mov	ds,ax
	lds	si,[4*0x41]	! load the vector at 0x41 (address of the table) into ds:si
	mov	ax,#INITSEG
	mov	es,ax
	mov	di,#0x0080	! destination address 0x9000:0x0080 -> es:di
	mov	cx,#0x10	! the table is 16 bytes long
	rep
	movsb

! Get the parameter table of the second hard disk
	mov	ax,#0x0000
	mov	ds,ax
	lds	si,[4*0x46]	! load the vector at 0x46 into ds:si
	mov	ax,#INITSEG
	mov	es,ax
	mov	di,#0x0090	! destination address 0x9000:0x0090 -> es:di
	mov	cx,#0x10
	rep
	movsb

! Check whether the second hard disk exists
	mov	ax,#0x01500	! function number ah = 0x15
	mov	dl,#0x81	! drive number
	int	0x13
	jc	no_disk1
	cmp	ah,#3
	je	is_disk1
no_disk1:
	mov	ax,#INITSEG
	mov	es,ax
	mov	di,#0x0090
	mov	cx,#0x10
	mov	ax,#0x00
	rep
	stosb
is_disk1:

Code flow:

  1. Copy the parameter table of the first hard disk into memory. Interrupt vector 0x41 holds the address (4 bytes) of the first hard disk parameter table, and the table is copied from the BIOS to memory address 0x90080. Line 5 loads the address of the table into ds:si. Lines 6 to 12 set the destination address es:di = 0x9000:0x0080 and the number of bytes to transfer (16), then perform the copy.
  2. Copy the parameter table of the second hard disk into memory, as shown in lines 14 through 22. The address of this table is stored in interrupt vector 0x46.
  3. Check whether a second hard disk exists and, if not, clear the second table, as shown in lines 25 to 39. Lines 25 to 30 determine whether the drive is a hard disk: lines 25 and 26 pass the function number ah = 0x15 and the drive number dl = 0x81 to select the second hard disk, and INT 0x13 is then called to get the disk type. Line 29 checks whether the type code returned in AH is 3 (hard disk). On return from INT 0x13, CF = 1 means the operation failed and AH holds a status code; otherwise AH holds the type code, where 00h means that no drive is installed.

The BIOS disk type function is invoked through INT 0x13:

  • Parameter 1: ah (function number) = 0x15, get disk type;
  • Parameter 2: dl (drive number): 0x80 is the first hard disk, 0x81 the second.

Return value in ah (type code), valid when CF is clear: 00 means no such drive; 01 is a floppy drive without change-line support; 02 is a floppy drive (or other removable device) with change-line support; 03 is a hard disk.
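The decision that setup.S makes after the BIOS call can be written out in C for clarity. This is only a model of the test on CF and AH; the enum names and `second_disk_present()` are assumptions made for this example, and no BIOS call is performed.

```c
#include <assert.h>

enum drive_type {                  /* type codes returned in AH */
    DRIVE_NONE = 0,                /* no such drive */
    DRIVE_FLOPPY_NO_CHANGE = 1,    /* floppy without change-line support */
    DRIVE_FLOPPY_CHANGE = 2,       /* floppy or removable with change-line support */
    DRIVE_HARD_DISK = 3            /* hard disk */
};

/*
 * carry: value of CF after INT 0x13 (non-zero means the call failed).
 * ah:    value of AH after the call.
 * Returns 1 if the second parameter table should be kept, 0 if it should be cleared.
 */
static int second_disk_present(int carry, unsigned char ah)
{
    return !carry && ah == DRIVE_HARD_DISK;
}

/* A few representative outcomes. */
static void check_disk_detection(void)
{
    assert(second_disk_present(0, DRIVE_HARD_DISK) == 1);
    assert(second_disk_present(0, DRIVE_NONE) == 0);
    assert(second_disk_present(1, DRIVE_HARD_DISK) == 0);
}
```

Any result other than "no error and type 3" falls through to no_disk1, where the 16 bytes at 0x90090 are zeroed.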

9.3 Interface program ll_rw_blk.c

graph TD
A[blk-dev-init] --> B[ll-rw-block]
A --> C[ll-rw-page]
B --> D[make-request]
C --> E[add-request]
D --> E

9.3.1 Function Description

==Creates a block device read/write request and inserts it into the request queue of the specified device.==

This program performs the low-level read/write operations on block devices. It is the interface between all block devices (hard disks, floppy disks, and RAM virtual disks) and the rest of the system. Other programs in the system read and write block device data asynchronously by calling its low-level block read/write function ll_rw_block(). The actual reads and writes are performed by the device's request handler request_fn() (do_hd_request() for hard disks, do_fd_request() for floppy disks, do_rd_request() for virtual disks).

ll_rw_block() creates a request item for a block device and decides whether the device is idle by testing whether the device's current request pointer is null. If the device is idle, the new request item becomes the current request and request_fn() is called directly to process it. Otherwise, the elevator scheduling algorithm is used to insert the new request item into the device's request linked list, where it waits to be processed. When request_fn() finishes processing a request item, it removes that item from the list. Each request item is processed in an interrupt-driven manner.

9.3.2

9.3.3 Block Device initialization function

linux/kernel/blk_drv/ll_rw_blk.c/blk_dev_init()

This function is called from main.c. It initializes the request array **request[NR_REQUEST]**, marking every request item as idle (dev = -1).

struct request request[NR_REQUEST];	// Queue NR_REQUEST=32
void blk_dev_init(void)	// Initializes the request item
{
	int i;
	for (i=0 ; i<NR_REQUEST ; i++) {
		request[i].dev = -1;	// Mark the request item as idle
		request[i].next = NULL;
	}
}
// Structure of a request item in the request queue (blk.h)
struct request {
	int dev;		/* -1 if no request */
	int cmd;		/* READ or WRITE */ // READ(0) WRITE(1)
	int errors; // The number of times an error occurred during the operation
	unsigned long sector;   // Start sector
	unsigned long nr_sectors;   // Number of read/write sectors
	char * buffer;  // Data buffer
    struct task_struct * waiting;   // The queue in which a task waits for a request to complete an operation
	struct buffer_head * bh;    // Buffer header pointer
	struct request * next;  // points to the next request
};

9.3.4 Interface functions between block Device drivers and the system

linux/kernel/blk_drv/ll_rw_blk.c/ll_rw_block()

Low-level block read/write function. It is usually called from fs/buffer.c. Its main job is to create a block device read/write request item and insert it into the request queue of the specified block device. The actual reads and writes are done by the device's request_fn().

void ll_rw_block(int rw, struct buffer_head * bh)	// The interface function checks the arguments by calling make_request
{
	unsigned int major;	// Major device number
	if ((major=MAJOR(bh->b_dev)) >= NR_BLK_DEV ||	// NR_BLK_DEV=7
	!(blk_dev[major].request_fn)) {	// Reject an out-of-range major number or a device with no request function
		printk("Trying to read nonexistent block-device\n\r");
		return;
	}
	make_request(major,rw,bh);
}

Function inputs: rw is one of READ, READA, WRITE, or WRITEA; bh is a pointer to the buffer head of the data block.

Lines 4~5 check whether the major device number is out of range or the device has no request function. If either is true, an error message is printed and the function returns; otherwise make_request() is called to create a request item and insert it into the request queue.

// Buffer head structure, defined in the file system header fs.h
struct buffer_head {
	char * b_data;			/* pointer to data block (1024 bytes) */
	unsigned long b_blocknr;	/* block number */
	unsigned short b_dev;		/* device (0 = free) */	// Device number of the data source
	unsigned char b_uptodate;	// Update flag: indicates whether data has been updated
	unsigned char b_dirt;		/* 0-clean,1-dirty */	// Change flags: 0: unmodified, 1: modified
	unsigned char b_count;		/* users using this block */	// The number of users used
	unsigned char b_lock;		/* 0 - ok, 1 -locked */	// Whether the buffer is locked
	struct task_struct * b_wait;	// points to the task waiting for the buffer to unlock
	struct buffer_head * b_prev;	// Previous block in the hash queue (these four pointers are used for buffer management)
	struct buffer_head * b_next;	// The next block of the hash queue
	struct buffer_head * b_prev_free;	// Free table forward
	struct buffer_head * b_next_free;	// Free table next block
};
// Block device array
struct blk_dev_struct blk_dev[NR_BLK_DEV] = {
	{ NULL, NULL },		/* no_dev */
	{ NULL, NULL },		/* dev mem */
	{ NULL, NULL },		/* dev fd */
	{ NULL, NULL },		/* dev hd */
	{ NULL, NULL },		/* dev ttyx */
	{ NULL, NULL },		/* dev tty */
	{ NULL, NULL }		/* dev lp */
};
struct task_struct {	// in the scheduler header file sched.h
/* these are hardcoded - don't touch */
	long state;	/* -1 unrunnable, 0 runnable, >0 stopped */
	long counter;
	long priority;
	long signal;
	struct sigaction sigaction[32];
	long blocked;	/* bitmap of masked signals */
/* various fields */
	int exit_code;
	unsigned long start_code,end_code,end_data,brk,start_stack;
	long pid,pgrp,session,leader;
	int	groups[NGROUPS];
	/* * pointers to parent process, youngest child, younger sibling, * older sibling, respectively. (p->father can be replaced with * p->p_pptr->pid) */
	struct task_struct	*p_pptr, *p_cptr, *p_ysptr, *p_osptr;
	unsigned short uid,euid,suid;
	unsigned short gid,egid,sgid;
	unsigned long timeout,alarm;
	long utime,stime,cutime,cstime,start_time;
	struct rlimit rlim[RLIM_NLIMITS]; 
	unsigned int flags;	/* per process flags, defined below */
	unsigned short used_math;
/* file system info */
	int tty;		/* -1 if no tty, so it must be signed */
	unsigned short umask;
	struct m_inode * pwd;
	struct m_inode * root;
	struct m_inode * executable;
	struct m_inode * library;
	unsigned long close_on_exec;
	struct file * filp[NR_OPEN];
/* ldt for this task 0 - zero 1 - cs 2 - ds&ss */
	struct desc_struct ldt[3];
/* tss for this task */
	struct tss_struct tss;
};
struct blk_dev_struct {
	void (*request_fn)(void); 
	struct request * current_request; 
};

9.3.5 Low-level Page read and write functions

linux/kernel/blk_drv/ll_rw_blk.c/ ll_rw_page()

Access block device data in 4K pages, reading and writing 8 sectors at a time.

After the request item has been set up, add_request() is called to add it to the request queue, and the scheduler is then called directly so that the current process sleeps until the page has been transferred to or from the swap device.

struct task_struct * wait_for_request = NULL;	// Tasks sleep on this queue while the request array has no free item
void ll_rw_page(int rw, int dev, int page, char * buffer)
{
	struct request * req;
	unsigned int major = MAJOR(dev);
	// Check the parameters
    // Check whether the main device number and the request operation function of the device exist
	if (major >= NR_BLK_DEV || !(blk_dev[major].request_fn)) {
		printk("Trying to read nonexistent block-device\n\r");
		return;
	}
    // Check whether the parameter command is READ or WRITE
	if (rw!=READ && rw!=WRITE)
		panic("Bad block dev command, must be R/W");
	// Create a request item
repeat:
	req = request+NR_REQUEST;	// Point to the end of the queue
	while (--req >= request)
		if (req->dev<0)	// Indicates that the item is free
			break;
	if (req < request) {
		sleep_on(&wait_for_request);	// Sleep and check the request queue later
		goto repeat;
	}
// Enter the request information into the idle request item and queue it
/* fill up the request-info, and add it to the queue */
	req->dev = dev;	// Device number
	req->cmd = rw;	// Command (READ/WRITE)
	req->errors = 0;	// Count of read/write errors
	req->sector = page<<3;	// Start read sector swap_nr
	req->nr_sectors = 8;	// Number of read/write sectors 8
	req->buffer = buffer;	// Data buffer
	req->waiting = current;	// The current process enters the request waiting queue
	req->bh = NULL;	// Unbuffered bulk pointer (no caching)
	req->next = NULL;	// Next request item pointer
	current->state = TASK_UNINTERRUPTIBLE;	// Set it to the uninterruptible state
	add_request(major+blk_dev,req);	// Queue the request item
	schedule();	// Transferring 8 sectors to or from the swap device takes a long time, so the current process is put to sleep
}

Function execution flow:

  1. Check the parameters. In lines 8~14, if the major device number is out of range or the device has no request function, an error message is printed and the function returns. If the command is neither READ nor WRITE, the kernel panics.
  2. Create the request item. Lines 16 to 24 search the request array from back to front for a free item. If there is none, the process sleeps and searches again after it is woken.
  3. Fill in the free request item. Lines 27-35 fill in the request information, line 36 puts the current process into the uninterruptible sleep state, and line 37 adds the request item to the queue. Because transferring 8 sectors to or from the swap device takes a long time, the scheduler is then called so that the current process sleeps while it waits.
struct task_struct {
/* these are hardcoded - don't touch */
	long state;	/* -1 unrunnable, 0 runnable, >0 stopped */
	long counter;
	long priority;
	long signal;
	struct sigaction sigaction[32];
	long blocked;	/* bitmap of masked signals */
/* various fields */
	int exit_code;
	unsigned long start_code,end_code,end_data,brk,start_stack;
	long pid,pgrp,session,leader;
	int	groups[NGROUPS];
	/* * pointers to parent process, youngest child, younger sibling, * older sibling, respectively. (p->father can be replaced with * p->p_pptr->pid) */
	struct task_struct	*p_pptr, *p_cptr, *p_ysptr, *p_osptr;
	unsigned short uid,euid,suid;
	unsigned short gid,egid,sgid;
	unsigned long timeout,alarm;
	long utime,stime,cutime,cstime,start_time;
	struct rlimit rlim[RLIM_NLIMITS]; 
	unsigned int flags;	/* per process flags, defined below */
	unsigned short used_math;
/* file system info */
	int tty;		/* -1 if no tty, so it must be signed */
	unsigned short umask;
	struct m_inode * pwd;
	struct m_inode * root;
	struct m_inode * executable;
	struct m_inode * library;
	unsigned long close_on_exec;
	struct file * filp[NR_OPEN];
/* ldt for this task 0 - zero 1 - cs 2 - ds&ss */
	struct desc_struct ldt[3];
/* tss for this task */
	struct tss_struct tss;
};
void sleep_on(struct task_struct **p)
{
	__sleep_on(p,TASK_UNINTERRUPTIBLE);
}

9.3.6 Two operations on buffer blocks

linux/kernel/blk_drv/ll_rw_blk.c

The two operations on the buffer block are to lock the specified buffer block and to release the locked buffer.

// Lock the specified buffer block
static inline void lock_buffer(struct buffer_head * bh)
{
	cli();	// Disable interrupts
	while (bh->b_lock)	// If the buffer is locked, sleep until the buffer is unlocked
		sleep_on(&bh->b_wait);
	bh->b_lock=1;	// Lock the buffer immediately
	sti();	// Enable interrupts
}
// Release the locked buffer
static inline void unlock_buffer(struct buffer_head * bh)
{
	if (!bh->b_lock)	// The buffer is not locked
		printk("ll_rw_block.c: buffer not locked\n\r");
	bh->b_lock = 0;	// Unlock
	wake_up(&bh->b_wait);	// Wake up the tasks waiting for the buffer
}

9.3.7 Creating a Request and Inserting it into the Request Queue

linux/kernel/blk_drv/ll_rw_blk.c

add_request(): adds the prepared request item req to the request linked list of the specified device. If the device's current request pointer is null, req becomes the current request and the device's request handler is called immediately; otherwise req is inserted into the request linked list.

make_request(): creates the request item.

// Add the request item to the list
static void add_request(struct blk_dev_struct * dev, struct request * req) // dev: specifies the block device structure pointer. Req: indicates a request item that has been set
{
	struct request * tmp;
	req->next = NULL;	// Empty the pointer to the next request in the request
	cli();	// Disable interrupts
	if (req->bh)
		req->bh->b_dirt = 0;	// Clear the buffer "dirty" flag
	if (!(tmp = dev->current_request)) {	// The device has no current request, so this is the first one
		dev->current_request = req;	// Point the device's current request pointer at this request item
		sti();	// Enable interrupts
		(dev->request_fn)();	// For the hard disk this is do_hd_request()
		return;
	}
	for ( ; tmp->next ; tmp=tmp->next) {	// The device is busy: walk the request list to find where req belongs (elevator scheduling algorithm)
		if (!req->bh)
			if (tmp->next->bh)
				break;  // Page requests take precedence
			else
				continue;
		if ((IN_ORDER(tmp,req) ||
		    !IN_ORDER(tmp,tmp->next)) &&
		    IN_ORDER(req,tmp->next))
			break;
		/* i.e. break when (tmp < req || !(tmp < tmp->next)) && req < tmp->next, with "<" meaning IN_ORDER */
	}
	req->next=tmp->next;
	tmp->next=req;
	sti();
}
// Create the request item and insert it into the request queue
static void make_request(int major,int rw, struct buffer_head * bh) // major: major device number; rw: command; bh: buffer head that holds the data
{
	struct request * req;
	int rw_ahead;

/* WRITEA/READA is special case - it is not really needed, so if the */
/* buffer is locked, we just forget about it, else it's a normal read */
	if (rw_ahead = (rw == READA || rw == WRITEA)) {	// rw_ahead flags a read-ahead/write-ahead command
		if (bh->b_lock)	// The buffer is locked (in use), so the read-ahead/write-ahead request is dropped
			return;
		if (rw == READA)
			rw = READ;
		else
			rw = WRITE;
	}
	if (rw!=READ && rw!=WRITE)	// The command is neither READ nor WRITE
		panic("Bad block dev command, must be R/W/RA/WA");
	lock_buffer(bh);
	if ((rw == WRITE && !bh->b_dirt) || (rw == READ && bh->b_uptodate)) {	// Nothing to do: writing an unmodified buffer, or reading a buffer that is already up to date
		unlock_buffer(bh);
		return;
	}
repeat:
/* we don't allow the write-requests to fill up the queue completely:
 * we want some room for reads: they take precedence. The last third
 * of the requests are only for reads.
 */
	if (rw == READ)
		req = request+NR_REQUEST;	// Reads search from the end of the array (struct request request[NR_REQUEST] in ll_rw_blk.c)
	else
		req = request+((NR_REQUEST*2)/3);	// Writes start searching at 2/3 of the array
/* find an empty request */
	while (--req >= request)	// Search for an empty request
		if (req->dev<0)
			break;
/* if none found, sleep on new requests: check for rw_ahead */
	if (req < request) {	// No free request item was found
		if (rw_ahead) {	// A read-ahead/write-ahead request simply gives up
			unlock_buffer(bh);
			return;
		}
		sleep_on(&wait_for_request);	// Sleep until a request item is freed
		goto repeat;
	}
/* fill up the request-info, and add it to the queue */
	req->dev = bh->b_dev;	// Device number
	req->cmd = rw;	// Command
	req->errors=0;	// Number of errors during the operation
	req->sector = bh->b_blocknr<<1;	// Start sector: block number converted to sector number (1 block = 2 sectors)
	req->nr_sectors = 2;	// Number of sectors to read or write for this request
	req->buffer = bh->b_data;	// Data buffer to be read into or written from
	req->waiting = NULL;	// Task waiting for the operation to complete (none)
	req->bh = bh;	// Buffer head pointer
	req->next = NULL;	// Next request item
	add_request(major+blk_dev,req);	// Queue the request item (equivalent to &blk_dev[major], req)
}

Add a request item to the list — add_request()

Arguments: dev is a pointer to the block device structure; req is a pointer to a request item that has already been filled in.

Function execution flow:

  1. Prepare req. As shown in lines 5 to 8, the next pointer of the request item is set to null, interrupts are disabled, and the dirty flag of the buffer associated with the request item is cleared.
  2. If the device is idle. As shown in lines 9 through 14, the device's current_request field is checked to see whether the device has any request. If there is none, the device is idle: the device's current request pointer is pointed directly at the request item, and the device's request function is executed immediately.
  3. If the device is busy. As shown in lines 15 to 25, the elevator scheduling algorithm is used to find the best position in the request list for the new request item. If the buffer head pointer of the request being inserted is null, that is, it has no buffer block, the search looks for an existing item that does have a buffer block: as soon as the item after the current position has a non-null buffer head pointer, that position is chosen.
// blk.h
// Elevator algorithm ordering: read operations come before write operations
// s1 and s2 are pointers to request items. The order of two request items is decided by their cmd, dev, and sector fields
#define IN_ORDER(s1,s2) \
((s1)->cmd<(s2)->cmd || (s1)->cmd==(s2)->cmd && \
((s1)->dev < (s2)->dev || ((s1)->dev == (s2)->dev && \
(s1)->sector < (s2)->sector)))
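To see how the ordering rule and the insertion loop interact, the following sketch replays the busy-device branch of add_request() on a stripped-down request type. `struct req`, `in_order()` and `elevator_insert()` are illustrative names, interrupt handling is left out, and the special case for requests without a buffer head is not modelled.

```c
#include <assert.h>
#include <stddef.h>

struct req {                 /* hypothetical, trimmed-down request item */
    int cmd;                 /* 0 = READ, 1 = WRITE */
    int dev;
    unsigned long sector;
    struct req *next;
};

/* Same ordering as the IN_ORDER macro: cmd first, then dev, then sector. */
static int in_order(const struct req *a, const struct req *b)
{
    if (a->cmd != b->cmd)
        return a->cmd < b->cmd;
    if (a->dev != b->dev)
        return a->dev < b->dev;
    return a->sector < b->sector;
}

/* Insert r after head (the request being serviced), using the loop from add_request(). */
static void elevator_insert(struct req *head, struct req *r)
{
    struct req *tmp = head;

    assert(head != NULL && r != NULL);
    for (; tmp->next; tmp = tmp->next)
        if ((in_order(tmp, r) || !in_order(tmp, tmp->next)) &&
            in_order(r, tmp->next))
            break;
    r->next = tmp->next;
    tmp->next = r;
}
```

Starting from a current request at sector 10 and inserting reads for sectors 50, 30 and 5 in that order, the list ends up as 10, 30, 50, 5: requests ahead of the head are served in ascending order, and the one behind it waits for the next sweep.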

Create a request — make_request()

Parameters: major is the major device number; rw is the command; bh is a pointer to the buffer head that holds the data.

Function execution flow:

  1. Check the parameters. In lines 38 to 52, for the READA and WRITEA commands, the read-ahead/write-ahead request is dropped if the specified buffer is locked (in use); otherwise it is treated as a normal READ/WRITE command. rw_ahead is the flag that marks a read-ahead/write-ahead command.
  2. Set the starting position of the request item pointer. In lines 58 to 61, a read request searches from the end of the array, while a write request searches from 2/3 of the array toward its head for a free item.
  3. Look for a free request item. Lines 62 to 74 search for a free item. If none is found, a read-ahead/write-ahead request simply exits; any other request sleeps and then repeats the search.
  4. Fill in the free item and queue the request item, as shown in lines 76 through 85.
// fs.h
struct buffer_head {
	char * b_data;			/* pointer to data block (1024 bytes) */
	unsigned long b_blocknr;	/* block number */
	unsigned short b_dev;		/* device (0 = free) */
	unsigned char b_uptodate;
	unsigned char b_dirt;		/* 0-clean,1-dirty */
	unsigned char b_count;		/* users using this block */
	unsigned char b_lock;		/* 0 - ok, 1 -locked */
	struct task_struct * b_wait;
	struct buffer_head * b_prev;
	struct buffer_head * b_next;
	struct buffer_head * b_prev_free;
	struct buffer_head * b_next_free;
};

9.4 Block Device Header File blk.h

kernel/blk_drv/blk.h

Header file for block device parameters, covering devices such as the hard disk. It defines the data structure of a request item in the request queue, defines the elevator ordering rule as a macro, and, depending on the major device number, supplies the definitions for the three supported block devices: virtual disk, floppy disk, and hard disk.

#ifndef _BLK_H
#define _BLK_H
#define NR_BLK_DEV	7   // Number of block device types
#define NR_REQUEST	32  // The number of items contained in the request queue
// The structure of the request item in the request queue
struct request {
	int dev;		/* -1 if no request */
	int cmd;		/* READ or WRITE */ // READ(0) WRITE(1)
	int errors; // The number of times an error occurred during the operation
	unsigned long sector;   // Start sector
	unsigned long nr_sectors;   // Number of read/write sectors
	char * buffer;  // Data buffer
    struct task_struct * waiting;   // The queue in which a task waits for a request to complete an operation
	struct buffer_head * bh;    // Buffer header pointer
	struct request * next;  // points to the next request
};
// Elevator algorithm ordering: read operations always come before write operations
#define IN_ORDER(s1,s2) \
((s1)->cmd<(s2)->cmd || (s1)->cmd==(s2)->cmd && \
((s1)->dev < (s2)->dev || ((s1)->dev == (s2)->dev && \
(s1)->sector < (s2)->sector)))
// Block device processing structure
struct blk_dev_struct {
	void (*request_fn)(void);   // Request handler pointer
	struct request * current_request;   // The structure of the request currently being processed
};
// Block device table. Each block device occupies one item.
extern struct blk_dev_struct blk_dev[NR_BLK_DEV];
// Array of 32 request items
extern struct request request[NR_REQUEST];
// The process queue header pointer waiting for idle requests
extern struct task_struct * wait_for_request;
// Array of pointers to the block counts of each block device. Each entry points to the block-count array (for example hd_sizes[]) of the given major device number; each element of that array is the total number of data blocks of one sub-device (1 block = 1KB)
extern int * blk_size[NR_BLK_DEV];
// The main device number used by the program
#ifdef MAJOR_NR

#if (MAJOR_NR == 1)
/* ram disk */
#define DEVICE_NAME "ramdisk"   // Device name (" memory virtual disk ")
#define DEVICE_REQUEST do_rd_request    // Device request item handler
#define DEVICE_NR(device) ((device) & 7)    // Sub-device number (0 to 7)
#define DEVICE_ON(device)   // Start the device
#define DEVICE_OFF(device)  // Shut down the device

#elif (MAJOR_NR == 2)
/* floppy */
#define DEVICE_NAME "floppy"    // Device name (" floppy drive ")
#define DEVICE_INTR do_floppy   // Device interrupt handler function
#define DEVICE_REQUEST do_fd_request    // Device request item handler
#define DEVICE_NR(device) ((device) & 3)    // Subdevice number (0-3)
#define DEVICE_ON(device) floppy_on(DEVICE_NR(device))  // Open the device macro
#define DEVICE_OFF(device) floppy_off(DEVICE_NR(device))    // Close the device macro

#elif (MAJOR_NR == 3)
/* harddisk */
#define DEVICE_NAME "harddisk"  // Device name (" hard disk ")
#define DEVICE_INTR do_hd   // Device interrupt handler function
#define DEVICE_TIMEOUT hd_timeout   // Device timeout value
#define DEVICE_REQUEST do_hd_request    // Device request item handler
#define DEVICE_NR(device) (MINOR(device)/5) // Disk device number (0-1)
#define DEVICE_ON(device)   // The hard disk is always running when it is turned on
#define DEVICE_OFF(device)

#elif
/* unknown blk device */
#error "unknown blk device" // Error message "unknown block device" displayed during compilation preprocessing

#endif

#define CURRENT (blk_dev[MAJOR_NR].current_request) // Specifies the current request structure item pointer for the device number
#define CURRENT_DEV DEVICE_NR(CURRENT->dev) // Device id in the CURRENT request item

#ifdef DEVICE_INTR   // If a device interrupt symbol is defined, declare it as a function pointer initialized to NULL
void (*DEVICE_INTR)(void) = NULL;
#endif
#ifdef DEVICE_TIMEOUT   // If a device timeout symbol is defined, define a variable of that name set to 0, and make SET_INTR() also arm the timeout
int DEVICE_TIMEOUT = 0;
#define SET_INTR(x) (DEVICE_INTR = (x),DEVICE_TIMEOUT = 200)
#else
#define SET_INTR(x) (DEVICE_INTR = (x))
#endif
static void (DEVICE_REQUEST)(void); // Declare DEVICE_REQUEST as a static function that takes no arguments and returns nothing

// Unlocks the specified buffer block
extern inline void unlock_buffer(struct buffer_head * bh)
{
	if (!bh->b_lock) printk(DEVICE_NAME ": free buffer being unlocked\n");
	bh->b_lock=0;
	wake_up(&bh->b_wait);
}

// End-of-request handling function
extern inline void end_request(int uptodate)
{
	DEVICE_OFF(CURRENT->dev);   // Shut down the device
	if (CURRENT->bh) {  // CURRENT is the pointer to the current request item
		CURRENT->bh->b_uptodate = uptodate; // Set the update flag
		unlock_buffer(CURRENT->bh); // Unlock the buffer
	}
	if (!uptodate) {	// This request operation failed
		printk(DEVICE_NAME " I/O error\n\r");   // Display an I/O error message for the block device
		printk("dev %04x, block %d\n\r",CURRENT->dev,
			CURRENT->bh->b_blocknr);
	}
	wake_up(&CURRENT->waiting); // Wake up the process waiting for this request
	wake_up(&wait_for_request); // Wake up a process waiting for a free request item
	CURRENT->dev = -1;  // Release the request item
	CURRENT = CURRENT->next;    // Point to the next request item
}

#ifdef DEVICE_TIMEOUT   // Device timeout symbol constant
#define CLEAR_DEVICE_TIMEOUT DEVICE_TIMEOUT = 0;
#else
#define CLEAR_DEVICE_TIMEOUT
#endif

#ifdef DEVICE_INTR  // Device interrupt symbol constant
#define CLEAR_DEVICE_INTR DEVICE_INTR = 0;
#else
#define CLEAR_DEVICE_INTR
#endif
// Request initialization macro
/* This macro checks the validity of the current request item. If the device has no current request item, there is nothing to process, so the macro does a little cleanup and returns from the enclosing function. Otherwise, if the major device number in the current request differs from the major device number defined by the driver, the request queue is corrupted, so the kernel displays an error message and halts. Otherwise, if the buffer block used by the request is not locked, there is a bug in the kernel, which also displays an error message and halts. */
#define INIT_REQUEST \
repeat: \
	if (!CURRENT) {\
		CLEAR_DEVICE_INTR \
		CLEAR_DEVICE_TIMEOUT \
		return; \
	} \
	if (MAJOR(CURRENT->dev) != MAJOR_NR) \
		panic(DEVICE_NAME ": request list destroyed"); \
	if (CURRENT->bh) { \
		if (!CURRENT->bh->b_lock) \
			panic(DEVICE_NAME ": block not locked"); \
	}

#endif

#endif
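The comparison at the top of the blk.h listing decides which of two requests is served first. A minimal user-space sketch of that ordering follows. It assumes the macro is named IN_ORDER(s1,s2) (its first line lies outside this excerpt), that READ is 0 and WRITE is 1, and it uses a simplified stand-in structure, req_sketch, instead of the kernel's struct request; served_first() is a hypothetical helper.

```c
#define READ  0   /* assumption: same values as the kernel's READ/WRITE */
#define WRITE 1

/* Simplified stand-in for struct request (illustrative only). */
struct req_sketch {
	int cmd;               /* READ or WRITE */
	int dev;               /* device number */
	unsigned long sector;  /* starting sector */
};

/* Same comparison as the macro body in blk.h, with extra parentheses
 * to make the && / || precedence explicit. */
#define IN_ORDER(s1,s2) \
((s1)->cmd < (s2)->cmd || ((s1)->cmd == (s2)->cmd && \
((s1)->dev < (s2)->dev || ((s1)->dev == (s2)->dev && \
(s1)->sector < (s2)->sector))))

/* Returns 1 if request a sorts before request b, otherwise 0. */
int served_first(const struct req_sketch *a, const struct req_sketch *b)
{
	return IN_ORDER(a, b) ? 1 : 0;
}
```

Under these assumptions reads sort before writes, then lower device numbers come first, then lower sector numbers, which is the order an elevator-style queue wants.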

9.5 Hard Disk Controller Driver hd.c

The hd.c program is the hard disk controller driver. It provides read and write operations on the hard disk controller and the block device, as well as hard disk initialization.

Related program functions:

  • Initialize the hard disk: sys_setup()
    • Set the hard disk parameters.
    • Read the hard disk partition tables.
    • Load the root file system image on the boot disk into the memory virtual disk.
  • Set up the hard disk data structures: hd_init()
    • Set the hard disk controller interrupt descriptor.
    • Clear the hard disk interrupt mask bits in the interrupt controller.
  • Send a command to the hard disk controller: hd_out()
    • Pre-set the function pointer that the interrupt handler will call.
    • Send the command block to the controller. When the command completes, the controller sends an interrupt request signal to the CPU, which then runs the hard disk interrupt handler.
  • Process the current hard disk request: do_hd_request()
    • Check whether a current request exists.
    • Check the validity of the device number and the start sector of the request.
    • Calculate the sector number, head number, and cylinder number of the requested data from the request item.
    • If the reset flag is set, reset the hard disk.
    • If the recalibrate flag is set, pre-set the C function to be executed in the resulting interrupt (recal_intr()), send the recalibrate command to the controller, and exit.
    • If the current request is a write operation, first set the C function called by the interrupt handler to write_intr() and send the command parameter block of the write operation to the controller. Then poll the controller's status register until the data request flag (DRQ) is set. If the flag is set, the controller has "agreed" to accept data, and one sector of data is written to the controller's data buffer. If the flag is still not set when the polling loop times out, the operation has failed, and bad_rw_intr() is called. Based on the number of errors that have occurred for the current request item, it either gives up on the request or sets the reset flag so that the request is retried.
    • If the current request is a read operation, set the C function called by the interrupt handler to read_intr() and send the read command to the controller.
  • C functions called during hard disk interrupt processing: read_intr(), write_intr(), bad_rw_intr(), recal_intr()
    • The function called after a controller write operation completes: write_intr()
      • Call win_result() to read the status register and determine whether an error occurred.
      • If there is an error, call bad_rw_intr() to decide whether to abandon the request item and whether to set the reset flag.
      • If there is no error, check whether all the data of this request has been written. If data remains, call port_write() to write another sector of data to the controller buffer.
      • If all data has been written to disk, call end_request() to finish the request: wake up the process waiting for this request item, wake up the process waiting for a free request item, set the buffer's update flag, and release the request item (remove it from the block device's request list).
      • Call do_hd_request() again.
    • The function called after a controller read operation completes: read_intr()
      • Call win_result() to read the status register and determine whether an error occurred.
      • If there is an error, call bad_rw_intr() to decide whether to abandon the request item and whether to set the reset flag.
      • If there is no error, call port_read() to copy one sector of data from the controller buffer into the buffer specified by the request item.
      • If there is more data to read, exit and wait for the next interrupt.
      • Otherwise call end_request() to finish the request: wake up the process waiting for this request item, wake up the process waiting for a free request item, set the buffer's update flag, and release the request item (remove it from the block device's request list).
      • Call do_hd_request() again.
  • Auxiliary functions for hard disk controller operations: controller_ready(), drive_busy(), win_result(), hd_out(), reset_controller()
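The "pre-set function pointer" idea in the list above can be sketched in user space as follows. This is only a model of the pattern, not the driver code: the names ending in _sketch, sample_intr() and the two counters are hypothetical, and the real handler is the assembly routine in sys_call.s.

```c
#include <stddef.h>

typedef void (*hd_handler_t)(void);

hd_handler_t do_hd_sketch = NULL;  /* stands in for the kernel's do_hd */
int handled_count = 0;             /* interrupts served by the expected handler */
int unexpected_count = 0;          /* interrupts that arrived with no handler set */

void unexpected_hd_interrupt_sketch(void) { unexpected_count++; }

/* Example completion handler, playing the role of read_intr() or write_intr(). */
void sample_intr(void) { handled_count++; }

/* Plays the role of SET_INTR(x): arm the handler before issuing a command. */
void set_intr_sketch(hd_handler_t fn) { do_hd_sketch = fn; }

/* Plays the role of the interrupt entry: take the pointer, clear it,
 * and fall back to the "unexpected interrupt" handler if it was NULL. */
void hd_interrupt_sketch(void)
{
	hd_handler_t fn = do_hd_sketch;

	do_hd_sketch = NULL;
	if (!fn)
		fn = unexpected_hd_interrupt_sketch;
	fn();
}
```

Because the pointer is cleared on every interrupt, each command must arm its own handler again, which is why read_intr() and write_intr() call SET_INTR() before returning when more sectors are pending.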

[Figure: timing sequence for reading data from the hard disk]

[Figure: timing sequence for writing data to the hard disk]

Related functions in hd.c:

9.5.1

9.5.2 Initializing disks and Setting Data Structures for Disks

linux/kernel/blk_drv/hd.c/hd_init()

The disk system is initialized.

void hd_init(void)
{
	blk_dev[MAJOR_NR].request_fn = DEVICE_REQUEST;	// Set the request handler pointer to do_hd_request()
	set_intr_gate(0x2E,&hd_interrupt);	// Set the hard disk interrupt gate descriptor (int 0x2E) to point to hd_interrupt
	outb_p(inb_p(0x21)&0xfb,0x21);	// Clear mask bit 2 on the master 8259A so the slave chip can send interrupt request signals
	outb(inb_p(0xA1)&0xbf,0xA1);	// Clear the hard disk interrupt mask bit on the slave 8259A so the hard disk controller can send interrupt request signals
}

Function flow:

  1. Line 4 installs hd_interrupt (the interrupt handling routine at line 235 of kernel/sys_call.s) as the hard disk interrupt handler. The hard disk interrupt number is int 0x2E, which corresponds to interrupt request signal IRQ14 of the 8259A chips. The interrupt descriptor setup macro set_intr_gate() is implemented in include/asm/system.h.
  2. Line 5 clears the INT2 mask bit of the master 8259A, allowing interrupt request signals from the slave chip.
  3. Line 6 clears the hard disk interrupt request mask bit on the slave chip, allowing the hard disk controller to send interrupt request signals.
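The constants 0x2E, 0xfb and 0xbf used in these three steps can be derived with a little arithmetic. The sketch below is an illustration only; it assumes IRQ0 is mapped to interrupt vector 0x20, and the helper names irq_vector(), irq_unmask_value() and unmask() are hypothetical.

```c
#include <stdint.h>

#define IRQ_BASE_VECTOR 0x20  /* assumption: IRQ0 is mapped to int 0x20 */

/* Interrupt vector for an IRQ line (0..15). */
int irq_vector(int irq)
{
	return IRQ_BASE_VECTOR + irq;
}

/* AND-mask that clears (unmasks) one IRQ bit in an 8259A mask register.
 * IRQ 0..7 live on the master chip, IRQ 8..15 on the slave chip, so only
 * the low three bits of the IRQ number select the bit. */
uint8_t irq_unmask_value(int irq)
{
	return (uint8_t)~(1u << (irq & 7));
}

/* Apply the unmask to the current value of a mask register. */
uint8_t unmask(uint8_t current_mask, int irq)
{
	return current_mask & irq_unmask_value(irq);
}
```

With these helpers, IRQ14 gives vector 0x2E and slave mask 0xbf, and the cascade line IRQ2 gives master mask 0xfb, matching the values in hd_init().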
// blk.h: block device processing structure
struct blk_dev_struct {
	void (*request_fn)(void);   // Request handler pointer
	struct request * current_request;   // The request currently being processed
};
// system.h
#define set_intr_gate(n,addr) \
	_set_gate(&idt[n],14,0,addr)
#define _set_gate(gate_addr,type,dpl,addr) \
__asm__ ("movw %%dx,%%ax\n\t" \
	"movw %0,%%dx\n\t" \
	"movl %%eax,%1\n\t" \
	"movl %%edx,%2" \
	: \
	: "i" ((short) (0x8000+(dpl<<13)+(type<<8))), \
	"o" (*((char *) (gate_addr))), \
	"o" (*(4+(char *) (gate_addr))), \
	"d" ((char *) (addr)),"a" (0x00080000))
// io.h
#define outb(value,port) \
__asm__ ("outb %%al,%%dx": :"a" (value),"d" (port))
#define outb_p(value,port) \
__asm__ ("outb %%al,%%dx\n" \
		"\tjmp 1f\n" \
		"1:\tjmp 1f\n" \
		"1:": :"a" (value),"d" (port))

linux/kernel/blk_drv/hd.c/sys_setup()

System setup function. Its main job is to read the hard disk parameter table and the CMOS hard disk information, use them to set up the hard disk partition structure hd[], and then try to load the RAM virtual disk and the root file system.

Hard disk Parameter Table
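As an illustration of the table layout, the sketch below decodes one 16-byte hard disk parameter table entry using the same byte offsets that sys_setup() reads (0, 2, 5, 8, 12 and 14). The structure hd_params_sketch and the helpers le16(), decode_hd_table() and count_drives() are hypothetical names for this example; only the offsets come from the kernel code.

```c
#include <stdint.h>

/* Illustrative copy of the fields that sys_setup() stores in hd_info[]. */
struct hd_params_sketch {
	uint16_t cyl;    /* offset 0:  number of cylinders */
	uint8_t  head;   /* offset 2:  number of heads */
	uint16_t wpcom;  /* offset 5:  write pre-compensation cylinder */
	uint8_t  ctl;    /* offset 8:  control byte */
	uint16_t lzone;  /* offset 12: landing-zone cylinder */
	uint8_t  sect;   /* offset 14: sectors per track */
};

/* Read a little-endian 16-bit value byte by byte (no unaligned access). */
static uint16_t le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode one 16-byte table entry. */
struct hd_params_sketch decode_hd_table(const uint8_t *entry)
{
	struct hd_params_sketch h;

	h.cyl   = le16(entry + 0);
	h.head  = entry[2];
	h.wpcom = le16(entry + 5);
	h.ctl   = entry[8];
	h.lzone = le16(entry + 12);
	h.sect  = entry[14];
	return h;
}

/* Two consecutive entries: an all-zero second entry means one drive only. */
int count_drives(const uint8_t *table)
{
	return decode_hd_table(table + 16).cyl ? 2 : 1;
}
```

The kernel reads the same fields through casted pointers; the byte-wise reads here are just a portable way to show the layout.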

[Flowchart: sys_setup() steps: 1. hd_info; 2. hd[0]; ...; 5. count the total number of data blocks in each partition; 6. load the root file system image into the memory virtual disk; ...]
#define MAX_HD		2	// Maximum number of hard disks supported by the system
int sys_setup(void * BIOS)	// BIOS is set by the init subroutine in init/main.c to point to the hard disk parameter table
{
	static int callable = 1;	// Limits this function to a single call
	int i,drive;
	unsigned char cmos_disks;
	struct partition *p;
	struct buffer_head * bh;

	if (!callable)
		return -1;
	callable = 0;
#ifndef HD_TYPE	// If HD_TYPE is not defined (in include/linux/config.h), read the parameters from the BIOS table
	for (drive=0 ; drive<2 ; drive++) {
		hd_info[drive].cyl = *(unsigned short *) BIOS;	// Number of cylinders
		hd_info[drive].head = *(unsigned char *) (2+BIOS);	// Number of heads
		hd_info[drive].wpcom = *(unsigned short *) (5+BIOS);	// Write pre-compensation cylinder number
		hd_info[drive].ctl = *(unsigned char *) (8+BIOS);	// Control byte
		hd_info[drive].lzone = *(unsigned short *) (12+BIOS);	// Landing-zone cylinder number
		hd_info[drive].sect = *(unsigned char *) (14+BIOS);	// Number of sectors per track
		BIOS += 16;	// Each hard disk parameter table is 16 bytes long; point BIOS at the next table
	}
	if (hd_info[1].cyl)
		NR_HD=2;	// Set the number of hard disks to 2
	else
		NR_HD=1;
#endif
	for (i=0 ; i<NR_HD ; i++) {	// Set the hard disk partition structure. Items 0 and 5 describe the two whole disks
		hd[i*5].start_sect = 0;	// Start sector number of the hard disk
		hd[i*5].nr_sects = hd_info[i].head*
				hd_info[i].sect*hd_info[i].cyl;	// Total number of sectors on the hard disk
	}
	if ((cmos_disks = CMOS_READ(0x12)) & 0xf0)	// Read the disk type byte at CMOS offset 0x12. The high half byte holds the type of the first disk, the low half byte the type of the second
		if (cmos_disks & 0x0f)	// If the low half byte is not 0, the system has two hard disks
			NR_HD = 2;
		else
			NR_HD = 1;
	else
		NR_HD = 0;	// No AT-compatible hard disk
	for (i = NR_HD ; i < 2 ; i++) {	// Clear the data structures of the hard disks that are not present
		hd[i*5].start_sect = 0;
		hd[i*5].nr_sects = 0;
	}
	for (drive=0 ; drive<NR_HD ; drive++) {	// Read the partition table from the first sector of the first data block of each hard disk
		if (!(bh = bread(0x300 + drive*5,0))) {	// bread() reads a block: 0x300 and 0x305 are the device numbers of the two disks, 0 is the block number to read
			printk("Unable to read partition table of drive %d\n\r",
				drive);
			panic("");
		}
		if (bh->b_data[510] != 0x55 || (unsigned char)	// Check the last two bytes of the first sector of the hard disk
		    bh->b_data[511] != 0xAA) {
			printk("Bad partition table on drive %d\n\r",drive);
			panic("");
		}
		p = 0x1BE + (void *)bh->b_data;	// The partition table is at offset 0x1BE in the first sector of each disk
		for (i=1;i<5;i++,p++) {
			hd[i+5*drive].start_sect = p->start_sect;
			hd[i+5*drive].nr_sects = p->nr_sects;
		}
		brelse(bh);	// Release the buffer that held the disk block
	}
	for (i=0 ; i<5*MAX_HD ; i++)	// MAX_HD=2
		hd_sizes[i] = hd[i].nr_sects>>1 ;	// Number of 1 KB blocks per partition; hd.c declares: static int hd_sizes[5*MAX_HD] = {0, };
	blk_size[MAJOR_NR] = hd_sizes;	// Make this device's entry in the block-count pointer array point to that array
	if (NR_HD)
		printk("Partition table%s ok.\n\r",(NR_HD>1)?"s":"");
	rd_load();	// Load the root file system image from the boot disk into the memory virtual disk (ramdisk.c)
	init_swapping();	// Initialize the swap device (swap.c)
	mount_root();	// Mount the root file system (fs/super.c)
	return (0);
}

Input parameter: BIOS is set by the init subroutine in init/main.c to point to the hard disk parameter table structure, which holds the parameter tables of two hard disks copied from memory address 0x90080. Execution process:

  1. Set the callable flag. As shown in lines 9 through 11, this ensures the function can only be called once.

  2. Set the hard disk information array hd_info[]. If HD_TYPE is defined in include/linux/config.h, hd_info[] is already initialized. If not, fill in hd_info[] as shown in lines 13 through 27. When the BIOS hard disk parameters are read, all 16 bytes belonging to the second hard disk are zero if only one hard disk is installed. Lines 23 to 26 therefore check whether the cylinder count of the second hard disk is 0 to determine whether a second hard disk exists.

  3. Set the hard disk partition structure array hd[]. As shown in lines 28 through 32, items 0 and 5 of the array describe the two whole disks, and items 1-4 and 6-9 describe the four partitions of each disk.

  4. Determine the number of hard disks in the system. The type of the first drive is stored in the high half byte of CMOS byte 0x12, and the type of the second drive in the low half byte. Each 4-bit value is either a drive type or 0xF, which means that CMOS byte 0x19 holds the 8-bit type of drive 1 and CMOS byte 0x1A holds the type of drive 2.

    As shown in lines 33 to 43, the byte read with CMOS_READ determines the number of hard disks. If the hard disks are not compatible with the AT controller, the data structures of both hard disks are cleared. If there is only one hard disk, the parameters of the second hard disk are cleared.

  5. Read the partition table in the first sector of each hard disk and store it in the partition structure array hd[]. As shown in lines 44 to 61, bread() reads the first data block of each hard disk (the first argument is the device number of the disk, the second is the block number to read). If the read succeeds, the data is in the data area of the buffer block bh. If the returned buffer pointer is 0, the read failed, so an error message is displayed and the kernel halts. The sector is considered valid if its last two bytes are 0x55 and 0xAA (the word 0xAA55). If it is valid, the partition table information is stored in hd[]. Finally, the buffer bh is released.

  6. Count the total number of data blocks in each partition. As shown in lines 62 through 64, store the total number of data blocks of each partition in the array hd_sizes[], then make this device's entry in the block-count pointer array blk_size[] point to that array.

  7. Subsequent calls. As shown in lines 65 to 69, load the root file system image contained on the boot disk into the memory virtual disk, initialize the swap device, and mount the root file system.
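Steps 5 and 6 can be tried out in user space on a 512-byte buffer. The sketch below is not the kernel code: part_sketch, le32(), parse_partition_table() and sectors_to_blocks() are hypothetical names, and it assumes the on-disk entry layout of struct partition (16 bytes per entry, start_sect at byte 8, nr_sects at byte 12, both little-endian).

```c
#include <stdint.h>

#define PART_TABLE_OFFSET 0x1BE  /* partition table offset within the first sector */
#define PART_ENTRY_SIZE   16     /* size of one partition entry in bytes */

struct part_sketch {
	uint32_t start_sect;  /* first sector of the partition */
	uint32_t nr_sects;    /* number of sectors in the partition */
};

/* Read a little-endian 32-bit value byte by byte. */
static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill out[0..3] from the sector. Returns 0 on success,
 * or -1 if the 0x55 0xAA signature is missing. */
int parse_partition_table(const uint8_t *sector, struct part_sketch *out)
{
	int i;

	if (sector[510] != 0x55 || sector[511] != 0xAA)
		return -1;
	for (i = 0; i < 4; i++) {
		const uint8_t *e = sector + PART_TABLE_OFFSET + i * PART_ENTRY_SIZE;

		out[i].start_sect = le32(e + 8);
		out[i].nr_sects   = le32(e + 12);
	}
	return 0;
}

/* Size in 1 KB blocks (2 sectors per block), the unit used by hd_sizes[]. */
uint32_t sectors_to_blocks(uint32_t nr_sects)
{
	return nr_sects >> 1;
}
```

In the kernel the four entries land in hd[1..4] for the first disk and hd[6..9] for the second, which is why the loop in sys_setup() indexes hd[i+5*drive].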

// hdreg.h
struct partition {
	unsigned char boot_ind;		/* 0x80 - active (unused) */
	unsigned char head;		/* ? */
	unsigned char sector;		/* ? */
	unsigned char cyl;		/* ? */
	unsigned char sys_ind;		/* ? */
	unsigned char end_head;		/* ? */
	unsigned char end_sector;	/* ? */
	unsigned char end_cyl;		/* ? */
	unsigned int start_sect;	/* starting sector counting from 0 */
	unsigned int nr_sects;		/* nr of sectors in partition */
};
// buffer.c
struct buffer_head * bread(int dev,int block)
{
	struct buffer_head * bh;

	if (!(bh=getblk(dev,block)))
		panic("bread: getblk returned NULL\n");
	if (bh->b_uptodate)
		return bh;
	ll_rw_block(READ,bh);
	wait_on_buffer(bh);
	if (bh->b_uptodate)
		return bh;
	brelse(bh);
	return NULL;
}
// hd.c: macro that reads a CMOS parameter.
// 0x70 is the write (address) port and 0x80|addr is the CMOS memory address to read; 0x71 is the read (data) port.
#define CMOS_READ(addr) ({ \
outb_p(0x80|addr,0x70); \
inb_p(0x71); \
})
// io.h
#define outb(value,port) \
__asm__ ("outb %%al,%%dx": :"a" (value),"d" (port))	// Port output instruction
#define inb(port) ({ \
unsigned char _v; \
__asm__ volatile ("inb %%dx,%%al":"=a" (_v):"d" (port)); \
_v; \
})
#define outb_p(value,port) \
__asm__ ("outb %%al,%%dx\n" \
		"\tjmp 1f\n" \
		"1:\tjmp 1f\n" \
		"1:": :"a" (value),"d" (port))
#define inb_p(port) ({ \
unsigned char _v; \
__asm__ volatile ("inb %%dx,%%al\n" \
	"\tjmp 1f\n" \
	"1:\tjmp 1f\n" \
	"1:":"=a" (_v):"d" (port)); \
_v; \
})
// hd.c: hard disk partition structure. Gives the physical start sector number and the total number of sectors of each partition, counted from sector 0 of the hard disk. The items whose index is a multiple of 5 (hd[0] and hd[5]) describe an entire hard disk
static struct hd_struct {
	long start_sect;	// Physical start sector number
	long nr_sects;	// Total number of sectors in the partition
} hd[5*MAX_HD]={{0,0},};
// Hard disk interrupt handler code sys_call.s
_hd_interrupt:
	pushl %eax
	pushl %ecx
	pushl %edx
	push %ds
	push %es
	push %fs
	movl $0x10,%eax
	mov %ax,%ds
	mov %ax,%es
	movl $0x17,%eax
	mov %ax,%fs
	movb $0x20,%al
	outb %al,$0xA0		# EOI to interrupt controller #1
	jmp 1f			# give port chance to breathe
1:	jmp 1f
1:	xorl %edx,%edx
	movl %edx,_hd_timeout
	xchgl _do_hd,%edx
	testl %edx,%edx
	jne 1f
	movl $_unexpected_hd_interrupt,%edx
1:	outb %al,$0x20
	call *%edx		# "interesting" way of handling intr.
	pop %fs
	pop %es
	pop %ds
	popl %edx
	popl %ecx
	popl %eax
	iret
// ll_rw_blk.c
int * blk_size[NR_BLK_DEV] = { NULL, NULL, };
// hd.c
static int hd_sizes[5*MAX_HD] = {0, };

9.5.3 Sending Commands to the Disk Controller

linux/kernel/blk_drv/hd.c/hd_out()


Sends a command block to the disk controller.

static void hd_out(unsigned int drive,unsigned int nsect,unsigned int sect,
		unsigned int head,unsigned int cyl,unsigned int cmd,
		void (*intr_addr)(void))	// Arguments: drive number; number of sectors to read/write; start sector; head number; cylinder number; command code; C function to be called from the hard disk interrupt handler
{
	register int port asm("dx");	// Local register variable placed in register dx
	if (drive>1 || head>15)	// Validate the drive number and head number
		panic("Trying to write bad sector");
	if (!controller_ready())	// Wait for the controller to become ready
		panic("HD controller not ready");
	SET_INTR(intr_addr);	// do_hd = intr_addr: the C function that the hard disk interrupt handler will call
	outb_p(hd_info[drive].ctl,HD_CMD);	// Output the control byte to the control register (#define HD_CMD 0x3f6)
	port=HD_DATA;	// Start from the data register port (#define HD_DATA 0x1f0)
	outb_p(hd_info[drive].wpcom>>2,++port);	// Write pre-compensation cylinder number (divided by 4)
	outb_p(nsect,++port);	// Number of sectors to read/write
	outb_p(sect,++port);	// Start sector
	outb_p(cyl,++port);	// Low 8 bits of the cylinder number
	outb_p(cyl>>8,++port);	// High bits of the cylinder number
	outb_p(0xA0|(drive<<4)|head,++port);	// Drive number + head number
	outb(cmd,++port);	// Command code
}

Function execution flow:

  1. Check the validity of the parameters. As shown in lines 6 to 9, check that the drive number is 0 or 1 and that the head number is no greater than 15; the driver does not support other values and halts. Then call controller_ready() to wait for the controller to become ready. If it is still not ready after some time, an error has occurred and the kernel halts.
  2. Set the C function pointer do_hd that will be called when the hard disk interrupt occurs. See line 10 of the code.
  3. Send the control byte to the hard disk controller's control port, as shown in line 11, to establish the control mode for the specified disk.
  4. Send the 7-byte parameter command block to the controller ports, as shown in lines 13 to 19.
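To make step 4 concrete, the sketch below builds the seven port/value pairs that hd_out() emits instead of writing to real I/O ports. It is a user-space model: port_write_sketch and build_hd_command() are hypothetical names, and the wpcom value is passed in as a parameter rather than read from hd_info[].

```c
#include <stdint.h>

#define HD_DATA 0x1f0  /* data register port; the command block uses 0x1f1..0x1f7 */

/* One byte destined for one controller port (illustrative only). */
struct port_write_sketch {
	uint16_t port;
	uint8_t  value;
};

/* Fill out[0..6] with the command block in the order hd_out() sends it.
 * Returns 0 on success, or -1 if drive or head is out of range. */
int build_hd_command(unsigned drive, unsigned nsect, unsigned sect,
		unsigned head, unsigned cyl, unsigned cmd, unsigned wpcom,
		struct port_write_sketch *out)
{
	uint8_t v[7];
	int i;

	if (drive > 1 || head > 15)
		return -1;
	v[0] = (uint8_t)(wpcom >> 2);                    /* write pre-compensation / 4 */
	v[1] = (uint8_t)nsect;                           /* sector count */
	v[2] = (uint8_t)sect;                            /* start sector */
	v[3] = (uint8_t)cyl;                             /* cylinder, low 8 bits */
	v[4] = (uint8_t)(cyl >> 8);                      /* cylinder, high bits */
	v[5] = (uint8_t)(0xA0 | (drive << 4) | head);    /* drive number + head number */
	v[6] = (uint8_t)cmd;                             /* command code */
	for (i = 0; i < 7; i++) {
		out[i].port  = (uint16_t)(HD_DATA + 1 + i);
		out[i].value = v[i];
	}
	return 0;
}
```

The command byte goes out last because writing it to port 0x1f7 is what starts the operation; everything before it only loads parameters.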

9.5.4 Processing Disk Requests

linux/kernel/blk_drv/hd.c/do_hd_request()


Hard disk read/write request handling. Based on the device number and the start sector number in the current request, the function calculates the cylinder number, the sector number within the current track, and the head number on the disk. It then sends the read or write command given in the request to the hard disk controller.

void do_hd_request(void)
{
	int i,r;
	unsigned int block,dev;
	unsigned int sec,head,cyl;
	unsigned int nsect;
	INIT_REQUEST;	// Check the validity of the request
	dev = MINOR(CURRENT->dev);	// Get the device number from the request item
	block = CURRENT->sector;	// The starting sector number to request from the request item
	if (dev >= 5*NR_HD || block+2 > hd[dev].nr_sects) {
		end_request(0);
		goto repeat;	// blk.h INIT_REQUEST
	}
	block += hd[dev].start_sect;	// The absolute sector number of the disk to be read or written is block, plus the start sector number of the corresponding partition
	dev /= 5;	// Corresponds to the disk id
	__asm__("divl %4":"=a" (block),"=d" (sec):"0" (block),"1" (0),
		"r" (hd_info[dev].sect));
	__asm__("divl %4":"=a" (cyl),"=d" (head):"0" (block),"1" (0),
		"r" (hd_info[dev].head));
	sec++;	// Adjust the calculated current track sector number
	nsect = CURRENT->nr_sectors;	// The number of sectors to read/write
	if (reset) {	// static int reset = 0; Reset flag. It is set when a read/write error occurs, and causes the reset function to be called
		recalibrate = 1;	// Set the recalibrate flag
		reset_hd();	// Send the reset command to the disk controller, then the command that sets up the drive parameters
		return;
	}
	if (recalibrate) {	// static int recalibrate = 0; Recalibrate flag. When it is set, recal_intr() is installed and the head is moved to cylinder 0
		recalibrate = 0;
		hd_out(dev,hd_info[CURRENT_DEV].sect,0,0,0,
			WIN_RESTORE,&recal_intr);	// Perform a seek that moves the head from any position to cylinder 0
		return;
	}	
	if (CURRENT->cmd == WRITE) {
		hd_out(dev,nsect,sec,head,cyl,WIN_WRITE,&write_intr);	// Send the write command
		for(i=0 ; i<10000 && !(r=inb_p(HD_STATUS)&DRQ_STAT) ; i++)	// Poll the status register until DRQ is set
			/* nothing */ ;
		if (!r) {
			bad_rw_intr();	// The write operation failed
			goto repeat;
		}
		port_write(HD_DATA,CURRENT->buffer,256);	// Send 1 sector of data (256 words)
	} else if (CURRENT->cmd == READ) {
		hd_out(dev,nsect,sec,head,cyl,WIN_READ,&read_intr);	// Send the read command
	} else
		panic("unknown hd-command");
}

Function execution flow:

  1. Check parameter validity, as shown in lines 7 to 13. Line 7 checks the validity of the request item; if the request queue is empty, the function exits (see INIT_REQUEST in blk.h, which also defines the label repeat). Lines 8 to 9 take the sub-device number (which identifies a partition on the hard disk) from the device number, and the start sector number from the current request item. Lines 10 to 13 check whether the sub-device exists and whether the start sector number is greater than the partition's sector count minus 2. Because one block of data (2 sectors) is read or written at a time, the requested sector cannot be beyond the next-to-last sector of the partition. If the check fails, the request is ended and processing jumps back to repeat.

  2. Find the absolute sector number and the disk number. As shown in lines 14 to 15, add the start sector number of the partition that corresponds to the sub-device number, then divide the sub-device number by 5 to obtain the disk number.

  3. Solve for the sector number, cylinder number, and head number, as shown in lines 16 through 19.

    Absolute sector number / sectors per track = total track number (across all heads), remainder = sector number on the current track (sec).
    Total track number / number of heads = cylinder number (cyl), remainder = current head number (head).
    Lines 20 to 21 adjust the sector number on the track (sectors are numbered from 1) and fetch the number of sectors to read or write.

  4. Check the controller reset and recalibrate flags. As shown in lines 22 to 26, if the reset flag is set, set the recalibrate flag and reset the controller. As shown in lines 27 to 32, if the recalibrate flag is set, clear it and recalibrate the hard disk so that the head moves to cylinder 0.

  5. Send the I/O operation to the hard disk controller, as shown in lines 33 to 45. For a write request, send the write command to the controller, then read the status register repeatedly to see whether the data request flag (DRQ_STAT) is set. If it is not set, jump to the error handling. If data can be written, call port_write() to write 1 sector of data to the controller's data register port HD_DATA (line 41; 256 is the number of 16-bit words, that is, 512 bytes). For a read request, as shown on line 43, send the read sector command to the controller.
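The two divisions in step 3 are easier to follow in plain C than in the inline divl assembly. A minimal sketch, assuming the same arithmetic; chs_sketch and lba_to_chs() are hypothetical names introduced for this example.

```c
/* Result of converting an absolute sector number to cylinder/head/sector. */
struct chs_sketch {
	unsigned cyl;   /* cylinder number */
	unsigned head;  /* head number */
	unsigned sec;   /* sector number on the track, counted from 1 */
};

/* abs_sector: absolute sector number on the disk (partition start included).
 * sects_per_track and heads correspond to hd_info[].sect and hd_info[].head. */
struct chs_sketch lba_to_chs(unsigned abs_sector,
		unsigned sects_per_track, unsigned heads)
{
	struct chs_sketch r;
	unsigned track = abs_sector / sects_per_track;  /* first divl: quotient */

	r.sec  = abs_sector % sects_per_track + 1;      /* first divl: remainder, then sec++ */
	r.cyl  = track / heads;                         /* second divl: quotient */
	r.head = track % heads;                         /* second divl: remainder */
	return r;
}
```

For example, with 17 sectors per track and 4 heads, absolute sector 100 lies on track 5, which is cylinder 1, head 1, sector 16.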

// blk.h
#define INIT_REQUEST \
repeat: \
	if (!CURRENT) {\
		CLEAR_DEVICE_INTR \
		CLEAR_DEVICE_TIMEOUT \
		return; \
	} \
	if (MAJOR(CURRENT->dev) != MAJOR_NR) \
		panic(DEVICE_NAME ": request list destroyed"); \
	if (CURRENT->bh) { \
		if (!CURRENT->bh->b_lock) \
			panic(DEVICE_NAME ": block not locked"); \
	}
// blk.h
// End-of-request handling function
extern inline void end_request(int uptodate)
{
	DEVICE_OFF(CURRENT->dev);   // Shut down the device
	if (CURRENT->bh) {  // CURRENT is the CURRENT request structure item pointer
		CURRENT->bh->b_uptodate = uptodate; // set the update flag
		unlock_buffer(CURRENT->bh); // Unlock the buffer
	}
	if (!uptodate) {	// This request operation failed
		printk(DEVICE_NAME " I/O error\n\r");   // Displays an I/O error message about the block device
		printk("dev %04x, block %d\n\r",CURRENT->dev,
			CURRENT->bh->b_blocknr);
	}
	wake_up(&CURRENT->waiting); // Wake up the process waiting for the request
	wake_up(&wait_for_request); // Wake up a process waiting for an idle request
	CURRENT->dev = -1;  // Release the request item
	CURRENT = CURRENT->next;    // points to the next request item
}
// Port-read inline assembly macro: reads nr words (16 bits each) from port and stores them in buf
#define port_read(port,buf,nr) \
__asm__("cld;rep;insw"::"d" (port),"D" (buf),"c" (nr):"cx","di")
// Port-write inline assembly macro: writes nr words (16 bits each) to port, taking the data from buf
#define port_write(port,buf,nr) \
__asm__("cld;rep;outsw"::"d" (port),"S" (buf),"c" (nr):"cx","si")

9.5.5 C Functions Called During Disk Interrupts

linux/kernel/blk_drv/hd.c/bad_rw_intr()

Handles a failed hard disk read or write.

static void bad_rw_intr(void)
{
	if (++CURRENT->errors >= MAX_ERRORS) // MAX_ERRORS=7 hd.c
		end_request(0);	// Terminates the current request and wakes up the waiting process
	if (CURRENT->errors > MAX_ERRORS/2)
		reset = 1;	// Reset the hard disk controller
}

Function execution flow:

  1. If the number of errors for the current request is greater than or equal to 7 (MAX_ERRORS), end the current request and wake up the process waiting for it.
  2. If the number of errors is greater than MAX_ERRORS/2 (that is, more than 3), set the reset flag so that the hard disk controller is reset before the next attempt.
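The two thresholds can be checked with a small model. This sketch only mirrors the counting logic; err_action and on_rw_error() are hypothetical names, and unlike the kernel function it reports its decisions instead of calling end_request() or touching a global reset flag.

```c
#define MAX_ERRORS 7  /* same value as in hd.c */

/* What the error handler decided for this failure (illustrative only). */
struct err_action {
	int give_up;     /* 1 if the request should be ended with failure */
	int need_reset;  /* 1 if the controller should be reset */
};

/* Count one more error on a request and report the resulting decisions. */
struct err_action on_rw_error(int *errors)
{
	struct err_action a = { 0, 0 };

	if (++*errors >= MAX_ERRORS)
		a.give_up = 1;
	if (*errors > MAX_ERRORS / 2)  /* integer division: 7 / 2 == 3 */
		a.need_reset = 1;
	return a;
}
```

Stepping through it: errors 1 to 3 simply retry, errors 4 to 6 request a controller reset before retrying, and error 7 abandons the request.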

linux/kernel/blk_drv/hd.c/read_intr()

Read sector interrupt call function. This function is called during a disk interrupt that is caused when the disk read command ends. After the read command is executed, the disk controller will generate a disk interrupt request signal and execute the interrupt handler. At this point, the C function pointer do_hd called in the interrupt handler already points to read_intr(), so the function is executed after a sector read operation completes (or fails).

static void read_intr(void)
{
	if (win_result()) {	// The controller is busy, a read/write error occurred, or the command failed
		bad_rw_intr();	// Handle the disk read/write failure
		do_hd_request();	// Process the hard disk request again (reset or retry)
		return;
	}
	port_read(HD_DATA,CURRENT->buffer,256);	// Read one sector of data into the request's buffer
	CURRENT->errors = 0;	// Clear the error count
	CURRENT->buffer += 512;	// Advance the buffer pointer to the next free area
	CURRENT->sector++;	// Advance the start sector number by 1
	if (--CURRENT->nr_sectors) {	// If more sectors remain to be read
		SET_INTR(&read_intr);	// Set the interrupt function pointer to read_intr() again
		return;
	}
	end_request(1);
	do_hd_request();
}

Function execution flow:

  1. If the controller is busy, a read/write error occurred, or the command failed (lines 3 to 6), call bad_rw_intr() to handle the failure, then call do_hd_request() again so that the hard disk is reset or the request is retried.
  2. If the read succeeded (lines 8 to 15), read 1 sector of data from the data register port into the buffer of the request item and decrement the number of sectors that the request still needs to read. If the result is not 0, the request still has data to read, so set the interrupt function pointer do_hd to read_intr() again and return. The function is called again when the controller raises an interrupt after reading the next sector.
  3. As shown in lines 16 to 17, once all sectors of the request have been read, call end_request() to finish the request item, then call do_hd_request() again to process the other hard disk requests.
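The per-interrupt bookkeeping in step 2 can be modeled without hardware. In the sketch below each call stands for one successful read interrupt; read_req_sketch and read_one_sector() are hypothetical names, and memcpy() takes the place of port_read() pulling 256 words from the data port.

```c
#include <string.h>

#define SECTOR_SIZE 512

/* The request fields that read_intr() updates (illustrative only). */
struct read_req_sketch {
	unsigned long sector;      /* next sector to read */
	unsigned long nr_sectors;  /* sectors still to read */
	char *buffer;              /* where the next sector goes */
	int errors;                /* error count for this request */
};

/* Handle one successful read interrupt.
 * Returns 0 if more interrupts are expected, 1 if the request is complete. */
int read_one_sector(struct read_req_sketch *rq, const char *controller_data)
{
	memcpy(rq->buffer, controller_data, SECTOR_SIZE);  /* stands in for port_read() */
	rq->errors = 0;
	rq->buffer += SECTOR_SIZE;
	rq->sector++;
	if (--rq->nr_sectors)
		return 0;  /* the kernel re-arms read_intr() here and returns */
	return 1;          /* the kernel calls end_request(1) and do_hd_request() here */
}
```

A 1 KB block therefore takes two interrupts: the first returns 0 and leaves the buffer pointer halfway through the block, the second returns 1.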
// Inline-assembly macro: read nr words from port and store them in buf
#define port_read(port,buf,nr) \
__asm__("cld;rep;insw"::"d" (port),"D" (buf),"c" (nr):"cx","di")

linux/kernel/blk_drv/hd.c/write_intr()

Write-sector interrupt handler. This function is called during the hard disk interrupt raised when a sector write command completes. After the write command has executed, the disk controller raises an interrupt request and the interrupt handler runs. By then the C function pointer do_hd that the interrupt handler calls already points to write_intr(), so this function runs each time a sector write completes (or fails).

static void write_intr(void)
{
	if (win_result()) {	// The disk controller returned an error
		bad_rw_intr();	// Handle the failed read/write first
		do_hd_request();	// Issue the request again
		return;
	}
	if (--CURRENT->nr_sectors) {	// If there are more sectors to write
		CURRENT->sector++;	// Increment the start sector number of the current request
		CURRENT->buffer += 512;	// Advance the request buffer pointer
		SET_INTR(&write_intr);	// Point do_hd at write_intr()
		port_write(HD_DATA,CURRENT->buffer,256);	// Write 256 words (one sector) to the data register port
		return;
	}
	end_request(1);	// All sectors written: end the request
	do_hd_request();	// Process the next hard disk request
}
// Inline-assembly macro: write nr words to port, taking the data from buf
#define port_write(port,buf,nr) \
__asm__("cld;rep;outsw"::"d" (port),"S" (buf),"c" (nr):"cx","si")
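The write path differs from the read path in one respect that the listing makes visible: write_intr() sends the next sector, so the first sector must already have been sent when the write command was issued. The sketch below models that ordering in user space. All names (sim_request, sim_port_write, sim_start_write, sim_write_intr) are hypothetical, the drive is a byte array, and the assumption that the first sector is sent by the code that issues the command is stated here rather than shown in this section.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical stand-in for the kernel's request item. */
struct sim_request {
    unsigned long sector;      /* sector being written */
    unsigned long nr_sectors;  /* sectors not yet acknowledged */
    const uint8_t *buffer;     /* data for the current sector */
};

/* Models port_write(HD_DATA, buf, 256): 256 words = 512 bytes go to one sector. */
void sim_port_write(uint8_t *disk, unsigned long sector, const uint8_t *buf)
{
    memcpy(disk + sector * SECTOR_SIZE, buf, SECTOR_SIZE);
}

/* Assumed behaviour of the command issuer: send the first sector right away. */
void sim_start_write(const struct sim_request *req, uint8_t *disk)
{
    sim_port_write(disk, req->sector, req->buffer);
}

/*
 * Models one successful invocation of write_intr(): the previous sector has
 * been acknowledged, so send the next one if any remain.
 * Returns 1 if another interrupt is expected, 0 if the request is complete.
 */
int sim_write_intr(struct sim_request *req, uint8_t *disk)
{
    assert(req->nr_sectors > 0);
    if (--req->nr_sectors) {
        req->sector++;
        req->buffer += SECTOR_SIZE;
        sim_port_write(disk, req->sector, req->buffer);
        return 1;
    }
    return 0;
}
```

Writing three sectors takes one call to sim_start_write() followed by three simulated interrupts; the last interrupt only acknowledges the final sector.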

linux/kernel/blk_drv/hd.c/recal_intr()

Recalibration (reset) interrupt handler, called from the hard disk interrupt service routine. If the disk controller reports an error, the function first handles the failure with bad_rw_intr(); in either case it then calls do_hd_request() to continue processing.

static void recal_intr(void)
{
	if (win_result())	// The controller reported an error
		bad_rw_intr();	// Handle the failed operation
	do_hd_request();	// Continue with the request
}

9.5.6 Auxiliary Functions of disk Controller Operations

Function name | Role | Note
int controller_ready(void) | Polls in a loop until the hard disk controller is ready. | Returns the number of remaining retries
int win_result(void) | Checks the hard disk status after a command has executed by reading the command result from the status register. | 0: normal; 1: error
int drive_busy(void) | Waits until the hard disk is ready. | 0: ready; 1: the wait timed out
void reset_controller(void) | Resets (recalibrates) the hard disk controller. |
void reset_hd(void) | Resets the hard disk. |
void unexpected_hd_interrupt(void) | Default function called on an unexpected hard disk interrupt. |
void hd_times_out(void) | Hard disk operation timeout handler. |
graph LR
B[win-result]
C[drive-busy]
D[reset-controller]
E[reset-hd]
F[unexpected-hd-interrupt]
G[hd-times-out]

D --> C

E -- reset=1 --> D
E -- reset=0 --> B
B --> 1(bad-rw-intr)
E -- i --> 2(hd-out)
E --> 3(do-hd-request)

F --> 3(do-hd-request)

G --> 3(do-hd-request)

linux/kernel/blk_drv/hd.c/controller_ready()

Poll in a loop until the hard disk controller is ready.

static int controller_ready(void)
{
	int retries = 100000;	// Loop wait times
	while (--retries && (inb_p(HD_STATUS)&0xc0)!=0x40);	// Wait for drive ready (bit 6 set) and controller not busy (bit 7 clear); HD_STATUS is the controller status register port
	return (retries);	// A nonzero return value means the controller became idle within the wait period
}

(inb_p(HD_STATUS)&0xc0)!=0x40: reads the hard disk controller status register port HD_STATUS and tests whether the drive ready bit (bit 6) is 1 and the controller busy bit (bit 7) is 0; the loop continues while this is not the case. If the return value retries is 0, the wait for the controller to become idle has timed out, which is an error. A nonzero value means the controller became idle within the wait period.
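The bit test and the countdown can be checked without hardware. This is a rough user-space model under stated assumptions: status_controller_ready() and sim_controller_ready() are hypothetical names, and an array of sample bytes stands in for successive reads of HD_STATUS (the last sample repeats once the array is exhausted).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True when bit 7 (busy) is 0 and bit 6 (ready) is 1, i.e. (status & 0xc0) == 0x40. */
int status_controller_ready(uint8_t status)
{
    return (status & 0xc0) == 0x40;
}

/*
 * Models the polling loop of controller_ready().
 * Returns the remaining retry count; 0 means the wait timed out.
 */
int sim_controller_ready(const uint8_t *samples, size_t n, int retries)
{
    size_t k = 0;

    assert(n > 0);
    while (--retries) {               /* count down first, as the original does */
        uint8_t status = samples[k];  /* stands in for inb_p(HD_STATUS) */
        if (k + 1 < n)
            k++;
        if (status_controller_ready(status))
            break;
    }
    return retries;
}
```

With the samples 0x80, 0x80, 0x50 and 10 retries, the third read succeeds and 7 retries remain; a controller that stays at 0x80 runs the counter down to 0.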

linux/kernel/blk_drv/hd.c/win_result()

Checks the hard disk status after a command has executed by reading the command result from the status register. Returns 0 if the result is normal and 1 on error.

static int win_result(void)
{
	int i=inb_p(HD_STATUS);	// Read the status information
	if ((i & (BUSY_STAT | READY_STAT | WRERR_STAT | SEEK_STAT | ERR_STAT))	// Mask: 1111 0001
		== (READY_STAT | SEEK_STAT))	// Expected: 0101 0000
		return(0); /* ok */
	if (i&1) i=inb(HD_ERROR);	// If ERR_STAT is set, read the error register (#define HD_ERROR 0x1f1)
	return (1);
}
// hdreg.h: bits of the HD_STATUS register
#define ERR_STAT	0x01	// Command execution error
#define INDEX_STAT	0x02	// Index pulse received
#define ECC_STAT	0x04	/* Corrected error */ // ECC check error
#define DRQ_STAT	0x08	// Data request service
#define SEEK_STAT	0x10	// Seek complete
#define WRERR_STAT	0x20	// Drive (write) fault
#define READY_STAT	0x40	// The drive is ready
#define BUSY_STAT	0x80	// Controller is busy
// HD_ERROR register values
// Value | After the controller diagnostic command | After other commands
// 0x01  | No error                                | Data mark lost
// 0x02  | Controller error                        | Track 0 error
// 0x03  | Sector buffer error                     |
// 0x04  | ECC component error                     | Command aborted
// 0x05  | Control processor error                 |
// 0x10  |                                         | ID not found
// 0x40  |                                         | ECC error
// 0x80  |                                         | Bad sector
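The mask arithmetic in win_result() is easy to verify on sample status bytes. The sketch below is a user-space model, not the kernel function: sim_win_result() is a hypothetical name, and its status and error_reg parameters stand in for the values that the port reads of HD_STATUS and HD_ERROR would return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_STAT   0x01
#define SEEK_STAT  0x10
#define WRERR_STAT 0x20
#define READY_STAT 0x40
#define BUSY_STAT  0x80

/*
 * Model of win_result(). Returns 0 when the command succeeded, 1 otherwise.
 * When the ERR bit is set, the error register value is stored in *error_out
 * (the original reads the register and discards the value).
 */
int sim_win_result(uint8_t status, uint8_t error_reg, uint8_t *error_out)
{
    const uint8_t mask = BUSY_STAT | READY_STAT | WRERR_STAT | SEEK_STAT | ERR_STAT; /* 0xF1 */

    assert(error_out != NULL);
    if ((status & mask) == (READY_STAT | SEEK_STAT))   /* 0x50 */
        return 0;
    if (status & ERR_STAT)
        *error_out = error_reg;
    return 1;
}
```

Because the DRQ, ECC and index bits are outside the mask, a status of 0x58 (ready, seek complete, data request) still counts as success, while 0x51 (error bit set) and 0xD0 (busy) do not.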

linux/kernel/blk_drv/hd.c/drive_busy()

Wait until the hard disk is ready.

static int drive_busy(void)
{
	unsigned int i;
	unsigned char c;
	for (i = 0; i < 50000; i++) {
		c = inb_p(HD_STATUS);	// Read the controller's main status byte
		c &= (BUSY_STAT | READY_STAT | SEEK_STAT);	// Keep only the busy, ready, and seek-complete bits
		if (c == (READY_STAT | SEEK_STAT))	// Ready and seek complete are set and busy is clear: the hard disk is ready
			return 0;
	}
	printk("HD controller times out\n\r");	// The wait timed out: display a warning
	return(1);
}

Function execution flow: the loop repeatedly reads the controller's main status register HD_STATUS and examines the busy, ready, and seek-complete bits. If the ready and seek-complete bits are 1 and the busy bit is 0, the hard disk is ready and 0 is returned. If this never happens before the loop ends, the wait has timed out: a warning message is displayed and 1 is returned.
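drive_busy() looks at fewer bits than win_result(): the error and write-fault bits are not part of its mask. A short user-space model makes the difference concrete. The names status_drive_ready() and sim_drive_busy() are hypothetical, and the sample array stands in for successive reads of HD_STATUS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEEK_STAT  0x10
#define READY_STAT 0x40
#define BUSY_STAT  0x80

/* True when busy is clear and both ready and seek complete are set. */
int status_drive_ready(uint8_t status)
{
    return (status & (BUSY_STAT | READY_STAT | SEEK_STAT)) == (READY_STAT | SEEK_STAT);
}

/*
 * Models drive_busy(): polls at most max_polls times.
 * Returns 0 when the drive became ready, 1 when the wait timed out.
 */
int sim_drive_busy(const uint8_t *samples, size_t n, unsigned int max_polls)
{
    unsigned int i;

    assert(n > 0);
    for (i = 0; i < max_polls; i++) {
        uint8_t c = samples[i < n ? i : n - 1];   /* last sample repeats */
        if (status_drive_ready(c))
            return 0;
    }
    return 1;
}
```

A status of 0x51 (ready, seek complete, error bit set) passes this test even though it would count as a failed command in the win_result() check.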

linux/kernel/blk_drv/hd.c/reset_controller()

graph LR
D[reset-controller] --> C[drive-busy]

Reset (recalibrate) the hard disk controller.

static void reset_controller(void)
{
	int	i;
	outb(4,HD_CMD);	// Send the reset (4) control byte to the control register port HD_CMD (#define HD_CMD 0x3f6)
	for(i = 0; i < 1000; i++) nop();	// Wait a while
	outb(hd_info[0].ctl & 0x0f ,HD_CMD);	// Send the normal control byte (retries and rereads allowed)
	if (drive_busy())	// Wait for the hard disk to become ready
		printk("HD-controller still busy\n\r");	// Timed out waiting for ready
	if ((i = inb(HD_ERROR)) != 1)	// Read the error register; a value of 1 means no error
		printk("HD-controller reset failed: %02x\n\r",i);
}

Code flow:

  1. Line 4 sends the reset control byte to the control register port, and the empty loop on line 5 waits for a while so that the controller has time to reset.
  2. Line 6 sends the normal control byte to the control register port, allowing retries and rereads.
  3. Lines 7 to 10 wait for the hard disk to become ready; on timeout, a "still busy" warning is displayed. The error register is then read, and if its value is not 1 a controller reset failure message is displayed.

linux/kernel/blk_drv/hd.c/reset_hd()

graph LR
E[reset-hd] -- reset=1 --> D[reset-controller]
E -- reset=0 --> B[win-result]
B --> 1[bad-rw-intr]
E -- i --> 2[hd-out]
2 --> 3[do-hd-request]

Resets the hard disk: the controller is reset first, and then the drive parameters are set up for each hard disk in turn.

static void reset_hd(void)
{
	static int i;
repeat:
	if (reset) {
		reset = 0;
		i = -1;
		reset_controller();	// Reset the hard disk controller
	} else if (win_result()) {	// Check whether the previous command executed properly
		bad_rw_intr();	// Count the error and decide whether to set the reset flag again
		if (reset)
			goto repeat;
	}
	i++;
	if (i < NR_HD) {
		hd_out(i,hd_info[i].sect,hd_info[i].sect,hd_info[i].head-1,
			hd_info[i].cyl,WIN_SPECIFY,&reset_hd);	// Send the "set up drive parameters" command for drive i
	} else
		do_hd_request();	// All drives are set up: start processing requests
}

Function execution flow:

  1. Reset the hard disk controller. If the reset flag is set (lines 5 to 8), the flag is cleared and the hard disk controller is reset. The function then sends the “set up drive parameters” command to the controller for hard disk i (lines 16 to 17). When the command completes, the controller raises a hard disk interrupt and this function is called again; this repeats until the command has been executed for all NR_HD hard disks in the system. Finally do_hd_request() is called to start processing request items (line 19).
  2. When the function is called again, reset is 0, so the code on line 9 checks whether the previous command executed normally. If an error occurred, bad_rw_intr() is called to count the error and decide whether to set the reset flag again. If the flag has been set, execution jumps to repeat and the function starts over.
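Because reset_hd() re-enters itself through the interrupt, it behaves like a small state machine driven by the static index i and the reset flag. The sketch below replays that state machine in user space and records what each call would do. Everything in it is hypothetical: the names, the event log, NR_HD fixed at 2, and the failed_needs_reset parameter, which collapses "win_result() failed and bad_rw_intr() set the reset flag" into a single input.

```c
#include <assert.h>

#define NR_HD   2      /* hypothetical: two drives installed */
#define LOG_MAX 16

enum sim_event { EV_RESET_CONTROLLER, EV_SPECIFY, EV_DO_REQUEST };

struct sim_state {
    int reset;                   /* models the global reset flag */
    int i;                       /* models the function's static drive index */
    int n;                       /* number of recorded events */
    enum sim_event ev[LOG_MAX];  /* what happened, in order */
    int drive[LOG_MAX];          /* drive number for EV_SPECIFY, else -1 */
};

static void record(struct sim_state *s, enum sim_event ev, int drive)
{
    assert(s->n < LOG_MAX);
    s->ev[s->n] = ev;
    s->drive[s->n] = drive;
    s->n++;
}

/*
 * Models one call of reset_hd().
 * Returns 1 while another interrupt (and another call) is expected,
 * 0 once request processing has been started.
 */
int sim_reset_hd(struct sim_state *s, int failed_needs_reset)
{
repeat:
    if (s->reset) {
        s->reset = 0;
        s->i = -1;
        record(s, EV_RESET_CONTROLLER, -1);
    } else if (failed_needs_reset) {
        failed_needs_reset = 0;
        s->reset = 1;            /* assumed outcome of bad_rw_intr() */
        goto repeat;
    }
    s->i++;
    if (s->i < NR_HD) {
        record(s, EV_SPECIFY, s->i);   /* hd_out(..., WIN_SPECIFY, &reset_hd) */
        return 1;
    }
    record(s, EV_DO_REQUEST, -1);
    return 0;
}
```

Starting with the reset flag set, three calls produce the sequence: reset controller, set up drive 0, set up drive 1, start request processing. A failure reported on the second call restarts the sequence from the controller reset.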

linux/kernel/blk_drv/hd.c/unexpected_hd_interrupt()

graph LR
F[unexpected-hd-interrupt] --> 1[do-hd-request]

Default handler for unexpected hard disk interrupts. The interrupt handler calls this C function when a hard disk interrupt arrives while the function pointer do_hd is NULL. It sets the reset flag and calls the request function do_hd_request(), which then performs the reset.

void unexpected_hd_interrupt(void)	// Called by the interrupt handler when an unexpected hard disk interrupt occurs
{
	printk("Unexpected HD interrupt\n\r");
	reset = 1;
	do_hd_request();
}

linux/kernel/blk_drv/hd.c/hd_times_out()

graph LR
G[hd-times-out] --> 1[do-hd-request]

Hard disk operation timeout handler, called from do_timer(). It sets the reset flag and calls do_hd_request() to perform the reset.

void hd_times_out(void)
{
	if (!CURRENT)	// No request is being processed
		return;
	printk("HD timeout");
	if (++CURRENT->errors >= MAX_ERRORS)
		end_request(0);	// The request has failed too many times: end it with failure
	SET_INTR(NULL);	// Clear the C function pointer do_hd called during the interrupt
	reset = 1;	// Set the reset flag
	do_hd_request();	// Perform the reset
}

Function execution flow:

  1. Check whether a request is currently being processed (lines 3 to 4). If there is no request, there is nothing to time out and the function returns.
  2. Check whether the number of errors for the current request has reached the limit MAX_ERRORS (lines 6 to 7). If so, end the request with failure.
  3. Set the C function pointer do_hd called during the interrupt to NULL, set the reset flag, and call do_hd_request() to perform the reset.
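The error-counting behaviour of the timeout path can be replayed in user space. This is only a sketch: the structures and the function sim_hd_times_out() are hypothetical, the request queue is reduced to a single entry, and the value 7 for MAX_ERRORS is an assumption that should be checked against hd.c.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ERRORS 7   /* assumption: the limit defined in hd.c */

struct sim_request {
    int errors;   /* error count of the request */
    int failed;   /* set when the request is ended with failure */
};

struct sim_hd {
    struct sim_request *current;  /* models CURRENT */
    int intr_armed;               /* models do_hd != NULL */
    int reset;                    /* models the reset flag */
    int restarts;                 /* counts calls to do_hd_request() */
};

/* Models hd_times_out(). Returns 0 if there was no request, 1 otherwise. */
int sim_hd_times_out(struct sim_hd *hd)
{
    assert(hd != NULL);
    if (!hd->current)
        return 0;
    if (++hd->current->errors >= MAX_ERRORS) {
        hd->current->failed = 1;   /* end_request(0) */
        hd->current = NULL;        /* simplification: single-entry queue */
    }
    hd->intr_armed = 0;            /* SET_INTR(NULL) */
    hd->reset = 1;
    hd->restarts++;                /* do_hd_request() would run here */
    return 1;
}
```

Under these assumptions a request survives six timeouts, is ended with failure on the seventh, and any later timeout finds no request and returns immediately.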

9.6 Memory Virtual Disk Driver ramdisk.c

linux/kernel/blk_drv/ramdisk.c

This file is the driver for the in-memory virtual disk (ramdisk), a device that uses physical memory to emulate a physical disk. Advantage: disk reads and writes become much faster. Disadvantage: all data on the virtual disk is lost when the system crashes or is powered off.

The symbol RAMDISK is defined in linux/Makefile and gives the size of the virtual disk in KB; if RAMDISK is 512, the virtual disk is 512KB. During kernel initialization (init/main.c) a designated area of physical memory is reserved to hold the virtual disk data. The area lies between the kernel's buffer cache and the main memory area, as shown in the figure.

In principle, read and write operations on the virtual disk are handled like those on any other block device. Because no synchronization with an external controller or device is needed, data is transferred between the system and the device simply by copying memory blocks.

ramdisk.c contains three functions.

  • rd_init(): called by init/main.c to determine the location and size of the virtual disk in physical memory.
  • do_rd_request(): the request handling function of the virtual disk device.
  • rd_load(): loads the root file system onto the virtual disk. It checks whether a root file system image starts at block 256 of the boot disk by reading the superblock from block 257. If a valid superblock is found, it copies the root file system image to the virtual disk in memory and sets the root device number ROOT_DEV to the virtual disk device, so that a boot disk with an integrated root file system can boot the system all the way to the shell prompt. Otherwise the function simply returns.
#include <string.h>
#include <linux/config.h>
#include <linux/sched.h>
#include <linux/fs.h>
#include <linux/kernel.h>
#include <asm/system.h>
#include <asm/segment.h>
#include <asm/memory.h>
#define MAJOR_NR 1
#include "blk.h"
char	*rd_start;	// Start address of the virtual disk in memory
int	rd_length = 0;	// Memory size of the virtual disk
void do_rd_request(void)	// Called to process the current request after ll_rw_block() has created a virtual disk request and added it to the rd request list
{
	int	len;
	char	*addr;
	INIT_REQUEST;
	addr = rd_start + (CURRENT->sector << 9);	// Memory address of the start sector; CURRENT is blk_dev[MAJOR_NR].current_request
	len = CURRENT->nr_sectors << 9;	// Length of the request in bytes
	if ((MINOR(CURRENT->dev) != 1) || (addr+len > rd_start+rd_length)) {	// Minor device number is not 1, or the request runs past the end of the virtual disk: end the request
		end_request(0);
		goto repeat;
	}
	if (CURRENT-> cmd == WRITE) {	// Write: copy len bytes from the request buffer to addr
		(void ) memcpy(addr,
			      CURRENT->buffer,
			      len);
	} else if (CURRENT->cmd == READ) {
		(void) memcpy(CURRENT->buffer, 
			      addr,
			      len);
	} else
		panic("unknown ramdisk-command");
	end_request(1);
	goto repeat;
}
long rd_init(long mem_start, int length)	// length = RAMDISK*1024: memory size in bytes required by the virtual disk
{
	int	i;
	char	*cp;
	blk_dev[MAJOR_NR].request_fn = DEVICE_REQUEST;	// do_rd_request()
	rd_start = (char *) mem_start;	// Start address of the virtual disk in physical memory; 4MB on a system with 16MB of memory
	rd_length = length;	// Length of the virtual disk area in bytes
	cp = rd_start;	// Clear the entire virtual disk area
	for (i=0; i < length; i++)
		*cp++ = '\0';
	return(length); // Memory reserved for the virtual disk
}
void rd_load(void)	// Try to load the root file system to the virtual disk
{
	struct buffer_head *bh;	// Buffer head pointer
	struct super_block	s;	// File system superblock structure
	int		block = 256;	// Starting disk block number of the root file system image
	int		i = 1;	// Progress counter of loaded blocks
	int		nblocks;	// Total number of file system disk blocks
	char		*cp;		/* Move pointer */
	if (!rd_length)
		return;
	printk("Ram disk: %d bytes, starting at 0x%x\n", rd_length,
		(int) rd_start);
	if (MAJOR(ROOT_DEV) != 2)	// Return unless the root device is a floppy disk
		return;
	bh = breada(ROOT_DEV,block+1,block,block+2,-1);	// Read block 257 (the superblock), with blocks 256 and 258 read ahead
	if (!bh) {
		printk("Disk error while looking for ramdisk!\n");
		return;
	}
	*((struct d_super_block *) &s) = *((struct d_super_block *) bh->b_data);
	brelse(bh);
	if (s.s_magic != SUPER_MAGIC)	// Not a MINIX file system
		return;
	nblocks = s.s_nzones << s.s_log_zone_size;
	if (nblocks > (rd_length >> BLOCK_SIZE_BITS)) {	// The total number of data blocks in the file system is greater than the number of blocks that can be held by the virtual disk in memory
		printk("Ram disk image too big! (%d blocks, %d avail)\n", 
			nblocks, rd_length >> BLOCK_SIZE_BITS);
		return;
	}
	printk("Loading %d bytes into ram disk... 0000k", 
		nblocks << BLOCK_SIZE_BITS);
	cp = rd_start;
	while (nblocks) {	// Copy the root file system image from the disk to the virtual disk, block by block
		if (nblocks > 2)
			bh = breada(ROOT_DEV, block, block+1, block+2, -1);
		else
			bh = bread(ROOT_DEV, block);
		if (!bh) {
			printk("I/O error on block %d, aborting load\n",
				block);
			return;
		}
		(void) memcpy(cp, bh->b_data, BLOCK_SIZE);
		brelse(bh);
		printk("\010\010\010\010\010%4dk",i);
		cp += BLOCK_SIZE;
		block++;
		nblocks--;
		i++;
	}
	printk("\010\010\010\010\010done \n");
	ROOT_DEV=0x0101;	// Change the root file system device number to that of the virtual disk (0x0101)
}

9.6.1 Processing the Current Request of the Memory Virtual Disk

linux/kernel/blk_drv/ramdisk.c/do_rd_request()

Handles the current request item of the virtual disk. After the low-level block device interface function ll_rw_block() has created a virtual disk request and added it to the rd request list, this function is called to process the current rd request. Function execution flow:

  1. Verify the validity of the request (lines 17 to 23). Lines 18 and 19 compute addr, the physical memory address that corresponds to the start sector of the request within the virtual disk, and len, the length of the request in bytes. Shifting left by 9 bits multiplies the sector count by 512 to convert it to bytes. CURRENT is defined as blk_dev[MAJOR_NR].current_request. Lines 20 to 23 check whether the minor device number of the current request is 1 and whether the request would extend past the end of the virtual disk. If the minor number is not 1 or the request is out of range, the request is ended and execution jumps to repeat to process the next virtual disk request.
  2. Perform the read or write (lines 24 to 35). For a write command, len bytes are copied from the request's buffer to addr; for a read command, len bytes are copied from addr to the request's buffer. Any other command causes a kernel panic. Line 34 ends the request with the update flag set, and line 35 jumps to repeat to process the next request.
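The address arithmetic and the bounds check can be exercised on an ordinary array. The following is a user-space sketch under stated assumptions: sim_ramdisk, sim_cmd and sim_rd_request() are hypothetical names, the request fields are passed as plain parameters, and the range check compares byte offsets instead of raw pointers so that the example stays within defined C behaviour.

```c
#include <assert.h>
#include <string.h>

enum sim_cmd { SIM_READ, SIM_WRITE };   /* stand-ins for READ and WRITE */

struct sim_ramdisk {
    char *start;           /* models rd_start */
    unsigned long length;  /* models rd_length, in bytes */
};

/*
 * Models the body of do_rd_request() for one request.
 * Returns 1 when the request completes (end_request(1)),
 * 0 when it is rejected (end_request(0)).
 */
int sim_rd_request(struct sim_ramdisk *rd, int minor, enum sim_cmd cmd,
                   unsigned long sector, unsigned long nr_sectors, char *buffer)
{
    unsigned long off = sector << 9;      /* sector * 512 */
    unsigned long len = nr_sectors << 9;  /* nr_sectors * 512 */

    assert(rd != NULL && buffer != NULL);
    if (minor != 1 || off + len > rd->length)
        return 0;
    if (cmd == SIM_WRITE)
        memcpy(rd->start + off, buffer, len);   /* buffer -> virtual disk */
    else
        memcpy(buffer, rd->start + off, len);   /* virtual disk -> buffer */
    return 1;
}
```

On a 4096-byte virtual disk, sector 7 is the last valid single-sector request; a request for sector 8 is rejected, as is any request whose minor device number is not 1.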

9.6.2 Initializing the Memory Virtual Disk

linux/kernel/blk_drv/ramdisk.c/rd_init()

Virtual disk initialization function. It returns the amount of memory required by the virtual disk. Function inputs:

  • mem_start: the start location of the main memory area.
  • length: RAMDISK*1024 (bytes).

Function execution flow:

  1. Set the request handler pointer of the virtual disk device. Line 41 points the request handler at do_rd_request().
  2. Set the parameter values. Lines 42 to 43 record the start address of the virtual disk in physical memory and its length in bytes.
  3. Clear the virtual disk area (lines 44 to 46).
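Since rd_init() returns the number of bytes it has claimed, the caller can move the start of main memory past the virtual disk by adding the return value. The sketch below models this on a small array. The names sim_ramdisk and sim_rd_init() are hypothetical, the request handler registration is omitted, and memset() replaces the byte-by-byte clearing loop.

```c
#include <assert.h>
#include <string.h>

struct sim_ramdisk {
    char *start;   /* models rd_start */
    long length;   /* models rd_length, in bytes */
};

/*
 * Models rd_init(): claims `length` bytes at mem_start, clears them,
 * and returns the length so the caller can advance its memory pointer.
 */
long sim_rd_init(struct sim_ramdisk *rd, char *mem_start, long length)
{
    assert(rd != NULL && mem_start != NULL && length >= 0);
    rd->start = mem_start;
    rd->length = length;
    memset(mem_start, 0, (size_t)length);
    return length;
}
```

A caller would use it as `main_memory_start += sim_rd_init(&rd, main_memory_start, 4 * 1024);`, after which main memory begins 4096 bytes later and the claimed area is zero-filled.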

9.6.3 Loading the Root File System

linux/kernel/blk_drv/ramdisk.c/rd_load()

This function attempts to load the root file system onto the virtual disk when the root file system device (the boot disk) is a floppy disk. It is called in hd.c. (1 disk block = 1024 bytes)

Function execution flow:

  1. Check the preconditions (lines 57 to 62): return if the length of the virtual disk is 0 or if the root device is not a floppy disk.
  2. Read the basic parameters of the root file system (lines 63 to 71). breada() reads the specified block from the disk, schedules read-ahead for the other listed blocks, and returns a pointer to the buffer head holding the requested block; a NULL pointer means the block could not be read. The superblock in the buffer is copied into the variable s and the buffer is released. Lines 70 to 71 then validate the superblock by checking its magic number to determine whether the loaded block belongs to a MINIX file system.
  3. Check whether the root file system fits into the virtual disk (lines 72 to 77). The s_nzones field of the superblock holds the total number of zones (logical blocks), and s_log_zone_size holds the base-2 logarithm of the number of data blocks per zone. The total number of data blocks is therefore nblocks = s_nzones × 2^s_log_zone_size. If nblocks exceeds the number of blocks the virtual disk can hold, a message is displayed and the function returns.
  4. Load the data blocks (lines 78 to 98). A loading message is displayed and cp is pointed at the start of the virtual disk in memory. A loop then copies the root file system image from the disk to the virtual disk. If more than two blocks remain to be loaded, the read-ahead function breada() is used; otherwise bread() reads a single block. If an I/O error occurs, the load is aborted and the function returns. Each block read is copied from the buffer cache to the corresponding location in the virtual disk with memcpy(), and the number of blocks loaded so far is displayed.
  5. After the root file system has been loaded (lines 99 to 100), the "done" message is displayed and the root file system device number is changed to 0x0101, the virtual disk device.
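The size check in step 3 is plain shift arithmetic and can be reproduced in isolation. In this sketch the function names fs_total_blocks() and image_fits() are hypothetical, and BLOCK_SIZE_BITS is taken to be 10 because one disk block is 1024 bytes.

```c
#include <assert.h>

#define BLOCK_SIZE_BITS 10   /* 1 block = 1024 bytes */

/* nblocks = s_nzones << s_log_zone_size, i.e. zones times blocks per zone. */
long fs_total_blocks(unsigned int s_nzones, unsigned int s_log_zone_size)
{
    assert(s_log_zone_size < 16);
    return (long)s_nzones << s_log_zone_size;
}

/* The image fits when nblocks does not exceed the virtual disk size in blocks. */
int image_fits(long nblocks, long rd_length)
{
    return nblocks <= (rd_length >> BLOCK_SIZE_BITS);
}
```

For example, a file system with 1440 zones of one block each needs 1440 blocks: it fits into a 2048KB virtual disk but not into a 512KB one. With 360 zones and s_log_zone_size equal to 1, the total is 720 blocks.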