High-level languages such as Java rely on system calls (syscalls) into the Linux kernel to perform network communication.

Prerequisites

  1. In Linux, every resource is abstracted as a file: regular files, directories, character devices, block devices, sockets, and so on
  2. Memory is divided into kernel space and user space, and data is copied between the two. The kernel can access user-space data, but user-space code cannot access kernel memory
  3. Only the kernel can operate hardware resources (network cards, disks, etc.); user programs reach them through the system calls (syscalls) that the kernel provides

File descriptor

  1. A file descriptor is an index that the kernel creates to manage open files; it refers to one open file. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process
  2. All system calls that perform I/O operate on file descriptors

By default, every Linux process starts with three file descriptors: 0, 1, and 2 (standard input, standard output, and standard error). The descriptors of a running process can be inspected under /proc/<pid>/fd.
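The default descriptors can also be observed from inside a Java process. The following is a minimal sketch, assuming a Linux system where /proc is mounted; the class name FdList and the method openFds are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

public class FdList {
    // Returns the descriptor numbers listed under /proc/self/fd, sorted ascending.
    // Returns an empty list when /proc is not available (for example on non-Linux systems).
    static List<Integer> openFds() {
        Path fdDir = Paths.get("/proc/self/fd");
        List<Integer> fds = new ArrayList<>();
        if (!Files.isDirectory(fdDir)) {
            return fds;
        }
        // Listing the directory itself uses a descriptor, so that one shows up too.
        try (Stream<Path> entries = Files.list(fdDir)) {
            entries.forEach(p -> fds.add(Integer.parseInt(p.getFileName().toString())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(fds);
        return fds;
    }

    public static void main(String[] args) {
        // On Linux the list normally starts with 0, 1, 2, followed by
        // whatever files the JVM itself has opened.
        System.out.println(openFds());
    }
}
```

The exact descriptors beyond 0, 1, and 2 depend on the JVM version and on what it has loaded, so the output differs between machines.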

User mode and kernel mode

  1. Memory is divided into kernel space and user space, and data is copied between the two. The kernel can access user-space data, but user-space code cannot access kernel memory
  2. Code running in user mode cannot access disks or network adapters directly. It must go through the system calls that the kernel provides
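To make the copy between user space and kernel space concrete, here is a minimal sketch in which a byte array in user space is handed to the kernel to be written to a file, and the file is then read back into a new user-space array. The class name UserKernelCopy and the method roundTrip are hypothetical. On Linux these library calls typically end up as write and read system calls, but which calls the JVM actually issues is an assumption that is best confirmed with strace.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class UserKernelCopy {
    // Writes the text to a temporary file and reads it back.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("syscall-demo", ".txt");
            try {
                // User-space bytes are passed to the kernel, which writes them to the file.
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                // The kernel copies the file contents back into a user-space buffer.
                byte[] buf = Files.readAllBytes(tmp);
                return new String(buf, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello kernel"));
    }
}
```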

System calls

Let’s execute the following Java code to see what happens to the system:

import java.io.IOException;
import java.net.ServerSocket;

public class BIOServer {
    public static void main(String[] args) throws IOException {
        // Create a server socket bound to port 8080 and start listening
        ServerSocket server = new ServerSocket(8080);
        // Block until a client connects
        server.accept();
    }
}

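Use strace to record the system calls made by the JVM (-ff follows child threads and processes and writes one output file per process ID, prefixed with the name given to -o):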

strace -ff -o out java BIOServer

The trace contains three system calls of interest: socket, bind, and listen. The Linux manual describes each of them:

  • socket

NAME
       socket - create an endpoint for communication
       
DESCRIPTION
       socket() creates an endpoint for communication and returns a descriptor.

RETURN VALUE
       On success, a file descriptor for the new socket is returned. On error, -1 is returned, and errno is set
       appropriately.

socket() creates an endpoint for communication and returns its file descriptor on success, or -1 on failure.

  • bind

NAME
       bind - bind a name to a socket
       
SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>

       int bind(int sockfd, const struct sockaddr *addr,
                socklen_t addrlen);

DESCRIPTION
       When a socket is created with socket(2), it exists in a name space (address family) but has no address assigned to it.
       bind() assigns the address specified by addr to the socket referred to by the file descriptor sockfd. addrlen specifies
       the size, in bytes, of the address structure pointed to by addr. Traditionally, this operation is called "assigning a
       name to a socket".
       
RETURN VALUE
       On success, zero is returned.  On error, -1 is returned, and errno is set appropriately.

bind() takes three arguments (the file descriptor returned by socket(), a pointer to the socket address structure, and the length of that structure) and returns 0 on success and -1 on failure.
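From Java, a failing bind shows up as an exception rather than a -1 return value. The sketch below triggers one common failure, binding to a port that another listening socket already holds, and checks for java.net.BindException. The class name BindConflict and the method secondBindFails are hypothetical; the sketch uses an ephemeral port on the loopback address instead of 8080 so that it does not collide with other programs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindConflict {
    // Returns true if binding a second socket to an already-bound port fails.
    static boolean secondBindFails() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the kernel to pick any free port.
        try (ServerSocket first = new ServerSocket(0, 50, loopback);
             ServerSocket second = new ServerSocket()) {
            try {
                second.bind(new InetSocketAddress(loopback, first.getLocalPort()));
                return false; // the bind unexpectedly succeeded
            } catch (BindException e) {
                // The kernel rejected the bind because the address is already in use.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind failed: " + secondBindFails());
    }
}
```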

  • listen

NAME
       listen - listen for connections on a socket

SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>

       int listen(int sockfd, int backlog);
       
DESCRIPTION
       listen()  marks  the socket referred to by sockfd as a passive socket, that is, as a socket that will be used to accept
       incoming connection requests using accept(2).

       The sockfd argument is a file descriptor that refers to a socket of type SOCK_STREAM or SOCK_SEQPACKET.

       The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow. If a
       connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED
       or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at
       connection succeeds.

RETURN VALUE
       On success, zero is returned.  On error, -1 is returned, and errno is set appropriately.

listen() takes two arguments (the file descriptor returned by socket() and the maximum length of the pending-connection queue) and returns 0 on success and -1 on failure.
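The pending-connection queue can be observed from Java: once a socket is listening, the kernel completes incoming TCP connections and queues them even if the application never calls accept(). The following minimal sketch assumes the loopback interface is available; the class name BacklogDemo and the method connectsWithoutAccept are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BacklogDemo {
    // Returns true if a client can connect although the server never calls accept().
    static boolean connectsWithoutAccept() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the kernel pick a free port; the second argument is the requested backlog.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {
            // The connection was established by the kernel and now waits in the queue.
            return client.isConnected();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected without accept(): " + connectsWithoutAccept());
    }
}
```

The backlog value is a request to the kernel, which may adjust it, so the sketch does not try to show the exact point at which the queue becomes full.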

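Putting the three calls together: socket() creates the socket and returns its file descriptor, bind() assigns an address (here port 8080) to that descriptor, and listen() marks it as a passive socket that is ready to accept incoming connections.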
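The same steps can be made more visible in Java by creating an unbound ServerSocket and binding it explicitly, instead of passing the port to the constructor. This is a sketch under assumptions: the class name ExplicitBindServer and the method open are hypothetical, and at which point the JVM issues the underlying socket, bind, and listen calls is an implementation detail that should be checked with strace.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ExplicitBindServer {
    // Opens a listening server socket on the given port with the requested backlog.
    static ServerSocket open(int port, int backlog) {
        try {
            // An unbound server socket: no address has been assigned yet.
            ServerSocket server = new ServerSocket();
            try {
                // Assigns the address and starts listening with the requested backlog.
                server.bind(new InetSocketAddress(port), backlog);
                return server;
            } catch (IOException e) {
                server.close(); // do not leak the socket when binding fails
                throw e;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Same port as BIOServer; 50 is an arbitrary backlog chosen for this sketch.
        try (ServerSocket server = open(8080, 50)) {
            System.out.println("listening on port " + server.getLocalPort());
        }
    }
}
```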

The file descriptors held by the Java process can then be inspected, for example under /proc/<pid>/fd.

The following conclusions can be drawn:

  1. Java implements network I/O through system calls
  2. Behind the single line ServerSocket server = new ServerSocket(8080); there are multiple system calls
  3. Network I/O is not a capability of Java itself; it is a capability provided by the operating system kernel

Summary

  1. In Linux, all I/O goes through file descriptors
  2. User-space programs access the hardware and software resources managed by the operating system by making system calls
  3. The Linux kernel, rather than Java or Python, provides the network I/O capability

In this series

NIO: Linux/IO fundamentals

NIO sees and says (2) – The two BIO in Java

NIO sees and says (3) — different IO models

NIO: Java NIO

NIO also said that (v) : Do it, do it today, understand Buffer


Copyright notice: This article is originally published by xiaoyan. Please contact the author for republication. It is published on InfoQ and public account.