What is the link layer

What happens when you write a letter to a distant friend? First, you seal the letter in an envelope and hand it to the post office. The post office works out how to get the letter from the sender's address to the recipient's address. Then the driver carrying your letter plans a practical route through cities and highways to the destination. Finally, the post office at the destination delivers the letter to your friend.

In this analogy, the content of the letter is the information to be transmitted. The purpose of the Internet is to move information from one place to another. To separate the responsibilities involved, the Internet protocol stack divides the network into five layers: the application layer, the transport layer, the network layer, the link layer and the physical layer. The function of each layer is:

  • Application layer: the person who writes and reads the letter, the final processor of the information. The application layer serves the individual applications on separate hosts and specifies how physically distant applications exchange information, much as a letter must follow certain conventions for its content and for the recipient's details.
  • Transport layer: the post office where letters are collected. The transport layer serves the host: it collects the information that applications send to the network, processes it as necessary and sends it on. It therefore dictates how hosts exchange information, just as post offices share a common way of working out who a letter is for.
  • Network layer: the route planners, from post offices to drivers, who plan and drive the route and may change it when a stretch of road is congested. The network layer serves the devices that carry out route planning and specifies how they exchange information with each other to obtain a feasible, efficient path.
  • Link layer: the vehicle and the knowledge needed to drive on the road. It serves the functional units that determine whether and how data can be sent over a link, and it specifies the rules those units follow to get the information across. To drive normally you need a vehicle and driving skills, and you need to recognize signs such as a left turn at the next intersection or a speed limit of 40.
  • Physical layer: the physical medium that forms the path. Whatever the information is, it must be carried over a physical medium, just as letters and drivers must travel along a road to reach the destination.

It follows that the link layer specifies how data is transmitted over the physical layer, so that the layers above do not have to deal with the characteristics of the physical medium. In other words, the link layer provides the vehicles that travel on the roads laid by the physical layer.

Understand link layer operation

Any communication channel between two directly connected devices, whether it runs over cable, a broadcast medium or something else, is regarded as a link at the link layer, and the devices are called nodes.

The link layer frame

The link layer encapsulates data into link layer frames and then exchanges the frames between two nodes over the link. A frame is a specific data structure together with the information it carries. On an electrical medium, the frame's stream of bits has to be translated into a specific electrical signal, which the sender encodes and the receiver decodes.

A brief frame format should include at least:

| frame header | data |

To mark the beginning and end of a frame, a specific flag byte such as 0x7E (0111 1110) is added at both ends of each frame. So that the flag pattern never appears inside the frame itself, the sender inserts a 0 whenever it finds five consecutive 1s in the data, and the receiver deletes the 0 that follows every run of five consecutive 1s. This lets the receiver know exactly where each frame ends.
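The stuffing rule is easier to see in code. The following is a minimal sketch that works on strings of "0" and "1" characters for readability; the function names `stuff`, `unstuff` and `frame` are made up for this illustration, and a real adapter does this in hardware on the bit stream.

```python
FLAG = "01111110"  # 0x7E, the frame delimiter described in the text


def stuff(bits: str) -> str:
    """Sender side: insert a 0 after every run of five consecutive 1s."""
    out = []
    run = 0  # length of the current run of 1s
    for bit in bits:
        out.append(bit)
        if bit == "1":
            run += 1
            if run == 5:
                out.append("0")  # the stuffed bit
                run = 0
        else:
            run = 0
    return "".join(out)


def unstuff(bits: str) -> str:
    """Receiver side: drop the bit that follows every run of five 1s.

    Assumes the input was produced by stuff(), so that bit is always a 0.
    """
    out = []
    run = 0
    skip_next = False
    for bit in bits:
        if skip_next:
            skip_next = False  # this is the stuffed 0, discard it
            continue
        out.append(bit)
        if bit == "1":
            run += 1
            if run == 5:
                skip_next = True
                run = 0
        else:
            run = 0
    return "".join(out)


def frame(bits: str) -> str:
    """Wrap the stuffed payload in opening and closing flags."""
    return FLAG + stuff(bits) + FLAG
```

Because the stuffed payload never contains six 1s in a row, the flag pattern can only appear at the frame boundaries, even when the original data happens to contain 0111 1110.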

Link layer protocol

Link layer protocols exist to regulate transmission behavior and achieve efficiency. Just imagine how bad traffic would be if there were no traffic rules. To this end, a link layer protocol may provide:

  • Framing: network layer data is sent over the link in frames, which can be interpreted according to specific rules.
  • Link access: rules that determine when a node may send a frame onto the link, for example only when the link is idle.
  • Reliable delivery: because of the physical properties of the transmission medium, some bits may arrive in error. Some link layer protocols guarantee that every frame is delivered correctly. Others leave this to the transport layer or application layer and do not deliver reliably, which avoids unnecessary overhead.
  • Error detection and correction: to catch errors caused by the physical medium, the sender computes a value from the data before sending and places it in the frame along with the data. The receiver computes the value again in the same way and compares the two to determine whether the data is wrong, and in some schemes to correct it. Methods include parity checks, checksums and cyclic redundancy checks (CRC). In short, the more complex the method, the more it costs, but the higher the probability of detecting an error.
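To make the trade-off between cheap and strong checks concrete, here is a small sketch that compares a single even-parity bit with a CRC. It uses `zlib.crc32` from the Python standard library as a stand-in for a link layer CRC; the helper names `even_parity_bit` and `flip_bit` are invented for this example.

```python
import zlib


def even_parity_bit(data: bytes) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def flip_bit(data: bytes, index: int) -> bytes:
    """Simulate a transmission error by flipping one bit of the data."""
    buf = bytearray(data)
    buf[index // 8] ^= 1 << (index % 8)
    return bytes(buf)


data = b"link layer"
sent_parity = even_parity_bit(data)
sent_crc = zlib.crc32(data)

one_error = flip_bit(data, 3)
two_errors = flip_bit(flip_bit(data, 3), 12)

# A single flipped bit changes the parity, so the receiver notices it.
parity_detects_one = even_parity_bit(one_error) != sent_parity
# Two flipped bits cancel out in the parity, so the error goes unnoticed.
parity_detects_two = even_parity_bit(two_errors) != sent_parity
# The CRC changes in both cases.
crc_detects_one = zlib.crc32(one_error) != sent_crc
crc_detects_two = zlib.crc32(two_errors) != sent_crc
```

The parity bit costs one bit per frame but misses any even number of flipped bits, while the 32-bit CRC costs more to compute and carry but catches both corrupted frames here.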

Multiple access links

Links can be roughly divided into two types: point-to-point links and broadcast links. A point-to-point link has a single sender at one end and a single receiver at the other. A broadcast link has multiple senders and multiple receivers connected to one shared link.

Because of the characteristics of the physical medium, if more than one sender transmits at the same time, the signals interfere with each other and no receiver gets valid information. How to coordinate the senders' access to the shared link, so as to avoid collisions as far as possible and still hand the data to the receiver efficiently, is called the multiple access problem.

The protocols designed for the two link types are, respectively, point-to-point protocols and multiple access protocols.

Multiple access protocols can in turn be divided into channel division protocols, random access protocols and taking-turns (rotation) protocols, according to how they share the link. For a link of rate R, we would like any such protocol to have these properties:

  • When only one node has data to send, that node gets a throughput of R.
  • When M nodes have data to send, each node gets an average throughput of R/M over some suitably defined interval of time.
  • The protocol is decentralized and does not stall due to a node failure.
  • The protocol is simple and inexpensive to implement.

Channel division protocol

FDM, frequency division multiplexing

In FDM, the channel's spectrum is divided into several frequency bands, and each node transmits in its own band. The disadvantage is that a band sits idle and is wasted whenever its node has nothing to send, and a node can never reach rate R even when it is the only one transmitting.

TDM, time division multiplexing

In TDM, time is divided into units called time frames, and each time frame is further divided into time slots. Each node is assigned a time slot and transmits only during its own slot in each recurring time frame. With M nodes, the rate of each node under TDM is limited to R/M.

CDMA, code division multiple access

The reason nodes cannot normally send at the same time is that the overlapping signals are hard for the receiver to interpret. Under CDMA, each node is assigned its own code, which it uses to encode its data before sending. As long as the codes are chosen carefully, a receiver can filter its own valid information out of the superimposed signal.

The principle can be roughly understood as shown in the figure: after encoding, the channel carries a superimposed message such as "BCDEMTUA", and the receiver filters out its own message, "CDMA". CDMA is particularly resistant to interference and is closely associated with wireless channels.
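The "carefully chosen codes" are typically mutually orthogonal chip sequences. The sketch below is an idealized illustration under that assumption: the three 4-chip codes, the perfectly synchronized senders and the noise-free channel are all simplifications chosen for this example, and the function names are hypothetical.

```python
# Mutually orthogonal chip codes: the dot product of any two different codes is 0.
CODE_A = [1, 1, 1, 1]
CODE_B = [1, -1, 1, -1]
CODE_C = [1, 1, -1, -1]


def encode(bits, code):
    """Spread each data bit (1 -> +1, 0 -> -1) over the node's chip code."""
    signal = []
    for bit in bits:
        level = 1 if bit else -1
        signal.extend(level * chip for chip in code)
    return signal


def superimpose(*signals):
    """Model the shared channel: simultaneous signals simply add up."""
    return [sum(chips) for chips in zip(*signals)]


def decode(signal, code):
    """Correlate the channel signal with one code to recover that sender's bits."""
    size = len(code)
    bits = []
    for start in range(0, len(signal), size):
        chunk = signal[start:start + size]
        correlation = sum(s * c for s, c in zip(chunk, code)) / size
        bits.append(1 if correlation > 0 else 0)
    return bits


bits_a, bits_b, bits_c = [1, 0, 1], [0, 0, 1], [1, 1, 0]
channel = superimpose(
    encode(bits_a, CODE_A), encode(bits_b, CODE_B), encode(bits_c, CODE_C)
)
```

Calling `decode(channel, CODE_B)` recovers only the second sender's bits, because the contributions of the other two senders cancel to zero in the correlation.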

Random access protocol

The core of a random access protocol is that a node always transmits at the full channel rate R and, when a collision occurs, resends the frame according to certain rules.

Slotted ALOHA

One of the simplest random access protocols is slotted ALOHA, which can be described as follows:

  • All frames have the same fixed length.
  • Time is divided into slots, each long enough to transmit one frame.
  • Nodes are synchronized: each node knows when each slot starts and begins transmitting only at the start of a slot.
  • If a collision occurs in a slot, all nodes detect it before the slot ends. After a collision, each node involved retransmits its frame in each subsequent slot with probability p until the retransmission succeeds.

Unlike channel division, slotted ALOHA works well when only one node is transmitting: that node achieves throughput R. Its efficiency, however, is limited by collisions. Efficiency here means the fraction of slots that carry a successful transmission over a long period. Suppose N nodes always have data to send and each transmits in a slot with probability p. The probability that a given node succeeds in a slot is p(1-p)^(N-1), and the probability that some node succeeds is Np(1-p)^(N-1), so the efficiency is Np(1-p)^(N-1). With the best choice of p, the limit as N approaches infinity is 1/e ≈ 0.37.

So when the channel is busy, a channel of rate R delivers an effective throughput of only about 0.37R.
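The 1/e limit for slotted (time slot) ALOHA can be checked numerically. This sketch evaluates the efficiency formula from the text; it assumes p = 1/N, which is the value of p that maximizes Np(1-p)^(N-1) for a given N. The function names are chosen for this example.

```python
import math


def slotted_aloha_efficiency(n: int, p: float) -> float:
    """Probability that exactly one of n nodes transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)


def best_efficiency(n: int) -> float:
    """Efficiency at p = 1/n, the transmission probability that maximizes it."""
    return slotted_aloha_efficiency(n, 1 / n)


if __name__ == "__main__":
    for n in (2, 5, 10, 100, 10_000):
        print(n, round(best_efficiency(n), 4))
    print("1/e =", round(1 / math.e, 4))
```

With two nodes the best achievable efficiency is 0.5, and it falls toward 1/e as more nodes contend for the channel.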

Carrier sense multiple access with collision detection (CSMA/CD)

In slotted ALOHA, each node decides to transmit on its own, regardless of what other nodes are doing. CSMA/CD is more polite:

  • Before transmitting, a node listens to the channel to see whether other nodes are using it. This behavior is called carrier sensing. If the channel is in use, the node waits until it is idle and then begins to transmit.
  • While transmitting, the node keeps monitoring the channel. If it detects a collision, it knows the transmission has failed and stops. It then waits a random period of time; if the channel is idle it transmits again, otherwise it keeps waiting, and it repeats the process until the transmission succeeds.

Collisions can still occur under carrier sensing because of propagation delay. Suppose A starts broadcasting at time T. The signal takes time to travel along the channel, so B, which has not yet heard A's broadcast, may start broadcasting on the same channel. At some later time Tk the two signals meet, and A and B detect the collision. The longer the propagation delay, the greater the chance that a node fails to hear another node that is already using the channel, and the more collisions result.

The efficiency of CSMA/CD (the derivation is complex, so the approximate formula is used directly) is 1 / (1 + 5 × (d_prop / d_trans)), where d_prop is the maximum time a signal takes to propagate between any two nodes and d_trans is the time required to transmit a maximum-length frame.
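To get a feel for the approximate efficiency formula, the sketch below plugs in some numbers. The link rate, frame size and propagation delay are hypothetical values picked for illustration, not figures from this article.

```python
def csma_cd_efficiency(d_prop: float, d_trans: float) -> float:
    """Approximate CSMA/CD efficiency: 1 / (1 + 5 * d_prop / d_trans)."""
    return 1 / (1 + 5 * d_prop / d_trans)


# Hypothetical example: a 12,000-bit frame on a 10 Mbps link takes
# 1.2 ms to transmit; assume a worst-case propagation delay of 25 microseconds.
frame_bits = 12_000
link_rate_bps = 10_000_000
d_trans = frame_bits / link_rate_bps
d_prop = 25e-6

efficiency = csma_cd_efficiency(d_prop, d_trans)
```

The formula behaves as the text suggests: efficiency approaches 1 as the propagation delay approaches 0, and it drops as the propagation delay grows relative to the transmission time.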

Taking-turns protocols

There are many taking-turns (rotation) protocols, each with many variations. One of the simplest is the polling protocol: a master node is selected and asks the other nodes in turn whether they have data to transmit. If a node does, the master tells it to transmit up to some number of frames, detects the end of the transmission by the absence of a signal on the channel, and then moves on to poll the next node.

The benefits of polling are obvious: it eliminates the collisions and empty slots of random access protocols. The disadvantages are just as obvious. Nodes with nothing to send must still be polled, and the polling and notification introduce delay, so even if only one node has data to transmit, it cannot reach the full channel throughput. And if the master node fails, the whole channel stops working.

An improvement on this is the token passing protocol. Here there is no master node. Instead, a small special frame called a token is passed between nodes in a fixed order, and only the node holding the token may use the channel to transmit. When a node has sent its allowed number of frames, or has nothing to send, it passes the token to the next node. Token passing can also stall when a node fails or does not release the token, so further variants exist. Either way, thorny problems remain.
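Token passing can be sketched as a simple round-robin loop. This is a toy model under several assumptions: nodes never fail, the token is never lost, and the parameter `max_frames` (a name invented here) caps how many frames a node may send each time it holds the token.

```python
from collections import deque


def token_passing(queues: dict, max_frames: int = 1) -> list:
    """Return the order in which frames are sent as the token circulates.

    queues maps each node name to the list of frames it wants to send.
    The token visits nodes in the dictionary's insertion order.
    """
    pending = {node: deque(frames) for node, frames in queues.items()}
    ring = list(pending)
    sent = []
    holder = 0  # index of the node that currently holds the token
    while any(pending.values()):
        node = ring[holder]
        for _ in range(max_frames):
            if not pending[node]:
                break  # nothing left to send, release the token early
            sent.append((node, pending[node].popleft()))
        holder = (holder + 1) % len(ring)  # pass the token to the next node
    return sent
```

For example, `token_passing({"A": ["a1", "a2"], "B": [], "C": ["c1"]})` lets A send one frame, skips B, lets C send its frame, and then returns to A for its second frame, all without a collision.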

Addressing

Data always has a source and a destination, and to identify them you need addresses of some form. At the network layer, IP addresses describe the location of hosts in the network, but they cannot be used to identify interfaces at the link layer. There, MAC addresses (also called LAN addresses or physical addresses) identify interfaces. As an analogy, a MAC address is like a person's ID card number and an IP address is like a person's residential address: the MAC address does not change when the device moves.

When a network device is manufactured, the manufacturer assigns its MAC address from an address range it has obtained from the IEEE, so MAC addresses are intended to be unique. There is a tiny chance that two devices end up with the same MAC address, but as long as they are not on the same subnet, everything still works.

A MAC address is 6 bytes long. The Address Resolution Protocol (ARP) specifies how to find the MAC address that corresponds to an IP address.

Each host or router keeps an ARP table that maps IP addresses to MAC addresses, and each entry has a TTL value that determines when it expires. An example ARP table:

IP address        MAC address         TTL
111.111.111.111   AA-AA-AA-AA-AA-A1   11:08:00
111.111.111.112   AA-AA-AA-AA-AA-A2   11:09:03

When a host sends a datagram and its ARP table has an entry for the destination IP address, the destination MAC address of the frame is set to the mapped address. If there is no entry, the MAC address is obtained through ARP, which works as follows:

  • ARP resolves addresses only for nodes on the same subnet.
  • The sender constructs a special packet called an ARP packet, which contains the sender's IP address, the sender's MAC address and the target IP address.
  • The sender broadcasts the ARP query by setting the destination MAC address to the broadcast address FF-FF-FF-FF-FF-FF. This is like asking everyone on the subnet, "Who has the MAC address for this IP address?"
  • The host or router whose IP address matches the query sends an ARP response packet, containing its MAC address, back to the sender.
  • After receiving the response, the sender updates its ARP table, sets the destination MAC address of the frame to the resolved value, and sends the datagram.
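The lookup-then-query logic above can be sketched with a small cache. This is a simplified model, not a real ARP implementation: the class `ArpTable`, the function `resolve`, the 20-minute default TTL and the `arp_query` callback (which stands in for the broadcast and the reply) are all assumptions made for this example.

```python
class ArpTable:
    """A toy ARP table: IP address -> (MAC address, expiry time)."""

    def __init__(self, ttl_seconds: float = 1200.0):
        self.ttl_seconds = ttl_seconds  # hypothetical default lifetime
        self._entries = {}

    def update(self, ip: str, mac: str, now: float) -> None:
        """Store or refresh a mapping, restarting its TTL."""
        self._entries[ip] = (mac, now + self.ttl_seconds)

    def lookup(self, ip: str, now: float):
        """Return the cached MAC address, or None if missing or expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if now >= expires_at:
            del self._entries[ip]  # the entry has outlived its TTL
            return None
        return mac


def resolve(table: ArpTable, ip: str, arp_query, now: float):
    """Use the cache if possible, otherwise ask the subnet and cache the answer.

    arp_query(ip) represents broadcasting an ARP query to FF-FF-FF-FF-FF-FF
    and returning the MAC address from the reply, or None if nobody answers.
    """
    mac = table.lookup(ip, now)
    if mac is None:
        mac = arp_query(ip)
        if mac is not None:
            table.update(ip, mac, now)
    return mac
```

Passing the current time in as `now` keeps the sketch deterministic; a real implementation would read the system clock instead.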

As shown in the figure above, a router connects two subnets, 111.111.111/24 and 222.222.222/24. What happens if the host with IP address 111.111.111.111 sends a datagram to the host with IP address 222.222.222.222?

The MAC address of the host with IP address 222.222.222.222 is BB-BB-BB-BB-BB-B2, but the sending host cannot simply set the destination MAC address to BB-BB-BB-BB-BB-B2. No device on the sender's own subnet, 111.111.111/24, has an interface with this MAC address, so the frame would eventually be dropped.

Note that the router has an interface on each subnet and can handle datagrams for both 111.111.111/24 and 222.222.222/24. The sender sees that 222.222.222.222 is outside its own subnet, so the frame must go to the router first. If the sender has no ARP entry for the router's interface on 111.111.111/24, it broadcasts an ARP query on its subnet, and the router replies with that interface's MAC address, AA-AA-AA-AA-AA-AA. The sender then sends the frame carrying the datagram to the router.

Using its forwarding table, the router determines that its interface with IP address 222.222.222.220 leads to subnet 222.222.222/24 and forwards the datagram to that interface.

On subnet 222.222.222/24, the router uses ARP to ask for the MAC address of the host with IP address 222.222.222.222 and receives an ARP reply from the destination host: the MAC address is BB-BB-BB-BB-BB-B2. Finally, the router's adapter encapsulates the datagram into a frame with destination MAC address BB-BB-BB-BB-BB-B2, and the data is delivered to the host with IP address 222.222.222.222.
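The decision that starts this walk-through, whether to address the frame to the destination itself or to the router, can be sketched with Python's standard `ipaddress` module. The function name `next_hop_ip` is invented here, and the router address 111.111.111.110 is a hypothetical value for the router's interface on the sender's subnet; the article does not give it.

```python
import ipaddress


def next_hop_ip(local_subnet: str, gateway_ip: str, dst_ip: str) -> str:
    """Return the IP address whose MAC address the sender must resolve.

    If the destination is on the sender's own subnet, the frame goes straight
    to it; otherwise the frame goes to the router (gateway) first.
    """
    network = ipaddress.ip_network(local_subnet)
    if ipaddress.ip_address(dst_ip) in network:
        return dst_ip
    return gateway_ip


# 111.111.111.110 is an assumed gateway address, used only for illustration.
hop = next_hop_ip("111.111.111.0/24", "111.111.111.110", "222.222.222.222")
```

The sender then runs ARP for the returned address, which is always on its own subnet, consistent with the rule that ARP only resolves addresses within one subnet.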

Switches

The job of a switch is to receive link layer frames and forward them toward other nodes. Note that the switch was not mentioned anywhere in the addressing discussion above. That is by design: a switch is transparent to hosts and routers.

When a switch receives a frame, it filters it: it either forwards the frame to an interface or discards it. The decision depends on the switch table, which contains entries for some of the hosts and routers on the LAN. Each entry contains a MAC address, the switch interface that leads to that MAC address, and the time at which the entry was placed in the table. For example:

Address             Interface   Time
AA-AA-AA-AA-AA-A1   1           11:00:00
AA-AA-AA-AA-AA-AA   3           11:07:00

This also shows that the switch works with MAC addresses rather than IP addresses. Each entry expires after a certain period of time. Now suppose a frame with destination MAC address X arrives on interface Y. Depending on the switch table, the switch handles it as follows:

  • If there is no entry for X in the table, the switch broadcasts the frame on all interfaces except Y.
  • If there is an entry for X and its interface is Y, the frame has already reached the segment that contains X, so the switch filters it by discarding it instead of forwarding it.
  • If there is an entry for X and its interface is Z, different from Y, the switch forwards the frame to interface Z.
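The three cases translate directly into a small function. In this sketch the switch table is reduced to a plain dictionary from MAC address to interface number, and the function name `output_interfaces` is made up for the example.

```python
def output_interfaces(table: dict, dst_mac: str, in_interface: int, all_interfaces: list) -> list:
    """Return the interfaces a frame should be sent out on.

    table maps MAC addresses to interface numbers.
    """
    out_interface = table.get(dst_mac)
    if out_interface is None:
        # No entry for X: flood on every interface except the arrival one.
        return [i for i in all_interfaces if i != in_interface]
    if out_interface == in_interface:
        # Entry points back to the arrival interface: filter (discard).
        return []
    # Entry points elsewhere: forward on that single interface.
    return [out_interface]


switch_table = {"AA-AA-AA-AA-AA-A1": 1, "AA-AA-AA-AA-AA-AA": 3}
interfaces = [1, 2, 3, 4]
```

For instance, a frame for AA-AA-AA-AA-AA-AA arriving on interface 1 is forwarded only to interface 3, while the same frame arriving on interface 3 is dropped.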

So where does the switch table come from?

Self-learning

Building the switch table is simple but effective, and it happens automatically, dynamically and autonomously:

  • At the beginning, the switch table is empty.
  • When a frame arrives on an interface, the switch creates an entry that associates the frame's source MAC address with that interface.
  • If no frame from that MAC address is received for a certain period of time, the entry is deleted.

Over time, the table comes to hold the MAC addresses and interfaces that have been active recently. LANs built with switches therefore need no manual configuration of the switch table.
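The learning and aging rules can be sketched as a small class. The class name `LearningSwitchTable`, its method names and the 300-second default aging time are assumptions made for this illustration; the current time is passed in explicitly so the behavior is easy to follow.

```python
class LearningSwitchTable:
    """A toy self-learning switch table: MAC address -> (interface, last seen)."""

    def __init__(self, max_age: float = 300.0):
        self.max_age = max_age  # hypothetical aging time in seconds
        self.entries = {}  # starts empty, as described in the text

    def learn(self, src_mac: str, in_interface: int, now: float) -> None:
        """Record (or refresh) where a source MAC address was last seen."""
        self.entries[src_mac] = (in_interface, now)

    def expire(self, now: float) -> None:
        """Delete entries that have not been refreshed within max_age."""
        stale = [
            mac
            for mac, (_, last_seen) in self.entries.items()
            if now - last_seen > self.max_age
        ]
        for mac in stale:
            del self.entries[mac]

    def interface_for(self, mac: str, now: float):
        """Return the learned interface for a MAC address, or None if unknown."""
        self.expire(now)
        entry = self.entries.get(mac)
        return entry[0] if entry else None
```

Because every arriving frame refreshes the entry for its source address, a host that moves to a different interface is relearned as soon as it sends a frame, and a host that goes silent is eventually forgotten.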

Features of the switch

  • Collision elimination: in a LAN built with switches, no bandwidth is wasted on collisions, because the switch buffers frames and never transmits more than one frame on a link at a time.
  • Heterogeneous links: the links connected to the switch are isolated from each other, so different links can operate at different rates.
  • Management: the switch makes the network easier to manage. If a node keeps sending frames continuously because of a fault, the switch can detect the problem and disconnect the node.

Ethernet

Today, the most widely used technology for building wired LANs is undoubtedly Ethernet. Think about the home networks we use: the wired connection is almost always Ethernet.

In the beginning, Ethernet was a broadcast LAN with a bus topology: all nodes were attached to a shared bus, and every frame transmitted was received by all nodes on the bus.

By the late 1990s, Ethernet had evolved into a hub-based star topology, still a broadcast LAN, in which hosts and routers were connected directly to a hub with twisted-pair copper wire. The hub simply amplifies the signal and sends it out on its other interfaces. If frames arrive on different interfaces at the same time, a collision occurs, and the nodes that sent those frames must retransmit.

Today, most Ethernet networks are LANs with a switch-based star topology. Thanks to the way switches work, collisions no longer occur. In scenarios that still use a shared broadcast link, nodes can resolve contention with CSMA/CD.

In Ethernet, data is transmitted in units of Ethernet frames:

| preamble | destination address | source address | type | data | CRC |

  • Preamble: 8 bytes. Each of the first 7 bytes is 10101010, and the last byte is 10101011. The preamble synchronizes the clocks of the sender and receiver. Ethernet LANs support a variety of transmission rates, and a node's actual rate always drifts slightly from the nominal rate, so the preamble gives the receiver enough time to adjust to the sender's clock before the rest of the frame arrives.
  • Type: identifies the protocol used at the layer above, so that the receiving node knows how to process the data.
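As a rough illustration of the frame layout, the sketch below unpacks the address and type fields with Python's standard `struct` module. It assumes the frame has already had its preamble (pre-synchronization code) and CRC removed, so the input starts at the destination address. The function names are invented for this example; the type value 0x0800, used in the sample frame, identifies IPv4.

```python
import struct


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes in the AA-BB-CC-DD-EE-FF notation used in this article."""
    return "-".join(f"{byte:02X}" for byte in raw)


def parse_ethernet_header(frame: bytes):
    """Split a frame into destination MAC, source MAC, type and payload.

    Assumes the preamble and CRC are not present in the input.
    """
    # "!" = network byte order; two 6-byte addresses, then a 2-byte type field.
    dst, src, frame_type = struct.unpack("!6s6sH", frame[:14])
    return format_mac(dst), format_mac(src), frame_type, frame[14:]


# A hand-built sample frame: destination, source, type 0x0800, then the data.
sample = bytes.fromhex("BBBBBBBBBBB2" "AAAAAAAAAAAA" "0800") + b"datagram"
dst_mac, src_mac, frame_type, payload = parse_ethernet_header(sample)
```

The receiving node would use `frame_type` to decide which network layer protocol should process `payload`.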

The other fields are not much different from those described above.

Ethernet supports a variety of speeds and media, so it is not surprising that there are many named variants, such as 10BASE-T, 10BASE-2, 100BASE-T, 1000BASE-LX and 10GBASE-T. The rule is that the first part gives the speed (10 for 10 Mbps, 10G for 10 Gbps, and so on), BASE refers to baseband Ethernet, and the final part indicates the physical medium.

Conclusion

The purpose of the link layer is to carry data and to use the physical layer to deliver it to the destination in an appropriate manner and at an appropriate time.

  • Because of the way information travels over a physical medium, attenuation, interference, superposition and other factors can introduce errors in transit. The receiver therefore needs to check the data, for example with a CRC.
  • If multiple nodes transmit at the same time, no receiving node gets a valid frame, so a link layer protocol is needed to control which node transmits at what time. Channel division, random access and taking-turns (rotation) protocols address this multiple access problem.
  • To identify where a frame is going, the MAC address identifies the physical device's interface, and it does not change regardless of the device's physical location.
  • For wired LANs, switches effectively coordinate transmission and learn which interface corresponds to each MAC address. A switch receives frames and ensures that no two frames are transmitted on the same link at the same time, thus avoiding collisions.

Much of this article is drawn from chapter 6 of Computer Networking: A Top-Down Approach, 7th edition.

References

Computer Networking: A Top-Down Approach, 7th edition, chapter 6

Do you really understand the data link layer?

The essence of code division multiple access (CDMA): the beauty of orthogonality