Inside the Linux Kernel Netfilter Network Framework

This article is an entry in the "30 Years of Linux" themed essay event.

I have previously done some work on Linux kernel modules, implementing kernel-level communication encryption and video stream encryption. That work involved the Linux kernel network protocol stack, the kernel communication module, the kernel encryption module, and secret key generation and distribution. I am considering starting a Linux kernel column later.

Without further ado, we will now take you into the Netfilter network framework of the Linux kernel.


1. Overview: What is Netfilter

Application-layer developers who rarely touch the Linux kernel may not have heard of Netfilter. Most Linux users, however, have used or at least know of iptables, and iptables is implemented on top of Netfilter.

Netfilter is a framework in the Linux kernel that lets a variety of network-related operations be implemented as custom handlers. It provides facilities for packet filtering, network address translation, and port translation.

1. Netfilter

Its detailed composition:

Netfilter is the main framework in the Linux kernel for implementing packet filtering, connection tracking (conntrack), network address translation (NAT), and related functions. The framework defines a series of hook points at key stages of packet processing in the network protocol stack, and functions registered at these hook points process the packets. These registered functions are the packet traffic policies configured in the network protocol stack. In other words, they decide whether the kernel accepts or discards a packet, and their results determine the "fate" of each network packet.

As can be seen from the figure, the Netfilter framework has a modular design and spans both the kernel mode and the user mode of a Linux system.

At the user-mode level, different tools are provided for users according to protocol type: iptables for IPv4, ip6tables for IPv6, arptables for ARP, ebtables for bridge control, conntrack for network connection tracking, and so on.

Each user-mode tool has a corresponding module in the kernel, and these modules are ultimately implemented by calling the Netfilter hook API.

This also shows that the iptables Linux firewall tool mentioned earlier is actually a component of the Netfilter framework.

2. Netfilter packet path

Path of normal packets in Netfilter:


2. Netfilter implementation

Netfilter Hooks in the Linux Kernel

1. Netfilter mount points: Netfilter Places

(1) Function definition

As the packet send and receive flow chart above shows, Netfilter's hook functions can be registered at different places. These places are defined as follows:

// include/linux/netfilter.h

enum nf_inet_hooks {
        NF_INET_PRE_ROUTING,
        NF_INET_LOCAL_IN,
        NF_INET_FORWARD,
        NF_INET_LOCAL_OUT,
        NF_INET_POST_ROUTING,
        NF_INET_NUMHOOKS
};
  • NF_INET_PRE_ROUTING: incoming packets pass this hook in the ip_rcv() function (linux/net/ipv4/ip_input.c) before they are processed by the routing code.
  • NF_INET_LOCAL_IN: all incoming packets addressed to the local computer pass this hook in the ip_local_deliver() function.
  • NF_INET_FORWARD: incoming packets that are to be forwarded pass this hook in the ip_forward() function.
  • NF_INET_LOCAL_OUT: all outgoing packets created on the local computer pass this hook in the ip_build_and_send_pkt() function.
  • NF_INET_POST_ROUTING: all outgoing packets pass this hook in the ip_finish_output() function before they leave the computer.

(2) Mount point analysis

Netfilter filters or modifies packets by registering hook functions at different locations in the kernel protocol stack. These locations are called mount points, and there are five main ones: PRE_ROUTING, LOCAL_IN, FORWARD, LOCAL_OUT, and POST_ROUTING.

Mount point parsing:

  • PRE_ROUTING: before routing. The packet has entered the IP layer, but no routing decision has been made for it yet.
  • LOCAL_IN: local input. The routing decision found that the packet is addressed to the local machine, and the packet has not yet been handed to the upper-layer protocol.
  • FORWARD: forwarding. The routing decision found that the packet is not addressed to the local machine, and the packet has not yet been forwarded.
  • LOCAL_OUT: local output. The packet is being sent by the local machine, before the routing decision for the outgoing packet.
  • POST_ROUTING: after routing. The routing decision for the outgoing packet has been made.

Routing decision:

As can be seen from the figure above, the routing decision is the key point that determines the direction of the data flow.

  • The first routing decision checks whether the destination IP address in the IP header of an incoming packet is a local IP address. If it is, the packet is addressed to the local machine. Otherwise, the packet is addressed to another host and is only relayed through the local machine.
  • The second routing decision takes the destination IP address in the IP header of an outgoing packet, looks up the matching route in the routing table, obtains the IP address of the next-hop host (or gateway) from that route, and then sends the packet on.
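The first routing decision described above can be sketched in user-space C. This is only an illustrative model: `route_input_sketch`, `enum sketch_dest`, and the flat array of local addresses are hypothetical names invented for this example, and the real kernel consults its routing tables rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of the first (input) routing decision. */
enum sketch_dest {
    SKETCH_DELIVER_LOCAL, /* destination is a local address: LOCAL_IN path */
    SKETCH_FORWARD        /* destination is another host: FORWARD path */
};

/* Compare the destination address taken from the IP header against the
 * list of local addresses. All addresses are plain 32-bit values that
 * use the same byte order. */
enum sketch_dest route_input_sketch(uint32_t daddr,
                                    const uint32_t *local_addrs,
                                    size_t n_local)
{
    assert(local_addrs != NULL || n_local == 0);
    for (size_t i = 0; i < n_local; i++) {
        if (local_addrs[i] == daddr)
            return SKETCH_DELIVER_LOCAL;
    }
    return SKETCH_FORWARD;
}
```

A packet whose destination matches any entry in the array would continue to LOCAL_IN in this model, and every other packet would continue to FORWARD.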

Packet flow direction: as the figure shows, packets travelling in the three directions pass through different hook nodes:

  • Sent to the local machine: NF_INET_PRE_ROUTING -> NF_INET_LOCAL_IN
  • Forwarded: NF_INET_PRE_ROUTING -> NF_INET_FORWARD -> NF_INET_POST_ROUTING
  • Sent from the local machine: NF_INET_LOCAL_OUT -> NF_INET_POST_ROUTING
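The three paths can be written down as a small lookup function. The sketch below is a user-space illustration, not kernel code: `enum sketch_hook` mirrors the order of `enum nf_inet_hooks` shown earlier, while `enum sketch_direction` and `hook_path_sketch` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Same order as enum nf_inet_hooks shown earlier. */
enum sketch_hook {
    SK_PRE_ROUTING,
    SK_LOCAL_IN,
    SK_FORWARD,
    SK_LOCAL_OUT,
    SK_POST_ROUTING
};

/* Hypothetical labels for the three packet directions. */
enum sketch_direction {
    SK_DIR_TO_LOCAL,
    SK_DIR_FORWARDED,
    SK_DIR_FROM_LOCAL
};

/* Write the hooks a packet traverses into path[] (room for 3 entries)
 * and return how many entries were written. */
size_t hook_path_sketch(enum sketch_direction dir, enum sketch_hook path[3])
{
    assert(path != NULL);
    switch (dir) {
    case SK_DIR_TO_LOCAL:
        path[0] = SK_PRE_ROUTING;
        path[1] = SK_LOCAL_IN;
        return 2;
    case SK_DIR_FORWARDED:
        path[0] = SK_PRE_ROUTING;
        path[1] = SK_FORWARD;
        path[2] = SK_POST_ROUTING;
        return 3;
    case SK_DIR_FROM_LOCAL:
        path[0] = SK_LOCAL_OUT;
        path[1] = SK_POST_ROUTING;
        return 2;
    }
    return 0; /* unknown direction: no hooks */
}
```

Only forwarded packets touch three hooks. Packets that start or end on the local machine touch two.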

(3) Mount point linked lists

By registering hook functions at these mount points, packets can be filtered or modified at different stages. Because more than one hook function can be registered at a mount point, the kernel stores them in a linked list. When a packet is delivered locally (the LOCAL_IN mount point), for example, the ipt_hook and fw_confirm hook functions are called one after the other to process it. Hook functions also have priorities: the lower the priority value, the earlier the function is executed. Because mount points store their hook functions in linked lists, they are also called chains. The chain names for the mount points are as follows:

  • LOCAL_IN mount point: also known as the INPUT chain.
  • LOCAL_OUT mount point: also known as the OUTPUT chain.
  • FORWARD mount point: also known as the FORWARD chain.
  • PRE_ROUTING mount point: also known as the PREROUTING chain.
  • POST_ROUTING mount point: also known as the POSTROUTING chain.

Netfilter defines five constants to represent these five positions, as follows:

// File: include/linux/netfilter_ipv4.h
#define NF_IP_PRE_ROUTING   0
#define NF_IP_LOCAL_IN      1
#define NF_IP_FORWARD       2
#define NF_IP_LOCAL_OUT     3
#define NF_IP_POST_ROUTING  4

2. Register the hooks

Register and unregister hook functions: Register the hooks

(1) Register and unregister hook functions

The kernel provides the following functions to register and unregister hook functions.

// include/linux/netfilter.h
/* Function to register/unregister hook points. */

int nf_register_hook(struct nf_hook_ops *reg);
void nf_unregister_hook(struct nf_hook_ops *reg);
int nf_register_hooks(struct nf_hook_ops *reg, unsigned int n);
void nf_unregister_hooks(struct nf_hook_ops *reg, unsigned int n);

These functions register custom hook operations (struct nf_hook_ops) at the specified hook node.

(2) Hook operations data structure

The nf_hook_ops structure is as follows:

struct nf_hook_ops
{
        struct list_head list;

        /* User fills in from here down. */
        nf_hookfn *hook;
        struct module *owner;
        u_int8_t pf;
        unsigned int hooknum;
        /* Hooks are ordered in ascending priority. */
        int priority;
};

This structure stores the custom hook function (hook, of type nf_hookfn), the function's priority (priority), the protocol family it handles (pf), the hook node (hooknum), and other information.

(3) Register hook functions

Once a hook operations structure is defined, the nf_register_hook function is called to register it in the nf_hooks array.

// File: net/core/netfilter.c

int nf_register_hook(struct nf_hook_ops *reg)
{
    struct list_head *i;

    br_write_lock_bh(BR_NETPROTO_LOCK); // Lock nf_hooks

    // The priority field indicates the priority of the hook function,
    // so use it to find the right position in the hook function list.
    // The cast works because list is the first member of struct nf_hook_ops.
    for (i = nf_hooks[reg->pf][reg->hooknum].next;
         i != &nf_hooks[reg->pf][reg->hooknum];
         i = i->next) {
        if (reg->priority < ((struct nf_hook_ops *)i)->priority)
            break;
    }

    list_add(&reg->list, i->prev); // Add the hook function to the list
    br_write_unlock_bh(BR_NETPROTO_LOCK); // Unlock nf_hooks
    return 0;
}

The implementation of nf_register_hook is simple:

  • Lock nf_hooks so that the nf_hooks variable is protected from concurrent access.
  • Use the priority of the hook function to find the correct position in the hook function list.
  • Insert the hook function into the list.
  • Unlock nf_hooks.
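The priority-ordered insertion in the steps above can be tried out in user space. The following is a minimal sketch, assuming a singly linked list instead of the kernel's `struct list_head` and leaving out the locking. `struct hook_entry` and `chain_insert_sketch` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct nf_hook_ops. */
struct hook_entry {
    struct hook_entry *next;
    int priority;      /* lower value runs earlier */
    const char *name;  /* label for illustration only */
};

/* Insert reg into the chain at *head, keeping ascending priority order.
 * An entry whose priority equals an existing one is placed after it,
 * which matches the "<" comparison in nf_register_hook. */
void chain_insert_sketch(struct hook_entry **head, struct hook_entry *reg)
{
    assert(head != NULL && reg != NULL);
    struct hook_entry **pos = head;

    /* Skip every entry that must run before (or no later than) reg. */
    while (*pos != NULL && (*pos)->priority <= reg->priority)
        pos = &(*pos)->next;

    reg->next = *pos;
    *pos = reg;
}
```

Whatever order the entries are registered in, walking the chain from the head visits them from the lowest priority value to the highest, which is the order in which the hook functions would run.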

3. Declare hook functions

The hook function type (used by the nf_hookfn *hook member) is declared as follows:

// include/linux/netfilter.h

typedef unsigned int nf_hookfn(unsigned int hooknum,
                               struct sk_buff *skb,
                               const struct net_device *in,
                               const struct net_device *out,
                               int (*okfn)(struct sk_buff *));

It returns one of the following results:

// <linux/netfilter.h>
#define NF_DROP 0
#define NF_ACCEPT 1
#define NF_STOLEN 2
#define NF_QUEUE 3
#define NF_REPEAT 4
#define NF_STOP 5
#define NF_MAX_VERDICT NF_STOP

4. Protocol family: pf

The pf (protocol family) field identifies the protocol family that the hook handles.

enum {
        NFPROTO_UNSPEC =  0,
        NFPROTO_IPV4   =  2,
        NFPROTO_ARP    =  3,
        NFPROTO_BRIDGE =  7,
        NFPROTO_IPV6   = 10,
        NFPROTO_DECNET = 12,
        NFPROTO_NUMPROTO,
};

5. Hook identifier: hooknum

The hook identifier. All valid identifiers for each protocol family are defined in a header file.

For example, in <linux/netfilter_ipv4.h>:

/* IP Hooks */
/* After promisc drops, checksum checks. */
#define NF_IP_PRE_ROUTING       0
/* If the packet is destined for this box. */
#define NF_IP_LOCAL_IN          1
/* If the packet is destined for another interface. */
#define NF_IP_FORWARD           2
/* Packets coming from a local process. */
#define NF_IP_LOCAL_OUT         3
/* Packets about to hit the wire. */
#define NF_IP_POST_ROUTING      4
#define NF_IP_NUMHOOKS          5

6. Hook priority: priority

The priority of the hook. The valid priority values for each protocol family are defined in a header file.

For example, in <linux/netfilter_ipv4.h>:

enum nf_ip_hook_priorities {
        NF_IP_PRI_FIRST = INT_MIN,
        NF_IP_PRI_CONNTRACK_DEFRAG = -400,
        NF_IP_PRI_RAW = -300,
        NF_IP_PRI_SELINUX_FIRST = -225,
        NF_IP_PRI_CONNTRACK = -200,
        NF_IP_PRI_MANGLE = -150,
        NF_IP_PRI_NAT_DST = -100,
        NF_IP_PRI_FILTER = 0,
        NF_IP_PRI_SECURITY = 50,
        NF_IP_PRI_NAT_SRC = 100,
        NF_IP_PRI_SELINUX_LAST = 225,
        NF_IP_PRI_CONNTRACK_CONFIRM = INT_MAX,
        NF_IP_PRI_LAST = INT_MAX,
};

7. Triggering hook function calls

The hook functions are now stored on their chains. When are they invoked to process packets? Calling all hook functions on a mount point (chain) is triggered through the NF_HOOK macro, which is defined as follows:

// File: include/linux/netfilter.h

#define NF_HOOK(pf, hook, skb, indev, outdev, okfn) \
    (list_empty(&nf_hooks[(pf)][(hook)]) \
     ? (okfn)(skb) \
     : nf_hook_slow((pf), (hook), (skb), (indev), (outdev), (okfn)))

NF_HOOK macro parameters:

  • pf: the protocol family, which is the first dimension of the nf_hooks array. For IPv4 it is PF_INET.
  • hook: the chain (mount point) whose hook functions are to be called, e.g. NF_IP_PRE_ROUTING.
  • skb: the packet (socket buffer) being processed.
  • indev: the device object that received the packet.
  • outdev: the device object that will send the packet.
  • okfn: the function called to continue processing the packet once all hook functions on the chain have processed it.

The implementation of the NF_HOOK macro is also relatively simple. It first checks whether the hook function list is empty. If it is empty, the macro calls okfn directly to process the packet. Otherwise it calls nf_hook_slow to process the packet. Let's look at the implementation of the nf_hook_slow function:

// File: net/core/netfilter.c

int nf_hook_slow(int pf, unsigned int hook, struct sk_buff *skb,
                 struct net_device *indev, struct net_device *outdev,
                 int (*okfn)(struct sk_buff *))
{
    struct list_head *elem;
    unsigned int verdict;
    int ret = 0;

    elem = &nf_hooks[pf][hook]; // Get the list of hook functions to call

    // Iterate through the list of hook functions and call each hook function to process the packet
    verdict = nf_iterate(&nf_hooks[pf][hook], &skb, hook, indev, outdev, &elem, okfn);
    ...
    // If the result is NF_ACCEPT, the packet passed all hook functions, so okfn is called to continue processing it
    // If the result is NF_DROP, the packet was rejected and should be discarded
    switch (verdict) {
    case NF_ACCEPT:
        ret = okfn(skb);
        break;
    case NF_DROP:
        kfree_skb(skb);
        ret = -EPERM;
        break;
    }

    return ret;
}

The implementation of the nf_hook_slow function is also relatively simple:

  • First, nf_iterate is called to walk the list of hook functions and call each hook function on the list to process the packet.
  • If the result is NF_ACCEPT, the packet passed all the hook functions, so okfn is called to continue processing the packet.
  • If the result is NF_DROP, the packet did not pass the hook functions and should be discarded.
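The control flow of NF_HOOK and nf_hook_slow can be modelled in user space. The sketch below makes several assumptions: it only distinguishes NF_ACCEPT from everything else (the real code also handles verdicts such as NF_STOLEN, NF_QUEUE and NF_REPEAT), it uses an array instead of a linked list, and it returns -1 where the kernel frees the packet and returns -EPERM. Every name with a `sketch` or `SK_`/`sk_` marker is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict values as listed in <linux/netfilter.h> above. */
#define SK_NF_DROP   0
#define SK_NF_ACCEPT 1

/* Hypothetical packet and hook types for this user-space model. */
struct sk_packet { int dport; int okfn_called; };
typedef unsigned int sk_hookfn(struct sk_packet *pkt);

/* Run the hooks in order and stop at the first verdict that is not ACCEPT. */
unsigned int iterate_sketch(sk_hookfn *const *hooks, size_t n,
                            struct sk_packet *pkt)
{
    for (size_t i = 0; i < n; i++) {
        unsigned int verdict = hooks[i](pkt);
        if (verdict != SK_NF_ACCEPT)
            return verdict;
    }
    return SK_NF_ACCEPT;
}

/* Model of NF_HOOK plus nf_hook_slow: an empty chain calls okfn directly,
 * otherwise okfn runs only if every hook accepted the packet. */
int hook_slow_sketch(sk_hookfn *const *hooks, size_t n,
                     struct sk_packet *pkt, int (*okfn)(struct sk_packet *))
{
    assert(pkt != NULL && okfn != NULL);
    if (n == 0)
        return okfn(pkt);
    if (iterate_sketch(hooks, n, pkt) == SK_NF_ACCEPT)
        return okfn(pkt);
    return -1; /* stands in for freeing the packet and returning -EPERM */
}

/* Example hooks and continuation function, all hypothetical. */
unsigned int accept_all_sketch(struct sk_packet *pkt)
{
    (void)pkt;
    return SK_NF_ACCEPT;
}

unsigned int drop_telnet_sketch(struct sk_packet *pkt)
{
    return pkt->dport == 23 ? SK_NF_DROP : SK_NF_ACCEPT;
}

int okfn_sketch(struct sk_packet *pkt)
{
    pkt->okfn_called = 1;
    return 0;
}
```

With a chain of `accept_all_sketch` followed by `drop_telnet_sketch`, a packet to port 22 reaches `okfn_sketch`, while a packet to port 23 is stopped by the second hook and never reaches it.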

Since Netfilter runs the hook functions on a chain by calling the NF_HOOK macro, where does the kernel invoke that macro?

For example, the NF_HOOK macro is called in the ip_rcv function to process packets when they enter the IPv4 protocol layer. The code is as follows:

// File: net/ipv4/ip_input.c

int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt)
{
    ...
    return NF_HOOK(PF_INET, NF_IP_PRE_ROUTING, skb, dev, NULL, ip_rcv_finish);
}

As shown in the code above, the ip_rcv function calls the NF_HOOK macro to process incoming packets, and the chain (mount point) it runs is NF_IP_PRE_ROUTING. okfn is set to ip_rcv_finish, which means that once all the hook functions on the NF_IP_PRE_ROUTING chain have accepted the packet, ip_rcv_finish is called to continue processing it.


3. Netfilter application case

The following is a kernel module demo found online. It prints the source MAC address, destination MAC address, source IP address, and destination IP address of packets passing through the NF_INET_LOCAL_IN node of the IPv4 network layer.

The code looks like this:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/udp.h>
#include <linux/tcp.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>


MODULE_LICENSE("GPLv3");
MODULE_AUTHOR("SHI");
MODULE_DESCRIPTION("Netfliter test");

static unsigned int
nf_test_in_hook(unsigned int hook, struct sk_buff *skb, const struct net_device *in,
                const struct net_device *out, int (*okfn)(struct sk_buff*));

static struct nf_hook_ops nf_test_ops[] __read_mostly = {
  {
    .hook = nf_test_in_hook,
    .owner = THIS_MODULE,
    .pf = NFPROTO_IPV4,
    .hooknum = NF_INET_LOCAL_IN,
    .priority = NF_IP_PRI_FIRST,
  },
};

void hdr_dump(struct ethhdr *ehdr) {
    printk("[MAC_DES:%x,%x,%x,%x,%x,%x" 
           "MAC_SRC: %x,%x,%x,%x,%x,%x Prot:%x]\n",
           ehdr->h_dest[0],ehdr->h_dest[1],ehdr->h_dest[2],ehdr->h_dest[3],
           ehdr->h_dest[4],ehdr->h_dest[5],ehdr->h_source[0],ehdr->h_source[1],
           ehdr->h_source[2],ehdr->h_source[3],ehdr->h_source[4],
           ehdr->h_source[5],ehdr->h_proto);
}

#define NIPQUAD(addr) \
        ((unsigned char *)&addr)[0], \
        ((unsigned char *)&addr)[1], \
        ((unsigned char *)&addr)[2], \
        ((unsigned char *)&addr)[3]
#define NIPQUAD_FMT "%u.%u.%u.%u"

static unsigned int
nf_test_in_hook(unsigned int hook, struct sk_buff *skb, const struct net_device *in,
                const struct net_device *out, int (*okfn)(struct sk_buff*)) {
  struct ethhdr *eth_header;
  struct iphdr *ip_header;
  eth_header = (struct ethhdr *)(skb_mac_header(skb));
  ip_header = (struct iphdr *)(skb_network_header(skb));
  hdr_dump(eth_header);
  printk("src IP:'"NIPQUAD_FMT"', dst IP:'"NIPQUAD_FMT"' \n",
         NIPQUAD(ip_header->saddr), NIPQUAD(ip_header->daddr));
  return NF_ACCEPT;
}

static int __init init_nf_test(void) {
  int ret;
  ret = nf_register_hooks(nf_test_ops, ARRAY_SIZE(nf_test_ops));
  if (ret < 0) {
    printk("register nf hook fail\n");
    return ret;
  }
  printk(KERN_NOTICE "register nf test hook\n");
  return 0;
}

static void __exit exit_nf_test(void) {
  nf_unregister_hooks(nf_test_ops, ARRAY_SIZE(nf_test_ops));
}

module_init(init_nf_test);
module_exit(exit_nf_test);

Result of running dmesg | tail after loading the module:

[452013.507230] [MAC_DES:70,f3,95,e,42,faMAC_SRC: 0,f,fe,f6,7c,13 Prot:8]
[452013.507237] src IP:'10.6.124.55', dst IP:'10.6.124.54'
[452013.944960] [MAC_DES:70,f3,95,e,42,faMAC_SRC: 0,f,fe,f6,7c,13 Prot:8]
[452013.944968] src IP:'10.6.124.55', dst IP:'10.6.124.54'
[452014.960934] [MAC_DES:70,f3,95,e,42,faMAC_SRC: 0,f,fe,f6,7c,13 Prot:8]
[452014.960941] src IP:'10.6.124.55', dst IP:'10.6.124.54'
[452015.476335] [MAC_DES:70,f3,95,e,42,faMAC_SRC: 0,f,fe,f6,7c,13 Prot:8]
[452015.476342] src IP:'10.6.124.55', dst IP:'10.6.124.54'
[452016.023311] [MAC_DES:70,f3,95,e,42,faMAC_SRC: 0,f,fe,f6,7c,13 Prot:8]
[452016.023318] src IP:'10.6.124.55', dst IP:'10.6.124.54'

The demo is a kernel module whose entry point is the init_nf_test function passed to module_init.

In init_nf_test, the module registers its custom nf_test_ops at the hook node through the nf_register_hooks interface provided by Netfilter. nf_test_ops is an array of struct nf_hook_ops that contains all the key elements, such as the hook node to register at (NF_INET_LOCAL_IN) and the hook function (nf_test_in_hook).

Inside nf_test_in_hook, the module inspects each incoming packet and prints its source MAC address, destination MAC address, source IP address, and destination IP address. Finally, it returns NF_ACCEPT, which passes the packet on to the next hook function for processing.
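The address formatting that the demo's NIPQUAD macro performs can be tried outside the kernel. The sketch below is a user-space illustration under one assumption: the 32-bit value is stored in network byte order, as iphdr->saddr and iphdr->daddr are. `format_ipv4_sketch` is a hypothetical helper name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format an IPv4 address held in network byte order as dotted decimal.
 * Like NIPQUAD, it reads the four bytes in memory order, so the result
 * does not depend on the host's endianness.
 * Returns the value of snprintf (characters needed, excluding the NUL). */
int format_ipv4_sketch(uint32_t addr_be, char *buf, size_t len)
{
    assert(buf != NULL);
    unsigned char b[4];

    memcpy(b, &addr_be, sizeof(b)); /* copy the bytes in memory order */
    return snprintf(buf, len, "%u.%u.%u.%u", b[0], b[1], b[2], b[3]);
}
```

Reading the bytes one at a time is the reason the demo can print saddr and daddr without first converting them with ntohl().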


4. Linux flow control

Most Linux flow control is achieved with Netfilter. For more detailed documentation, see the Linux Advanced Routing & Traffic Control HOWTO and its condensed version, the Traffic Control HOWTO.


5. Read more

Monitoring and Tuning the Linux Networking Stack: Sending Data

Linux Netfilter and Traffic Control

Netfilter and iptables homepage

Illustrates the process of sending Linux network packets

Network Foundation – seven layer model

OSI seven-layer model and TCP/IP five-layer model

Linux Network layer packet receiving and sending process and Netfilter framework analysis

Principle of Netfilter & Iptables

Netfilter & Iptables implementation (1) – Netfilter implementation