5G is opening a new chapter in the transformation of the industrial Internet, and the industry broadly agrees on the need to promote integrated 5G applications. The latest White Paper on 5G Intelligent Networks released by GTI emphasizes that network intelligence is an indispensable capability for the efficient, high-quality construction, deployment and operation of 5G networks. How to provide users with higher-quality and more secure communication services has become an important issue for operators and for society as a whole.
New challenges in the operation and maintenance of the 5G core network
The 5G core network (_5GCOR_) is an important part of 5G construction for telecom operators. Its new technologies bring new challenges in network deployment, network functions and new service development. In the 4G core network (_EPC, Evolved Packet Core_), network elements run on proprietary devices and are tightly bound to hardware. The 5G core network instead adopts the service-based architecture (_SBA, Service Based Architecture_) and incorporates cloud-native and microservice design ideas, so that the core network is built in a software-based, modular and service-oriented way. Operation and maintenance assurance for this new core network faces the following challenges:
The decoupling of network functions increases the number of monitored objects
According to the 3GPP definition, each network function (_NF, Network Function_) of the 5G core network is decoupled at the functional level and split into several independent network function services (_NFS, Network Function Service_). These network function services operate independently, expose standardized service interfaces, and implement network functions by calling one another. In the 5G core network, the combination of virtualization and cloud-native technology lets general-purpose servers replace proprietary hardware devices. At the same time, the number of virtual network elements, virtual machines and container PODs grows rapidly, and each workload provides multiple IPv4 and IPv6 working planes at the same time.
Compared with the 4G EPC, the combined effect of these changes is that the number of NFS instances after virtualization in the 5G core network SBA increases by more than two orders of magnitude. This huge number of objects to monitor is the first challenge for 5G core network assurance.
Service automation makes tracking more difficult
Through the network function repository (_NRF, NF Repository Function_), the network function services of the 5G core network can be managed automatically: services are discovered, registered, updated and status-checked without the large amount of manual configuration that service access would otherwise require. A centralized control plane turns a large volume of cross-regional signaling into traffic inside the data center and reduces signaling processing delay. As business applications change, network functions and services can be scaled out and in on demand, which improves the network's service response speed. Automated management improves efficiency on the production side, but it also makes the environment dynamic and hard to track, which is a new challenge on the core network security side.
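The register, update, status-check and discover cycle described above can be illustrated with a minimal in-memory sketch. It is not the NRF API defined by 3GPP: the names `NFProfile` and `ToyRegistry`, the 30-second heartbeat timeout and the explicit `now` argument are all assumptions made for illustration only.

```python
import time
from dataclasses import dataclass, field


@dataclass
class NFProfile:
    """Hypothetical, simplified stand-in for the profile an NF instance registers."""
    instance_id: str
    nf_type: str  # for example "AMF", "UDM" or "SMF"
    address: str
    last_heartbeat: float = field(default_factory=time.monotonic)


class ToyRegistry:
    """Illustrative registry: register, heartbeat (status update), discover."""

    def __init__(self, heartbeat_timeout=30.0):
        self.heartbeat_timeout = heartbeat_timeout
        self.instances = {}  # instance_id -> NFProfile

    def register(self, profile):
        # Registering again with the same instance_id acts as an update.
        self.instances[profile.instance_id] = profile

    def deregister(self, instance_id):
        self.instances.pop(instance_id, None)

    def heartbeat(self, instance_id, now=None):
        # Record that the instance is still alive.
        now = time.monotonic() if now is None else now
        self.instances[instance_id].last_heartbeat = now

    def discover(self, nf_type, now=None):
        # Return only instances of the requested type with a recent heartbeat.
        now = time.monotonic() if now is None else now
        return [
            p for p in self.instances.values()
            if p.nf_type == nf_type
            and now - p.last_heartbeat <= self.heartbeat_timeout
        ]
```

Because instances appear and disappear from discovery results without any manual configuration change, a monitoring system cannot rely on a static inventory, which is the tracking difficulty this section describes.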
Path optimization and interaction decoupling increase monitoring complexity
Communication between network elements in the 4G core network follows a point-to-point requester and responder mode, a traditional tightly coupled pattern. Under the service-based architecture of the 5G core network, network functions and services can communicate on demand. The communication mechanism between them is further decoupled into a producer and consumer mode, which is flexible, orchestratable, decoupled and open, and is an important basic capability for rapidly meeting the needs of vertical industries in the 5G era. In actual use, network functions avoid unnecessary network transit, but call dependencies between services, access tracking, performance analysis and fault location have become new challenges for operation and maintenance support.
The DeepFlow network function service monitoring scheme for the 5G core network in practice
DeepFlow is a software product for the 5G core network that captures and analyzes the traffic exchanged between NF services in order to ensure stable operation of the core network. By processing logic, the overall scheme can be divided into three parts: traffic acquisition, data distribution and transmission, and diagnosis and analysis. An abstraction layer for traffic collection and preprocessing exposes a northbound management interface, so that the monitoring platform as a whole can extend its basic data acquisition capabilities.
In the 5G core network environment, traffic acquisition mainly involves KVM virtual machines and container PODs. The DeepFlow network function service monitoring scheme for the 5G core network supports both IPv4 and IPv6 environments and works closely with the HTTP/2 protocol to monitor the dependencies among services. This article draws on an operator's actual 5GC operating environment but, to reduce complexity, presents the scheme using a free5GC environment.
What is free5GC?
The free5GC is an open-source project for 5th generation (5G) mobile core networks. The ultimate goal of this project is to implement the 5G core network (5GC) defined in 3GPP Release 15 (R15) and beyond.
The overall architecture of free5GC is based on the 3GPP standard and follows the SBA framework. It implements network functions through virtualization, and it can run the standard services of the 5G core network and simulate the corresponding workflows. In real 5G environments, most vendors already use container technology to host network function services. In this article, virtual machines run the containers: a Kubernetes cluster is created, a 5G core network verification environment is built, and each network function is enabled. The DeepFlow platform of Spruce Network then provides monitoring assurance for each network service. The components deployed during the practice include controllers, collectors, and data nodes.
Tracking network services from large to small
In 5G core network monitoring practice, the running status and relationships of services are presented step by step, from large to small. The workflow generally has three levels. The largest scope is the region or resource pool of the data center; next comes the network function or service type, such as AMF, UDM or SMF; and finally attention narrows to the individual unit, such as a container POD, host or IP. The DeepFlow platform follows these three levels, from large to small, to provide complete, step-by-step monitoring and tracking of the complex networks involved in the core network. The following figure presents a panoramic view of how the various network function services run and call one another. The calls between network functions over the service-based interface (_SBI, service-based interface_), together with their performance indicators, are drawn and presented automatically.
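The three-level drill-down described above (resource pool, then network function type, then individual POD) can be sketched as a roll-up of the same traffic records at successively finer dimensions. The record fields `region`, `nf`, `pod` and `requests`, and the function name `rollup`, are hypothetical and do not reflect DeepFlow's data model.

```python
from collections import defaultdict


def rollup(flows, dims):
    """Sum request counts grouped by the given dimensions.

    flows: iterable of dicts with hypothetical keys
           "region", "nf", "pod" and "requests".
    dims:  tuple of keys to group by, from coarse to fine.
    """
    totals = defaultdict(int)
    for flow in flows:
        key = tuple(flow[d] for d in dims)
        totals[key] += flow["requests"]
    return dict(totals)


# Invented sample records, for illustration only.
SAMPLE_FLOWS = [
    {"region": "pool-a", "nf": "AMF", "pod": "amf-0", "requests": 120},
    {"region": "pool-a", "nf": "AMF", "pod": "amf-1", "requests": 80},
    {"region": "pool-a", "nf": "UDM", "pod": "udm-0", "requests": 50},
    {"region": "pool-b", "nf": "SMF", "pod": "smf-0", "requests": 30},
]

by_region = rollup(SAMPLE_FLOWS, ("region",))
by_nf = rollup(SAMPLE_FLOWS, ("region", "nf"))
by_pod = rollup(SAMPLE_FLOWS, ("region", "nf", "pod"))
```

Each call narrows the scope by one level while reusing the same underlying records, which mirrors the large-to-small workflow.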
In practice, attention centers on the key indicators between services: at the network layer (throughput, load), the transport layer (concurrency, delay, TCP connections, TCP connection establishment and system delay, TCP retransmissions, connection establishment failures) and the application layer (HTTP requests, HTTP delay and other HTTP indicators). Once the panoramic view of call relationships has been drawn, the knowledge graph function makes it possible to quickly list the information for each corresponding dimension.
Locating the boundary of an anomaly within minutes
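As a rough illustration of how per-call records could be folded into the edges of a call-relationship view, the sketch below aggregates a few of the indicators named above for each caller and callee pair. The record layout `(caller, callee, http_delay_ms, tcp_retransmissions)` and the names `EdgeStats` and `build_call_graph` are assumptions for this example, not part of any product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EdgeStats:
    """Aggregated indicators for one caller -> callee edge."""
    requests: int = 0
    total_delay_ms: float = 0.0
    retransmissions: int = 0

    @property
    def avg_delay_ms(self):
        # Avoid division by zero for an edge with no recorded requests.
        return self.total_delay_ms / self.requests if self.requests else 0.0


def build_call_graph(records):
    """Fold call records into per-edge statistics.

    records: iterable of (caller, callee, http_delay_ms, tcp_retransmissions).
    Returns a dict mapping (caller, callee) to EdgeStats.
    """
    edges = defaultdict(EdgeStats)
    for caller, callee, delay_ms, retrans in records:
        stats = edges[(caller, callee)]
        stats.requests += 1
        stats.total_delay_ms += delay_ms
        stats.retransmissions += retrans
    return dict(edges)
```

A panoramic view would then draw one arrow per key of the returned dictionary and label it with the aggregated values.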
The 5G core network contains a large number of complex calls between NFS instances, so an effective capability for tracking call performance is particularly important.
As shown in the figure above, in a simple logical call, the NFS in the AMF (Access and Mobility Management Function) calls the NFS in the UDM (Unified Data Management) to obtain user information. This is not as straightforward as it would be in a traditional environment. In current 5G networks, the virtualized network generally spans hosts, virtual machines and containers. To meet the operation, maintenance and troubleshooting challenges of this new environment, access calls need to be broken down segment by segment across the full stack. From a full-stack perspective, the call above can be expanded to analyze each link it traverses after the NFS initiates it, such as the POD interface, the virtual machine interface, the host interface and the gateway.
For calls and access between services in the cloud, full-stack tracking expands the virtualized logical communication step by step and clearly displays the network status and performance of each segment. Combined with the knowledge graph and rich metric data, it can quickly locate the boundary of the scope in which performance is abnormal. Taking the access above as an example, when a call-delay fault is investigated between the two NFS endpoints of the call, expanding the full-stack trace directly locates the interface at which the delay arises. In the full-stack tracing example figure, it is clear that the bottleneck in access delay between the AMF service instance and the UDM service instance is on the UDM side, specifically at the virtual network interface of the virtual machine on which the UDM instance runs. The POD network interface of the UDM service instance, and the many interfaces on the AMF side such as its virtual machine and POD, are ruled out.
Without the DeepFlow full-stack tracking tool, checking the performance of service access calls is a confusing, complex and lengthy process. It also requires front-line operation and maintenance personnel to master a broad technology stack and have strong all-round skills, and it can consume valuable maintenance window time.
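One way to reason about segment-by-segment localization is sketched below, under the assumption that the same call is observed at each capture point along the path and that the response delay seen at a point includes everything downstream of it. The difference between two adjacent points then approximates the delay added between them. The capture-point names and delay values are invented, and this is not a description of how DeepFlow computes its results.

```python
def locate_bottleneck(observations):
    """Find the path segment that adds the most delay.

    observations: list of (capture_point, delay_ms) ordered from the caller
    to the callee, where each delay is the response delay seen at that point.
    Returns (upstream_point, downstream_point, added_delay_ms).
    """
    if len(observations) < 2:
        raise ValueError("need at least two capture points")
    worst = None
    for (up, up_delay), (down, down_delay) in zip(observations, observations[1:]):
        added = up_delay - down_delay  # delay contributed between the two points
        if worst is None or added > worst[2]:
            worst = (up, down, added)
    return worst


# Hypothetical trace of one AMF -> UDM call; all values are invented.
SAMPLE_TRACE = [
    ("amf-pod-if", 48.0),
    ("amf-vm-if", 47.5),
    ("amf-host-if", 47.0),
    ("gateway", 46.0),
    ("udm-host-if", 45.5),
    ("udm-vm-if", 45.0),
    ("udm-pod-if", 3.0),
]

segment = locate_bottleneck(SAMPLE_TRACE)
```

With this sample data the largest jump lies between the UDM-side virtual machine interface and the UDM POD interface, so the AMF-side interfaces and the gateway can be ruled out in a single pass.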
Conclusion
The free5GC example above was run in a laboratory environment, where the corresponding test cases were simulated and executed. Compared with the laboratory, real production environments are more complex and larger in scale, which inevitably places higher demands on operation and maintenance assurance. Tests in the actual environment verify that the DeepFlow platform can fill the monitoring and security gap for the 5G core network.
DeepFlow helps the 5G core network collect the network traffic between services in a service-based architecture, monitor the performance of access calls comprehensively, and trace full-stack paths after containerization. It fills the gap in 5G core network service monitoring, fits cloud-native characteristics, and integrates closely with 5G services, in order to solve the monitoring, operation and maintenance, and security problems encountered in running the 5G core network in production.