What is Serverless?

Serverless = no servers? But that is obviously impossible; after all, any program ultimately has to run on a machine somewhere.

To understand Serverless, it helps to first review how a conventional serverful service runs.

The evolution from Serverful to Serverless

The way physical machines are used has hardly changed, whether ten years ago or now. The typical process: procure the hardware, unpack it, power it on, configure RAID, plug in the network cables, configure the switch, run a full configuration check, and along the way verify the quality of the memory, disks, firmware, and so on, because a machine might die after running for two days. The whole process is physical labor.

Virtualization

Feeling that pain prompts engineers to seek change.

Having worked hard to prepare the hardware, must you, as an IT manager, go through the same process all over again every time a developer needs a machine?

It can be said that virtualization liberated IT managers: by running multiple virtual machines on one physical machine, it made deploying and running a “machine” far more convenient, and improved resource utilization for better financial returns.

By virtualizing a set of hardware through the operating system, combined with VM template images, creating and migrating virtual machines on physical machines becomes a matter of minutes.

The heavy, repetitive physical work of the past has disappeared.

Cloud

Single-node virtualization cannot meet large-scale requirements such as scheduling and network virtualization. At that point you can build a public cloud or a private cloud, for example with OpenStack.

You don’t even need to care about the underlying hardware: as long as it uses a common architecture, the operating system, network, storage, and so on can be installed and extended automatically.

However, even in the cloud era, the way applications run is unchanged: the software still sees only a virtual hardware environment. The only difference for consumers is that preparing the infrastructure for the software is faster.

Containers bring change

Although virtualization greatly improves resource utilization and convenience, the wheels of technology roll forward and engineers are always looking ahead. Virtualization still has a “heaviness” problem: images are large, and multiple VMs each carry a full copy of the same operating system, so a physical machine cannot run very many VMs.

Containerization, and especially the popularity of Kubernetes since 2017, is once again bringing change. A container is just a lightweight process: the software provider simply maintains a Dockerfile, generates a much smaller image, and deploys it on a container platform. Apps go live without worries about dependencies, conflicts, and “it works fine here, it must be your environment.”

Serverless goes by the more descriptive name Functions-as-a-Service (FaaS), and I think one reason for that name is that you don’t care about servers at all. You don’t have to think about them; you just execute your code.

Imagine that even with the blessing of containers, application developers still have to focus on problems such as how to set up a REST API framework, how to handle workflows, how to load-balance under pressure, and how to deal with message middleware; they may also have to care about security updates and vulnerability scanning, none of which appears related to the business logic.

In 2019, UC Berkeley published a paper called “Cloud Programming Simplified: A Berkeley View on Serverless Computing” (www2.eecs.berkeley.edu/Pubs/TechRp… The paper contains a very vivid metaphor, described as follows:

In the cloud context, serverful computing is like programming in low-level assembly language whereas serverless computing is like programming in a higher-level language such as Python. An assembly language programmer computing a simple expression such as c = a + b must select one or more registers to use, load the values into those registers, perform the arithmetic, and then store the result. This mirrors several of the steps of serverful cloud programming, where one first provisions resources or identifies available ones, then loads those resources with necessary code and data, performs the computation, returns or stores the results, and eventually manages resource release. The aim and opportunity in serverless computing is to give cloud programmers benefits similar to those in the transition to high-level programming languages.

So what is Serverless, in summary? You care only about the business logic; everything else is handled by the platform and the tools running around it.
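To make that concrete, here is a minimal sketch of what “only the business logic” looks like: a handler in the general style of FaaS platforms. The event and context shapes here are illustrative assumptions, not any specific vendor’s API; routing, scaling, and retries are assumed to be the platform’s job.

```python
import json

def handler(event, context=None):
    """Business logic only: add two numbers from the request body.

    The platform is assumed to handle routing, scaling, and retries;
    the developer writes just this function. The event/context shape
    is a hypothetical example, not a specific vendor's API.
    """
    body = json.loads(event["body"])
    result = body["a"] + body["b"]
    return {"statusCode": 200, "body": json.dumps({"sum": result})}

# Invoking it locally, the way a platform would on each request:
print(handler({"body": json.dumps({"a": 1, "b": 2})})["body"])  # → {"sum": 3}
```

Everything outside this function, such as the HTTP endpoint, concurrency, and fault tolerance, would be the platform’s responsibility.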

Implementations

Each major public cloud has its own implementation, such as AWS Lambda, Alibaba’s BatchCompute, Azure Functions, and so on. However, each has a different usage model, and there is lock-in risk. So what are the options if you want to implement it in a private environment?

Fn Project

Fn Project is an open source project from Oracle. It looks simple and straightforward and can be run with just Docker. The problem is a lack of activity: there seems to have been no new commit for six months.

Kubeless

The project with the most orthodox name, contributed by Bitnami. It is based on native Kubernetes, using custom resources (CRDs). However, due to the influence of Knative, its future is unclear; even the founder has suggested shutting the project down.

Fission

A project contributed by Platform9 that takes advantage of Kubernetes’ rich features and can also deliver better performance where it matters, such as cold start. This is also the author’s second project after Fn Project.

Knative

A project with a distinguished pedigree: open-sourced by Google, with Google, Pivotal, IBM, and Red Hat currently involved, built on Kubernetes and Istio.

Serverless over storage

Dr. Wang Jian of Alibaba’s book “Online” has a particularly good line: demand comes from competition; think of it, then do it; do it, then use it. While working with AI colleagues, we found that squeezing every bit of performance out of the whole workflow brings a big improvement to the overall result, which prompted us to think about how to store and process data more efficiently.

Let’s take the scenario of Computable Storage as an example. AI, as a new data-processing technology, covers four stages: collection, preparation, training, and inference. Each stage is accompanied by data movement and processing.

Data collection stage: data is gathered and stored from different sources, in varying sizes and formats; it is often unstructured data in the form of files.

Data preparation stage: because the data varies in size and format, it must be converted into a unified format before it can be used for AI model training, which means standardizing data of different formats and sizes for the subsequent training stage.

Training stage: AI training workloads are very intensive and usually require high-performance GPUs or accelerators to execute a series of mathematical functions, placing high demands on resources. In practice, the time AI training takes depends heavily on the performance of the deployed storage.

Inference stage: inference is where the trained model is put to the test. The CPU, memory, and storage requirements of the inference infrastructure vary by scenario.

In the data preparation stage, batch tools such as Hadoop are typically used for tasks like data cleaning. With Hadoop compute nodes separated from the distributed storage nodes, the typical process is read, compute, write: data flows out of the storage cluster to the compute nodes and then back into the storage cluster. Could we avoid this data movement by bringing the computation closer to the storage?

Based on a Serverless framework, we can run practical functions on top of YRCloudFile, such as data replication, compression, and decompression, operations that are better suited to running on the storage side.

The following example demonstrates submitting a data-copy request (a function) to the Serverless framework, so that the request is executed automatically on the backend storage without the requester having to care how the backend data is processed.

After creating the function and trigger with the corresponding framework, you can complete the action simply by accessing the corresponding URL.

If you are quick enough to get inside the corresponding function container, you will see the corresponding directory of the storage mounted inside it.

This is just a simple example of copying data. We can write more complex data-processing functions and submit them directly to the Serverless framework, so that the backend data store can optimize and process those operations accordingly. Engineers can quickly implement the functions users require, and even compose complete workflows or pipelines, giving applications more possibilities.