Application scenarios

  • General-purpose HTTP/RPC interfaces
  • General-purpose scheduled tasks and queue consumer tasks
  • Capable of handling most business logic

Scenarios where Node.js has an advantage

  • BFF (Backend for Frontend) "glue layer"
  • SSR (server-side rendering)
  • Isomorphic web applications
  • Real-time Communication Service (WebSocket)

Stability and performance

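Stability metric: SLA (allowed downtime per year, out of 8760 hours)

  • Three nines, 99.9%: 8760 × 0.1% = 8760 × 0.001 = 8.76 hours
  • Four nines, 99.99%: 8760 × 0.01% = 8760 × 0.0001 = 0.876 hours ≈ 52.6 minutes
  • Five nines, 99.999%: 8760 × 0.001% = 8760 × 0.00001 = 0.0876 hours ≈ 5.26 minutes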
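The downtime budgets above follow from one multiplication, which can be sketched as below. The function name `allowedDowntimeHours` is made up for this illustration, and the year is assumed to have 8760 hours, as in the list.

```javascript
// Hours in a non-leap year, matching the 8760 used in the list above.
const HOURS_PER_YEAR = 365 * 24;

// Returns the yearly downtime budget, in hours, for an availability
// target given as a percentage (for example 99.9).
function allowedDowntimeHours(availabilityPercent) {
  const unavailableFraction = (100 - availabilityPercent) / 100;
  return HOURS_PER_YEAR * unavailableFraction;
}

for (const target of [99.9, 99.99, 99.999]) {
  const hours = allowedDowntimeHours(target);
  console.log(`${target}%: ${hours.toFixed(4)} h = ${(hours * 60).toFixed(2)} min`);
}
```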

Performance statistics:

  • Response time (RTT)
  • Processing capacity per unit time (QPS/TPS)
  • Concurrency
  • Error rate

Resource metrics:

  • CPU Load
  • Memory Usage
  • FD Count
  • Disk Read/Write
  • Network Send/Recv

Stability guarantee

Catch exceptions at the process level

Listen for both the unhandledRejection and uncaughtException events on process, so that errors missed everywhere else are still caught
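A minimal sketch of such process-level handlers follows. The helper name `installProcessHandlers` and its injectable `log` and `exit` parameters are assumptions of this example, added only so the behaviour can be tried without killing the process.

```javascript
// Installs last-resort handlers on the current process.
// `log` and `exit` are hypothetical injection points for this sketch.
function installProcessHandlers({
  log = console.error,
  exit = (code) => process.exit(code),
} = {}) {
  process.on('unhandledRejection', (reason) => {
    // A promise was rejected and nothing handled the rejection.
    log('unhandledRejection:', reason);
  });

  process.on('uncaughtException', (err) => {
    // Application state may be inconsistent after an uncaught exception,
    // so this sketch logs and exits, leaving the restart to a supervisor.
    log('uncaughtException:', err);
    exit(1);
  });
}
```

Calling `installProcessHandlers()` once at startup is enough. Exiting after uncaughtException is a design choice of this sketch, not something the text above prescribes.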

Automatic restart

Using the cluster module, the primary process listens for a child process exiting and automatically forks a replacement. In a multi-process architecture, this means that when a single process crashes, a new one can be started quickly through fork, avoiding service downtime.
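The restart loop can be sketched roughly as follows. `superviseWorkers` and its `clusterImpl` parameter are names invented for this example; the parameter defaults to the real cluster module and exists only so the logic can be tried with a stub.

```javascript
const cluster = require('node:cluster');

// Forks `workerCount` workers and forks a replacement whenever one exits.
// `clusterImpl` is a hypothetical injection point for this sketch.
function superviseWorkers(workerCount, clusterImpl = cluster) {
  for (let i = 0; i < workerCount; i++) {
    clusterImpl.fork();
  }
  clusterImpl.on('exit', (worker, code, signal) => {
    console.error(
      `worker ${worker.process.pid} exited (code=${code}, signal=${signal}), restarting`
    );
    clusterImpl.fork();
  });
}

// Typical wiring in an entry file (not executed here):
//
//   if (cluster.isPrimary) {
//     superviseWorkers(require('node:os').cpus().length);
//   } else {
//     require('./server'); // hypothetical worker entry point
//   }
```

A real supervisor would probably also delay or cap restarts so that a worker that crashes on startup does not cause a tight fork loop.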

Health check

Proactively probe the server at regular intervals to check that it is still healthy.
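One way to sketch a periodic check is shown below. The names `createHealthTracker` and `startHealthCheck`, the failure threshold, and the `probe` callback are all assumptions of this example; in practice the probe might request a health endpoint of the service.

```javascript
// Tracks consecutive probe failures and reports when the threshold is reached.
function createHealthTracker(maxFailures) {
  let failures = 0;
  return {
    record(ok) {
      failures = ok ? 0 : failures + 1;
      return failures >= maxFailures ? 'unhealthy' : 'healthy';
    },
  };
}

// Runs `probe` (an async function resolving to true or false) every
// `intervalMs` and calls `onUnhealthy` while the tracker reports unhealthy.
function startHealthCheck(probe, { intervalMs = 5000, maxFailures = 3, onUnhealthy }) {
  const tracker = createHealthTracker(maxFailures);
  const timer = setInterval(async () => {
    let ok = false;
    try {
      ok = await probe();
    } catch {
      ok = false; // a throwing probe counts as a failure
    }
    if (tracker.record(ok) === 'unhealthy') onUnhealthy();
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for health checks
  return () => clearInterval(timer); // call to stop checking
}
```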

Node.js metrics tools

Get metrics using the Node.js API

CPU Usage

  • user cpu time
  • system cpu time

Memory Usage

  • heapTotal/heapUsed
  • external
  • arrayBuffers
  • rss

IO Usage

  • fsRead/fsWrite
  • ipcSent/ipcReceived
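The metrics listed above map onto three built-in calls: `process.cpuUsage()`, `process.memoryUsage()`, and `process.resourceUsage()`. A small sketch that gathers them into one object follows; the function name `collectMetrics` and the shape of the returned object are choices made for this example.

```javascript
// Collects the CPU, memory, and IO figures listed above from the
// built-in process APIs.
function collectMetrics() {
  const cpu = process.cpuUsage();      // microseconds since process start
  const mem = process.memoryUsage();   // bytes
  const res = process.resourceUsage(); // includes file system and IPC counters
  return {
    cpu: { user: cpu.user, system: cpu.system },
    memory: {
      heapTotal: mem.heapTotal,
      heapUsed: mem.heapUsed,
      external: mem.external,
      arrayBuffers: mem.arrayBuffers,
      rss: mem.rss,
    },
    io: {
      fsRead: res.fsRead,
      fsWrite: res.fsWrite,
      ipcSent: res.ipcSent,
      ipcReceived: res.ipcReceived,
    },
  };
}

console.log(collectMetrics());
```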

Get service container metrics from system commands

  • top/htop

  • lsof -p xxx

  • vmstat 1

  • iostat 1

These commands make it easy to obtain and monitor memory and other resource usage, so you can keep watching the server's stability and react quickly

Data reporting and visualization

Build real-time reporting of common metrics into the service framework or container image, collect the data, and display it on a visual dashboard, which helps us keep track of the current service status

  • Example: a Grafana dashboard of instrumented metrics

Release deployment

Early service deployment workflow

  • Purchase or lease a server/public IP address

  • Install the operating system, set up the internal network environment, and a series of infrastructure tools

  • Upload the production code package through FTP or rsync

  • Run the startup command in the corresponding path

    • Early setups used nohup to keep the process running as a daemon


  • Purchase a domain name, configure DNS, and reverse proxy to the service

IaaS: The early stage of cloud computing

Infrastructure as a Service (IaaS) is a model in which a cloud service vendor provides consumers with processing, storage, networking, and other basic computing resources, on which they can deploy and run arbitrary software such as operating systems and applications.

IaaS is the lowest layer of cloud services and provides basic resources. Users can provision processing, storage, network, and other basic computing resources on demand without purchasing physical servers or network devices. They cannot manage or control the underlying infrastructure, but they do control the operating systems, storage, and deployed applications.

Representative products: virtual hosts / VPS

  • AWS EC2
  • Aliyun ECS
  • Tencent Cloud server

Technical implementation

  • Virtual machines (KVM / OpenVZ / Hyper-V)
  • OpenStack
  • Docker

PaaS: the mainstream model for hosting and delivering applications

Platform as a Service (PaaS) is a cloud computing service that provides computing platforms and solutions.

PaaS provides a software deployment platform (runtime) that abstracts away hardware and operating system details and allows for seamless scaling. Developers only need to focus on their own business logic, not the underlying layers.

Representative PaaS products

  • Google AppEngine
  • Heroku
  • AWS Elastic Beanstalk
  • Vercel

Technical implementation

  • Docker Swarm

  • Kubernetes

    • Service orchestration
    • Elastic scaling (scale out and scale in)
    • ...

PaaS-based release process

Most PaaS platforms support running Node.js services

The application is written according to the Node.js runtime specification provided by the PaaS platform

This typically covers a build script (npm install), a start script (npm start), and an application configuration file (app.yml)

With Vercel, you can bind a Git repository and release directly from it

When the PaaS platform does not meet the application's requirements, you can use a Dockerfile to customize special features and the start command. Heroku, for example, supports publishing a container from its CLI

PaaS and DevOps

DevOps (a portmanteau of Development and Operations) is a culture, movement, or practice that values communication and cooperation between software developers (Dev) and IT operations staff (Ops). By automating software delivery and architecture changes, it makes building, testing, and releasing software faster, more frequent, and more reliable.

Modern PaaS platforms provide a basic DevOps workflow: by binding a Git branch, they automatically integrate and release to a preview environment, which greatly simplifies taking a release through testing and into production.

PaaS and autoscaling

Thanks to the capabilities of Kubernetes and the common PaaS application runtime, modern PaaS services reduce operations and service costs. You define the performance of an instance, and the platform quickly adds instances when requests surge or CPU/memory is tight, and removes instances when requests are low and resources are abundant
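As an illustration of such a scaling decision, the sketch below uses the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current × currentMetric / targetMetric). Whether a particular PaaS uses exactly this rule is an assumption, and the function name, parameter names, and the `min`/`max` bounds are made up for the example.

```javascript
// Proportional scaling rule, clamped to [min, max].
// `currentMetric` and `targetMetric` share a unit, for example
// average CPU utilisation in percent.
function desiredInstances({ current, currentMetric, targetMetric, min = 1, max = 10 }) {
  const raw = Math.ceil(current * (currentMetric / targetMetric));
  return Math.min(max, Math.max(min, raw));
}

// Load is 1.5x the target, so 4 instances become 6.
console.log(desiredInstances({ current: 4, currentMetric: 90, targetMetric: 60 }));
// Load is a quarter of the target, so 4 instances shrink to 1.
console.log(desiredInstances({ current: 4, currentMetric: 15, targetMetric: 60 }));
```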

To scale quickly, the runtime must monitor performance indicators and report the data

Serverless concept

“Serverless computing is a cloud computing execution model in which the cloud provider allocates machine resources on demand, taking care of the servers on behalf of their customers.”

Serverless: FaaS and BaaS

  • FaaS (Lambda): "Functions as a Service" is an event-driven execution model that runs in stateless containers, where functions rely on services to manage server-side logic and state. It allows developers to build, run, and manage application packages as functions without having to maintain their own infrastructure.

    • AWS Lambda
    • Google Cloud Functions
    • Aliyun Function Compute (FC)
    • Tencent Cloud Function
  • BaaS: Backend as a Service (BaaS) lets developers focus on the front end of an application without having to build or maintain back-end services.

    • Google Firebase

Limitations of the FaaS implementation (Lambda)

The pay-per-use billing principle of FaaS is difficult to implement with traditional container deployment solutions

  • Container cold start plus service start takes seconds
  • Resident instances plus a standby mode keep the first request fast, but then pay-per-use billing can no longer be honored

Node.js can implement "high-density deployment" based on the vm module

  • How are functions isolated from each other?
  • How are runaway (infinite) loops reclaimed?
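Both questions can be explored with a small experiment on the built-in vm module. The helper `runIsolated` and its `timeoutMs` option are names chosen for this sketch; it only shows that each call gets its own global object and that a synchronous loop can be interrupted by the `timeout` option.

```javascript
const vm = require('node:vm');

// Runs a piece of function source in a fresh context with a time limit.
// The source communicates its output by assigning to `result`.
function runIsolated(source, { timeoutMs = 50 } = {}) {
  const sandbox = { result: undefined }; // becomes the context's global object
  vm.runInNewContext(source, sandbox, { timeout: timeoutMs });
  return sandbox.result;
}

// A global set by one function is not visible to the next one.
runIsolated('globalThis.secret = 42; result = 1');
console.log(runIsolated('result = typeof secret'));

// A synchronous infinite loop is interrupted once the timeout elapses.
try {
  runIsolated('while (true) {}');
} catch (err) {
  console.log(err.code);
}
```

The Node.js documentation states that the vm module is not a security mechanism, so this kind of separation should not be relied on for running untrusted code. That is one reason the isolation question above stays open.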

Alternative: WASM/V8 Worker instead of Node.js

  • Deno Deploy
  • Cloudflare Workers
  • WasmEdge

FaaS vs PaaS

FaaS vs. PaaS in terms of development experience

  • The function model is too simple
  • Writing many separate cloud functions is not engineering friendly
  • Application developers would rather write and publish a complete Node.js web app

Jamstack and Vercel's approach: split the PaaS application into static resources on a CDN plus several FaaS functions

Monitoring and operations

Logging, instrumentation, and monitoring alerts

Logs

  • process.stdout/process.stderr
  • Send through a UDP socket
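Both output paths can be sketched with built-in modules only. The JSON line format, the function names, and the collector `host` and `port` are assumptions of this example, not something the text above specifies.

```javascript
const dgram = require('node:dgram');

// Formats one log record as a single JSON line (format chosen for this sketch).
function formatLogLine(level, message, fields = {}) {
  return JSON.stringify({ time: new Date().toISOString(), level, message, ...fields }) + '\n';
}

// Path 1: write to the standard streams and let the platform collect them.
function logToStdout(level, message, fields) {
  const stream = level === 'error' ? process.stderr : process.stdout;
  stream.write(formatLogLine(level, message, fields));
}

// Path 2: send each record as a UDP datagram to a log collector.
// `host` and `port` identify a hypothetical collector.
function createUdpLogger(host, port) {
  const socket = dgram.createSocket('udp4');
  socket.unref(); // do not keep the process alive just for logging
  return (level, message, fields) => {
    socket.send(Buffer.from(formatLogLine(level, message, fields)), port, host);
  };
}
```

UDP delivery is fire-and-forget, so records can be lost. That trade-off keeps logging from blocking or failing the request path.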

Instrumentation / alerting

  • Metrics
  • Span
  • Trace

Online Troubleshooting

Preparation: remove the instance from the cluster

Before diagnosing, take the instance out of the cluster so that external users are not affected

Node.js Inspector

Node.js provides the Inspector module, which enables debugging of running services

The Inspector can also capture a heap snapshot or a CPU profile at runtime to troubleshoot CPU and memory problems
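A sketch of capturing a CPU profile from inside a running process with the inspector module follows. The wrapper name `captureCpuProfile` and the promise-based shape are choices made for this example; the `Profiler.enable`, `Profiler.start`, and `Profiler.stop` messages belong to the inspector protocol.

```javascript
const inspector = require('node:inspector');

// Profiles the current process for `durationMs` milliseconds and resolves
// with the CPU profile object.
function captureCpuProfile(durationMs) {
  return new Promise((resolve, reject) => {
    const session = new inspector.Session();
    session.connect();
    const fail = (err) => {
      session.disconnect();
      reject(err);
    };
    session.post('Profiler.enable', (enableErr) => {
      if (enableErr) return fail(enableErr);
      session.post('Profiler.start', (startErr) => {
        if (startErr) return fail(startErr);
        setTimeout(() => {
          session.post('Profiler.stop', (stopErr, result) => {
            if (stopErr) return fail(stopErr);
            session.disconnect();
            resolve(result.profile);
          });
        }, durationMs);
      });
    });
  });
}
```

The resolved profile can be serialized with `JSON.stringify` and saved to a `.cpuprofile` file for loading into Chrome DevTools.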

strace and tcpdump: more general system diagnostic tools

  • tcpdump captures the data actually transmitted over the network, which is useful when developing web servers
  • strace prints the parameters and return value of every syscall between an application and the kernel, and is a general-purpose tool for understanding system calls

strace: a tool for inspecting syscalls

Write an HTTP server, start it, and then attach with strace -p to view and analyze its system calls
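A minimal server for this experiment might look like the sketch below. The function names and the port passed to `start` are arbitrary choices for the example; `start` prints the process id so it can be passed to strace -p.

```javascript
const http = require('node:http');

// Builds a server that answers every request with a short text body.
function createServer() {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('hello\n');
  });
}

// Starts listening and prints the pid to attach to.
function start(port) {
  return createServer().listen(port, () => {
    console.log(`pid ${process.pid} listening on port ${port}`);
  });
}

// Call start(3000) to run the experiment, then send requests while
// strace is attached to the printed pid.
```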

tcpdump: a general-purpose packet capture tool

tcpdump is a cross-platform packet capture tool that lets you see every packet transmitted on a network device. It is available on Windows, macOS, and Linux

Common filter expressions

  • host / net: match a host, IP address, or network
  • port: match a port
  • dst and src: specify whether a rule applies to the destination or the source of a packet
  • and / or: logical operators for combining multiple filter rules

tcpdump: viewing captured packets in Wireshark

tcpdump can write captured packets to a file, which can then be opened and inspected in Wireshark