
First, Apollo

Apollo is a fully open autonomous driving platform independently developed by Baidu. It helps partners in the automotive and autonomous driving industries combine vehicles with hardware systems to quickly build complete autonomous driving systems of their own.

Apollo, as a complex autonomous driving system, contains the following important components:

  • Perception: Obtains environmental data around the vehicle through sensors mounted on the body, such as lidar, cameras, and millimeter-wave radar. Using multi-sensor fusion, the perception algorithms can compute the location, category, speed, and heading of traffic participants in real time. Behind this perception system are years of accumulated big data and deep learning technology: massive real road-test data is labeled by professionals and turned into learning samples that machines can understand, while a large-scale deep learning platform and GPU clusters greatly shorten the time spent on offline training over large volumes of data. The latest models are pushed online from the cloud to the in-car brain. This AI- and data-driven approach enables Baidu's autonomous vehicle perception system to continuously improve its detection and recognition capabilities, and to provide accurate, stable, and reliable input for the decision, planning, and control modules.

  • Simulation: An important part of Apollo, the simulation service holds a large amount of real road-condition and autonomous driving scenario data. Built on large-scale cloud computing capacity, it delivers a virtual driving capacity of millions of kilometers per day. With the open simulation services, Apollo partners can access a wide range of autonomous driving scenarios and quickly complete testing, validation, and model optimization in a comprehensive, safe, and efficient manner.

  • High-precision map and positioning: Baidu pioneered the large-scale application of deep learning and artificial intelligence in map data production, and is one of the few high-precision map data providers in China with mass-production capacity. Baidu's self-positioning system, based on GPS, IMU, high-precision maps, and a variety of sensor data, provides centimeter-level comprehensive positioning solutions. It aims to deliver customized software-hardware integrated products for different application scenarios, with controllable cost and adjustable precision.

  • End-to-end: This autonomous driving approach is being explored for its low cost and low engineering complexity. Using large amounts of real road data collected by map-acquisition vehicles, lateral and longitudinal driving models are built entirely with deep learning and rapidly validated on real vehicles. The lateral and longitudinal model source code and 10,000 kilometers of data have been opened.

  • Decision planning: Equipped with a comprehensive prediction, decision, and planning system, Baidu's autonomous vehicles can predict trajectories and plan intelligently according to real-time road conditions, speed limits, and other constraints, balancing safety and comfort while improving driving efficiency. Day-and-night autonomous driving capability on fixed roads is now open.

  • Intelligent control: The control and chassis-interaction system of Baidu's autonomous vehicles is accurate, universal, and adaptive, able to handle different road conditions, speeds, vehicle models, and chassis interaction protocols. Apollo opens trajectory-tracking autonomous driving capability, with control accuracy at the 10 cm level.

  • Data open platform: The Apollo data open platform opens both source code and data, forming a fully open "vehicle + cloud" ecosystem. For developers and partners with strong software and algorithm R&D capability but lacking data accumulation or computing power, it provides a variety of flexible data, computing, and labeling capabilities. By opening up these technologies and resources, we hope to bring together global developers and partners from various industries to build an open autonomous driving ecosystem, empower every participant, and promote the popularization of autonomous driving technology.

  • Reference hardware: On-board hardware is an essential part of autonomous driving. Apollo provides a complete hardware reference for developers worldwide, covering vehicle selection, core hardware selection, and ancillary hardware, along with a detailed hardware installation guide so that developers can complete hardware assembly smoothly; this provides a reliable foundation for software integration and on-road operation.

  • Map Engine: A high-precision map data management service for vehicle-mounted terminals. It encapsulates the organization and management of map data, shields the details of the underlying data, and provides a unified data query interface to application-layer modules. It includes core capabilities such as element retrieval, spatial retrieval, format adaptation, and cache management, and offers a modular, layered, highly customizable, flexible, and efficient programming interface on which users can easily build their own terminal high-precision map solutions.

  • DuerOS: For connected cars, DuerOS provides a full voice interactive intelligent vehicle link solution. It is committed to providing users with one-stop vehicle life services, such as map navigation, intelligent question-and-answer and personalized audio content recommendation, and continuously enhancing the multidimensional capabilities of the vehicle scene through the open platform to empower car enterprises.

  • Security: Built on an isolated and trusted security architecture, Apollo provides a complete security framework and set of security components. The Apollo vehicle security firewall isolates the in-vehicle network from the external network and from the in-vehicle infotainment network, guaranteeing a network-boundary firewall independent of the vehicle itself, so that every instruction entering the vehicle is filtered and only trusted instructions are executed. Through network security components deeply embedded in the kernel, Apollo provides source authentication, confidentiality, and trust for every piece of information generated by the autonomous driving system. Apollo measures and monitors system safety from boot onward, ensuring that every function running on the system is legitimate and reliable, and provides complete OTA security so that no system upgrade can lead to a hacking incident. By securing the network, the OS, the cloud, and OTA, Apollo ensures the safe and orderly operation of all functional components.
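To make the multi-sensor fusion mentioned in the Perception bullet concrete, here is a minimal, self-contained sketch of one classic late-fusion technique, inverse-variance weighting. The function name and numbers are illustrative and are not part of Apollo's actual API.

```python
# Hypothetical sketch of late sensor fusion: each sensor reports a position
# estimate with its own variance, and the fused estimate weights each
# measurement by its inverse variance (more trusted sensors count more).

def fuse_measurements(measurements):
    """Fuse (value, variance) pairs into one (value, variance) estimate."""
    total_weight = 0.0
    weighted_sum = 0.0
    for value, variance in measurements:
        w = 1.0 / variance          # inverse-variance weight
        total_weight += w
        weighted_sum += w * value
    return weighted_sum / total_weight, 1.0 / total_weight

# Example: lidar is precise (variance 0.01), the camera is noisier (0.25),
# so the fused position stays close to the lidar reading.
pos, var = fuse_measurements([(10.0, 0.01), (10.4, 0.25)])
```

The fused variance is always smaller than that of the best single sensor, which is one reason fusing lidar, camera, and radar yields more stable tracks than any sensor alone.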

On July 5, 2017, Baidu's developer conference officially released version 1.0 of the Apollo open platform and simultaneously opened the source code on GitHub. Following the Apollo opening roadmap, Apollo 2.0 has since been released on GitHub; it enables autonomous driving on simple urban roads, with important functions such as cruise control, automatic obstacle avoidance, decisions based on traffic lights, and lane changing. The capabilities Apollo 2.0 has opened up are shown in the figure below.

With more than 70 partners participating in the Apollo program, we can see how developers can meet the needs of different application scenarios through the technology capabilities opened up on the Apollo platform.

Second, PaddlePaddle

PaddlePaddle, a deep learning platform independently developed by Baidu, is a massively parallel distributed deep learning framework that is easy to learn, easy to use, efficient and flexible, and supports business needs in multiple fields such as massive image recognition and classification, machine translation and automatic driving. It is now fully open source.

PaddlePaddle has been applied in many of Baidu's businesses since 2013. After about three years of internal development, it was officially open-sourced in September 2016 and then underwent a period of rapid iterative optimization. The latest version, PaddlePaddle Fluid, released in November 2017, gives PaddlePaddle even more powerful features and advantages:

  • Scalability: Paddle officially supports multiple cluster frameworks, including MPI and Kubernetes, as well as dynamic allocation of GPU cluster resources;

  • High efficiency: PaddlePaddle can run 1-2x faster than mainstream deep learning frameworks while occupying less GPU memory;

  • Flexibility: PaddlePaddle can rapidly scale from single-machine training to large-scale cluster training, because single-machine and multi-card code are exactly the same;

  • Fluid: brings the deep learning development process closer to a high-level programming language, exposes more information about the training process, and greatly helps optimize training;

  • EDL: elastic deep learning, which adjusts the number of distributed tasks according to available computing resources, ensuring that tasks at risk of resource starvation can still be executed;

  • Cloud: developers can submit tasks to run on multiple GPUs across multiple servers without writing or managing distributed programs themselves, greatly expanding the computing resources available to them;
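The "single-machine and multi-card code are exactly the same" claim rests on synchronous data parallelism: the framework shards each batch across devices, computes per-shard gradients, and all-reduces (averages) them, so user code never changes. A framework-neutral sketch (this is not PaddlePaddle's API; names and the toy model y = w·x are made up for illustration):

```python
# Toy synchronous data parallelism: shard the batch, compute gradients per
# "device", then average them -- the averaged gradient equals the full-batch
# gradient when shards are equal-sized.

def gradient(w, xs, ys):
    """d/dw of mean squared error for the linear model y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def data_parallel_step(w, xs, ys, num_devices):
    # Shard the batch round-robin across hypothetical devices.
    shards = [(xs[i::num_devices], ys[i::num_devices])
              for i in range(num_devices)]
    grads = [gradient(w, sx, sy) for sx, sy in shards if sx]
    return sum(grads) / len(grads)   # "all-reduce": average gradients
```

Because only the sharding and the averaging are device-aware, the user-facing training loop looks identical on one card or many.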

In terms of external openness, a deep learning framework alone cannot effectively help the majority of developers train models, so Baidu has fully opened up four key elements to help developers build a closed loop of AI R&D:

  • Big data: Relying on the National Engineering Laboratory, Baidu will work with more third parties to build open data sets and solve data source problems for more developers and enterprises.

  • Computing power: on the engineering platform of the National Engineering Laboratory, free computing resources are provided to qualified enterprises and developers;

  • Algorithms & models: at this level, PaddlePaddle, as mentioned earlier, is fully open source on GitHub;

  • Scenario: Paddle is working with a number of AI technology vendors and companies to open up the application scenarios implemented through Paddle to let people know in which scenarios deep learning can be of value and how to use Paddle to implement such applications;

PaddlePaddle has also received 6K+ stars and 1.6K+ forks on GitHub. More and more developers will use the PaddlePaddle deep learning framework, together with open data and computing resources, to implement their own AI application scenarios.

Third, ECharts

2017 was a breakout year for the data visualization industry in China. As one of the leading products, ECharts completed a large number of product optimizations and released a series of innovative new features during the year, gradually completing its product-matrix layout in the visualization field.

ECharts released 17 versions of its main product in 2017, adding new chart types such as pictorial bar charts, theme river charts, calendar charts, and tree charts. The echarts-stat plug-in was released, making ECharts the first charting product to support built-in statistics capabilities. The customization and extension mechanisms were greatly enriched and enhanced, allowing users to develop more personalized visual presentations and interactions on top of ECharts; where built-in chart types do not suffice, the powerful custom series lets users design special chart renderings to meet their customization needs.

Outside the main release, EChartsX from two years ago represented what we had hoped ECharts would do in 3D; EChartsGL is the product we can actually use in production, and the culmination of more than five years of WebGL research by the team. In addition, ECharts' underlying dependency ZRender gained a major upgrade at the end of the year with support for both Canvas and SVG, allowing ECharts developers to choose the rendering technology that best fits their own chart application scenarios.

In addition to product upgrades, ECharts also actively participates in the industry community, becoming a member of the Visualization and Visualization Analysis Committee of the Chinese Society of Image and Graphics.

In early 2018, ECharts will release its first major overhaul in two years, which it hopes will bring more valuable products and features to developers and users.

Related links:

http://echarts.baidu.com

https://github.com/ecomfe/echarts

Fourth, MIP

MIP (Mobile Instant Pages) is an open technical standard for mobile web pages. It accelerates mobile pages by providing the MIP-HTML specification, the MIP-JS runtime environment, and the MIP-Cache page caching system.

MIP is organized in three main parts:

1. MIP HTML

2. MIP JS

3. MIP Cache

MIP HTML defines a new standard based on basic HTML tags: by restricting some basic tags and extending others, HTML can present richer content. MIP JS ensures fast rendering of MIP HTML pages. MIP Cache caches MIP pages to further improve page performance.

By the end of 2017, the number of web pages in China using MIP technology had reached 1.4 billion, across more than 10,000 sites. Real-world verification shows page access speed improving by 30%~80% after conversion, enabling pages to open in about a second and greatly enhancing the user experience. So how does MIP achieve this?

  • Well-designed JS: To remove bloated client scripts, MIP files do not allow custom JavaScript; for functions that rely heavily on JavaScript, such as advertising, statistics, and interaction, MIP provides packaged components that are compatible with the MIP runtime. The JavaScript reference rules are:

  • Currently, MIP does not allow user-defined JavaScript; custom functionality must be brought in as MIP components to ensure security and performance.

  • mip-iframe can be used to introduce partially rich interactive functionality, so that the rendering of the main page is not affected even if the highly performance-sensitive document.write is used in development;

  • The MIP component is open source, allowing developers to customize functional components, and the project will continue to provide a variety of components to meet different needs;

  • All static resources must be sized: during page development, resources often have no width or height set, especially when injected by ads or through calls to document.write(), so pages are repeatedly redrawn because resource sizes are uncertain. MIP requires all resources (ads, images, audio, and video) to declare their sizes, so all resource dimensions are known immediately and the page layout can be computed before anything actually loads. Loaded resources then render seamlessly, and the user's reading experience is not disturbed by frequent layout changes.

  • No mechanism may block page rendering: any custom script created by the developer must be provided to MIP through MIP tags such as mip-ad and mip-iframe; these mechanisms do not block page layout and rendering.

  • Controlled loading of external resources: the MIP runtime controls the loading of external resources to ensure efficiency, so that what the user wants to read appears on screen as quickly as possible.

  • Encapsulating interaction: MIP advocates web pages that give users a straightforward and simple experience, but this does not mean that MIP limits the vividness and fun of the page. MIP Runtime provides a highly optimized, packaged JavaScript that requires less effort to implement complex interactions.

  • Inline CSS is recommended: CSS loading blocks page rendering, and inlining CSS reduces client overhead.

  • GPU-only animations: MIP allows animations only via transform and opacity, properties that can be executed on the GPU and trigger only layer compositing.

  • MIP Cache: another important benefit of MIP is helping webmasters speed up their pages. Baidu caches MIP pages in the Baidu CDN; any page that meets the MIP standard can use the MIP cache.
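The "all static resources must be sized" rule above is what lets the layout be computed before any bytes arrive. A small sketch of that idea (the function names are illustrative, not part of the MIP runtime):

```python
# With declared width/height on every resource, a single-column page layout
# is fully determined up front, so loading never triggers a reflow.

def placeholder_height(declared_width, declared_height, container_width):
    """Scale a resource to the container width, preserving aspect ratio."""
    return container_width * declared_height / declared_width

def page_height(resources, container_width):
    """Total height of a single-column page of sized resources."""
    return sum(placeholder_height(w, h, container_width)
               for w, h in resources)

# Two images declared 640x360 and 640x480, laid out in a 320px column:
# their heights (180 and 240) are known before a single byte is fetched.
```

Without declared sizes, each resource arrival would change these numbers and force the browser to relayout and repaint the page.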

Related links:

MIP website: https://www.mipengine.org/

GitHub: https://github.com/mipengine/

Fifth, Bigflow

Baidu Bigflow (hereinafter Bigflow) is a computing framework from Baidu. It is committed to providing a simple, easy-to-use set of interfaces for describing users' computing tasks, and enables the same code to run on different execution engines.

Many of its design ideas are borrowed from Google FlumeJava and Google Cloud Dataflow, and some of its interface design is borrowed from Apache Spark. Users can write their logic as if writing a stand-alone program, and Bigflow distributes the computation to the corresponding execution engine.

The goal of Bigflow is to make distributed programs easier to write, easier to test, more efficient to run, easier to maintain, and cheaper to migrate. Bigflow is currently connected with Baidu’s internal batch computing engine DCE (similar to Tez), the iterative engine Spark, and the company’s internal streaming computing engine Gemini.

Bigflow has the following advantages:

  • High performance: Bigflow's interface design lets it perceive more detailed properties of the user's computation, and Bigflow optimizes jobs according to these properties. In addition, its execution layer is implemented in C++, and some of the user's code logic is translated into C++ for execution, yielding a large performance improvement. In Baidu's internal business tests, performance is much better than hand-written jobs: on average, jobs rewritten from existing business code ran 100%+ faster than the original user code. An open-source benchmark is being prepared.

  • Easy to use: Bigflow’s interface looks a lot like Spark on the surface, but when used in practice, Bigflow uses some unique designs that make Bigflow code more like a stand-alone application. For example, it shields the Partitioner concept, supports nested distributed data sets, and makes its interface much easier to understand. And it has stronger code reusability. In particular, because Bigflow can automatically optimize performance and memory usage in many scenarios that need optimization, users can avoid many of the optimization work that must be done due to OOM or lack of performance, reducing the user’s usage cost.

  • Python friendly: Python is a first-class citizen, and it is the language Bigflow currently supports natively. Many PySpark users are troubled by its inefficiency, by missing support for certain CPython libraries, or by features that exist only in Scala and Java and are unavailable in PySpark. In Bigflow, none of these are a problem: performance, functionality, and ease of use are all Python friendly.
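The core idea of "the same code runs on different execution engines" can be sketched in a few lines: the user builds a logical plan of transforms as if writing a single-machine program, and a pluggable engine decides how to execute it. This toy interface is hypothetical and is not Bigflow's real API.

```python
# A toy, engine-agnostic pipeline in the spirit of Bigflow: user code only
# builds a logical plan; an engine interprets it.

class Dataset:
    def __init__(self, transforms=None):
        self.transforms = transforms or []
    def map(self, fn):
        return Dataset(self.transforms + [("map", fn)])
    def filter(self, pred):
        return Dataset(self.transforms + [("filter", pred)])

class LocalEngine:
    """Runs the logical plan in-process; a cluster engine could ship the
    same plan to DCE, Spark, or Gemini instead."""
    def run(self, dataset, data):
        out = list(data)
        for kind, fn in dataset.transforms:
            if kind == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out

plan = Dataset().map(lambda x: x * x).filter(lambda x: x > 4)
result = LocalEngine().run(plan, [1, 2, 3, 4])   # [9, 16]
```

Because the plan is data, the framework is free to rewrite it, fuse stages, or translate pieces to C++ before execution, which is where Bigflow's automatic optimization comes from.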

Bigflow was officially open-sourced on GitHub on November 27, 2017. In just over a month, it has gained nearly a thousand stars and nearly a hundred forks.

Sixth, BRPC

BRPC is the most commonly used industrial-grade RPC framework inside Baidu, with over 600,000 instances (excluding clients; currently over 1 million) and over 500 kinds of services; inside Baidu it is called "baidu-rpc". Currently only the C++ version is open source. You can use it to:

  • Build a service that supports multiple protocols on the same port, or access HTTP/HTTPS, h2/h2c, Redis and Memcached, RTMP/FLV/HLS, hadoop_rpc, RDMA, baidu_std, streaming_rpc, hulu_pbrpc, sofa_pbrpc, nova_pbrpc, public_pbrpc, ubrpc, and other protocols used inside and outside Baidu.

  • Access protobuf-based protocols from other languages via HTTP + JSON.

  • Build highly available distributed systems on an industrial-grade implementation of the RAFT consensus algorithm (to be open-sourced as braft).

  • Create rich access patterns: services can handle requests synchronously or asynchronously; clients can access services synchronously, asynchronously, or semi-synchronously; combined channels declaratively simplify complex branching or concurrent access.

  • Debug services via HTTP, and use the CPU, heap, and contention profilers.

  • Better latency and throughput.

  • Quickly add the protocols used in your organization to BRPC, or customize components such as naming services (DNS, ZK, ETCD) and load balancers (round-robin, random, consistent hashing).
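The three client access patterns in the list above (synchronous, asynchronous, semi-synchronous) can be illustrated in framework-neutral Python, with a thread pool standing in for the RPC machinery; `call_service` is a stand-in, not a BRPC API.

```python
# Sketch of synchronous / asynchronous / semi-synchronous access patterns.
import concurrent.futures
import time

def call_service(request):
    time.sleep(0.01)            # pretend network latency
    return request * 2          # pretend server reply

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    # Synchronous: block on one call.
    sync_reply = call_service(1)

    # Asynchronous: fire the call, attach a callback, keep working.
    future = pool.submit(call_service, 2)
    future.add_done_callback(lambda f: print("async reply:", f.result()))

    # Semi-synchronous: issue several calls concurrently, then wait for all.
    futures = [pool.submit(call_service, i) for i in (3, 4, 5)]
    semi_sync = [f.result() for f in futures]
```

Semi-synchronous access is what combined channels simplify: N sub-calls go out in parallel, and the caller blocks only until the whole batch completes.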

Friendly interface

There are only three (main) user classes: Server, Channel, and Controller, corresponding to the server side, the client side, and the parameter set respectively. Building a service? Include brpc/server.h and refer to the comments or examples. Accessing a service? Include brpc/channel.h and refer to the comments or examples. Adjusting parameters? Look at brpc/controller.h. We are trying to make things even simpler: in other RPC implementations you might have to copy a long stretch of obscure code to access BNS, but in BRPC accessing DNS is just Init("http://domain-name", …), and a local file list is just Init("file:///home/work/server.list", …).

Improve Service Reliability

BRPC is widely used in Baidu: map-reduce services and table storage; high-performance computing and model training; various indexing and sorting services… It is a tried and tested implementation. You can use a browser or curl to view the server's internal state, analyze the CPU hotspots, memory allocation, and lock contention of online services, and compute various metrics through bvar and view them via /vars.

Better latency and throughput

While most RPC implementations claim to be "high performance," numbers are just numbers, and achieving high performance across a wide range of scenarios is difficult. To unify the communication architecture within Baidu, BRPC goes further than other RPC frameworks in terms of performance.

In BRPC, reads on different fds are fully concurrent, as is the parsing of different messages on the same fd. Parsing a particularly large protobuf message does not affect other messages from the same client, let alone other clients.

  • Writes to the same fd and to different fds are highly concurrent. When multiple threads write to a single fd (common with single connections), the first thread writes directly in place, and the other threads submit their write requests in a wait-free manner. Multiple threads can still write 5 million 16-byte messages to the same fd per second, even under heavy contention.

  • As little locking as possible. High-QPS services can make full use of a machine's CPUs. Operations such as creating bthreads to process requests, setting timeouts, finding RPC contexts from responses, and recording performance counters are all highly concurrent. Even at more than 500,000 QPS, users rarely see framework-induced lock contention in contention profilers.

  • The number of server threads is adjusted automatically. Traditional servers need their thread count tuned for downstream latency, or throughput may suffer. In BRPC, each request runs in a newly created bthread that ends when the request ends, so the number of threads naturally adjusts to the load.
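The single-fd write pattern above ("the first thread writes in place, the others submit wait-free") is a form of write combining. Here is a toy sketch of the idea: whichever thread finds no writer active becomes the writer and drains everyone's queued messages. BRPC does this wait-free in C++; a lock keeps this Python sketch simple, so it illustrates only the combining structure, not the lock-free mechanics.

```python
# Toy "combining writer": many threads call write(); at most one thread at
# a time actually drains the queue to the (simulated) fd.
import threading
from collections import deque

class CombiningWriter:
    def __init__(self):
        self.lock = threading.Lock()
        self.queue = deque()
        self.writing = False
        self.written = []            # stand-in for bytes written to the fd

    def write(self, msg):
        with self.lock:
            self.queue.append(msg)
            if self.writing:         # someone else is already the writer:
                return               # hand off our message and leave
            self.writing = True      # we become the writer
        while True:                  # drain everything, including messages
            with self.lock:          # queued by other threads meanwhile
                if not self.queue:
                    self.writing = False
                    return
                batch = list(self.queue)
                self.queue.clear()
            self.written.extend(batch)

w = CombiningWriter()
threads = [threading.Thread(target=w.write, args=(i,)) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The payoff is that under contention most threads only enqueue and return immediately, while one thread batches many messages into each actual write.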

BRPC was open-sourced on GitHub on September 15, 2017. In just a few months, it has gained nearly 6,000 stars, 1,200+ forks, and 150+ issues, showing its popularity in the open source community.

Related links:

BRPC: https://github.com/brpc/brpc

Seventh, San

San is an MVVM component framework for the web, with the following characteristics:

  • HTML templates: Declarative templates that write views as if they were writing a normal page, more in line with HTML developers’ habits.

  • Data-driven: When you modify the data, the view engine automatically refreshes the view based on the binding relationship, eliminating the tedious manual calls to the DOM API and possible omissions.

  • Componentization: Components are aggregates of data, logic, and views. With components, we encapsulate individual chunks of functionality, ranging from a combination of inputs to a page.

  • High-performance view: because data is modified through method calls, the view engine can directly refresh exactly the view regions that need to change, with no change detection required, resulting in higher performance.

  • Compact size: only 12 KB gzipped, so there is no need to worry about download weight; a blessing for the size-conscious, and not a big burden even on mobile devices.

  • Good compatibility: Another benefit of modifying data through methods is better browser compatibility. After all, sometimes the audience environment for our products is not that advanced.

  • Freedom of use: you can choose ESNext modules or AMD to manage modules in your project; of course, plain global variables are also supported if you prefer.
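The data-driven and high-performance-view points above boil down to one mechanism: data is changed through a method, and only the views bound to the changed key are refreshed, with no diffing pass. A minimal language-neutral sketch in Python (San's real API is JavaScript; these names are illustrative):

```python
# Minimal data-driven binding: set() notifies only the views bound to the
# key that changed, so unrelated views never re-render.

class ViewModel:
    def __init__(self, data):
        self._data = dict(data)
        self._bindings = {}          # key -> list of view update callbacks
        self.render_log = []         # records which views re-rendered

    def bind(self, key, update):
        self._bindings.setdefault(key, []).append(update)

    def set(self, key, value):       # modifying data through a method...
        self._data[key] = value
        for update in self._bindings.get(key, ()):
            update(value)            # ...refreshes only the bound views

vm = ViewModel({"title": "hi", "count": 0})
vm.bind("count", lambda v: vm.render_log.append(f"count -> {v}"))
vm.set("count", 1)      # the count view refreshes
vm.set("title", "yo")   # no binding on "title": nothing re-renders
```

Because every mutation goes through `set()`, the framework always knows exactly what changed; this is also why method-based mutation gives better old-browser compatibility than property interception.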

In addition, we have provided some necessary tools for Web application development, and we are still enriching them:

  • Router: supports hash and HTML5 modes, and is typically required for single-page or isomorphic web applications.

  • Store: Application state management suite, the idea being a one-way flow similar to Flux.

  • Update: an immutable object update library; it works with Store to update application state data.

  • DevTool: a developer tool delivered as a Chrome extension.

  • MUI: a component library based on the Material Design specification.

San has been open source since 2016; 16 core contributors have contributed 846 commits across 71 releases. In terms of impact, San has accumulated nearly a thousand stars and 100+ forks. To learn more about San's features, documentation, and ecosystem, visit https://ecomfe.github.io/san/

Eighth, Palo

Palo is an MPP database for reporting and interactive analysis that integrates technologies from Google Mesa and Apache Impala. Unlike other popular SQL-on-Hadoop systems, Palo is designed as a simple, single, tightly coupled system with no dependencies on other systems. Palo provides not only high-concurrency, low-latency point-query performance but also high-throughput ad hoc analytical queries; not only bulk data loading but also near-real-time small-batch loading. Palo also features high availability, reliability, fault tolerance, and scalability.

The implementation of Palo includes two daemons: the front end (FE) and the back end (BE). The architecture is shown below:

Based on the above architecture, the main functions of Palo are as follows:

  • High-performance column storage engine;

  • Small batch update, batch atomic submission, multi-version support;

  • Efficient distributed data import;

  • Supports Rollup Table, Schema Change, and Data Recovery;

  • A relatively complete distributed management framework makes the whole of Palo easy to operate and maintain;

  • Supports two-layer partitioning to reduce I/O overhead.

  • Storage tiers support SATA for old data and SSD for hot new data.

  • MPP query engine: handles both low-concurrency large queries and high-concurrency, low-latency small queries;

  • MySQL network protocol, making it easy to connect with a wide range of upper-layer tools;

  • Support for multi-table joins (one large table with multiple small tables; large-table-to-large-table joins when one side can be loaded into distributed memory);

  • Rollup table intelligent selection;

  • Support predicate push-down;
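The "rollup table intelligent selection" feature above means the planner picks, among the base table and its materialized rollups, the smallest table whose columns cover the query. A toy sketch of that selection rule (table names, columns, and row counts are made up for illustration):

```python
# Pick the cheapest rollup that covers all columns a query touches.

def pick_rollup(rollups, query_columns):
    """rollups: {name: (columns, row_count)}; returns the covering table
    with the fewest rows, i.e. the least data to scan."""
    needed = set(query_columns)
    candidates = [
        (rows, name) for name, (cols, rows) in rollups.items()
        if needed <= set(cols)               # rollup covers the query
    ]
    return min(candidates)[1]                # fewest rows wins

rollups = {
    "base":        (("date", "city", "user", "cost"), 1_000_000),
    "city_rollup": (("date", "city", "cost"),            50_000),
    "date_rollup": (("date", "cost"),                     1_000),
}
```

A query over (date, cost) scans the 1,000-row date rollup instead of the million-row base table; queries touching columns no rollup has fall back to the base table.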

Palo is now open source; visit https://github.com/baidu/palo for more information.
