Hadoop in action

Hadoop is an open source distributed computing platform maintained by the Apache Software Foundation. Built around the Hadoop Distributed File System (HDFS) and MapReduce (an open source implementation of Google's MapReduce), Hadoop gives users a distributed infrastructure that hides the low-level details of the system. HDFS offers high fault tolerance and high scalability, so users can deploy Hadoop on inexpensive hardware to form a distributed system.

The MapReduce distributed programming model lets users develop parallel applications without understanding the underlying details of a distributed system. As a result, users can organize their computing resources with Hadoop to build their own distributed computing platform, and make full use of a cluster's computing and storage capacity to process massive data sets.
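To make the programming model concrete, here is a minimal single-machine sketch of the classic word count, written in plain Python. It does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. It only shows the shape of the model: the user writes a map function and a reduce function, and the grouping step in between is what a framework such as Hadoop would handle across the cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key (done by the framework in a real cluster)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(word_count(sample))
```

In this sketch, only map_phase and reduce_phase contain application logic. That separation is the point of the model: the same two functions could in principle run over many input splits in parallel, because each map call sees one line and each reduce call sees one key.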

This book is a systematic, practical Hadoop handbook and reference. Its coverage is comprehensive: it explains the entire Hadoop technology stack, including not only the two core components, HDFS and MapReduce, but also related sub-projects such as Hive, HBase, Mahout, Pig, ZooKeeper, Avro, and Chukwa. It is also strongly practical: each topic comes with carefully designed, classic small examples that are easy to understand and easy to try out.

Spark big data analysis

Spark and the big data technology around it are still developing rapidly. Spark China Summit has been held, meetups take place in many cities, and the open source Spark project keeps gaining momentum. Many companies have deployed and applied Spark at scale. What Spark users need has shifted from initial deployment, installation, and running examples to building rich data analysis applications with Spark. Writing a practical, case-driven technical book on Spark is an idea I have had for a long time. Because of a heavy workload, at first I only summarized the Spark cases I had taken part in or studied. Over time, I decided to abstract and simplify the common algorithms, system architectures, and application scenarios, which also serves as a way of summarizing and sharing what I have learned.

HBase

HBase is a NoSQL storage system designed for fast random reads and writes of large-scale data. HBase runs on commodity servers and scales smoothly to support data sets with billions of rows and millions of columns.
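As a rough intuition for how a table can have millions of possible columns, the toy model below stores only the cells that were actually written, addressed by row key and column name. This is an in-memory illustration in Python, not the HBase client API: the class SparseTable, its methods, and the sample keys and column names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, Optional, Tuple


class SparseTable:
    """Toy stand-in for a wide, sparse table; not the HBase client API."""

    def __init__(self) -> None:
        # row key -> {column name -> value}; cells never written take no space
        self._rows: Dict[str, Dict[str, bytes]] = defaultdict(dict)

    def put(self, row_key: str, column: str, value: bytes) -> None:
        """Random write: set one cell addressed by row key and column."""
        self._rows[row_key][column] = value

    def get(self, row_key: str, column: str) -> Optional[bytes]:
        """Random read: fetch one cell, or None if it was never written."""
        return self._rows.get(row_key, {}).get(column)

    def scan(self, start: str, stop: str) -> Iterator[Tuple[str, Dict[str, bytes]]]:
        """Yield rows with start <= key < stop, in sorted row-key order."""
        for key in sorted(self._rows):
            if start <= key < stop:
                yield key, dict(self._rows[key])


if __name__ == "__main__":
    table = SparseTable()
    table.put("user#001", "info:name", b"Ada")
    table.put("user#002", "info:email", b"bob@example.com")
    print(table.get("user#001", "info:name"))
    print(table.get("user#001", "info:email"))
```

Each row here carries only its own columns, so two rows need not share any column at all. A real HBase deployment adds persistence, distribution across servers, and versioning on top of this basic addressing idea.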

This book is an experience-based guide that teaches you how to design, build, and run big data applications with HBase. It is divided into four parts. The first two parts introduce the history of distributed systems and large-scale data processing, explain the basic principles and patterns of HBase, and show how to use its advanced features. The third part explores practical HBase techniques through real applications and code examples, along with the theory that supports them. The fourth part explains how to turn a prototype system into a full-fledged production system.

Principle, design and practice of distributed service architecture

Taking today's popular distributed service architecture as its main thread, this book explains the principles, design, and practice of distributed service architecture.

The book first introduces the background and evolution of distributed service architecture, then examines in depth the design ideas and practical schemes for ensuring consistency, high performance, and high availability in distributed services. Next, it covers the emergency-response workflow and the technical troubleshooting process for large-scale, high-concurrency online services, and presents an effective, commonly used tool set for finding and locating problems. Finally, it describes in detail the containerization process and the tools for agile development and release in a distributed service architecture, which make life easier for developers working on high-concurrency service architectures.

Netty in Action

Netty is a Java framework for rapid development of high-performance network applications. It encapsulates the complexity of network programming and makes the latest advances in network programming and Web technology accessible to a wider range of developers than ever before. Netty is more than a collection of interfaces and classes; it also defines an architectural model and a rich set of design patterns. Until now, however, the lack of a comprehensive, systematic user guide has been a barrier to getting started with Netty, and this book aims to change that. Besides explaining the details of the framework's components and APIs, the book shows how Netty can help developers write more efficient, reusable, and maintainable code.

That’s awesome! Finally, someone explained Hadoop+Spark+HBase+Netty+ distributed
