Tag: Big data

How do NameNode and SecondaryNameNode work?

January 12, 2024

by Catherine Mann

No Comments

In the previous article, we learned that the NameNode is responsible for managing HDFS metadata, while the SecondaryNameNode plays an auxiliary role of...


Artificial intelligence (AI)

Another update to the big data ecosystem! Containerization becomes a major trend

January 12, 2024

by Anya Dass

No Comments

Recently, something happened in the big data ecosystem space: Cisco (CSCO) is combining an AI hardware framework with a new deep learning server powered...


The code of life

Setting up Hadoop

January 12, 2024

by 王佳慧

No Comments

A pseudo-distributed Hadoop setup requires JDK 1.8 to be installed in advance. This guide uses Hadoop 3.0.0 (https://archive.apache.org/dist/hadoop/common/hadoop-3.0.0/). The setup uses the binary package installation, does...


From Hadoop to Spark and Flink: big data processing frameworks have been evolving for ten years!

January 12, 2024

by Tessa McIntyre

No Comments

In the current data era, a large amount of data is generated in various fields and business scenarios all the time. How to understand big...


Canal: a MySQL incremental data synchronization tool, with a detailed explanation of its core concepts

January 12, 2024

by Ms. Sheila Price

No Comments

Data sources in the field of big data include business database data, mobile embedded data and log data generated on the server. When we collect...


Flink in practice in the field of artificial intelligence

January 12, 2024

by Kerry Rees

No Comments

Artificial intelligence is the most important technological revolution and driving force of the next decade, playing an increasingly important role in all walks of life....


Flink 1.11 released, talk about your work and open source

January 12, 2024

by Mishti Sibal

No Comments

Flink version 1.11 was released. I was lucky enough to contribute some PRs to Flink and get my name on the Apache website for the...


Setting up a Hadoop 3.2.1 environment

January 12, 2024

by 曾哲瑋

No Comments

Recently, someone asked if I could post some material about big data. No problem! Today, we start with installing the environment and build up our own learning...


[Lash learning] Shared notes on Flink: Middle Volume (Advanced Chapter)

January 12, 2024

by Christopher Marsh

No Comments

One of the most common uses of distributed caching in our production environment is in Join operations between tables. If one table is...


The code of life

Spark source code analysis (2): exploring Stage division and the scheduling mechanism

January 12, 2024

by James Gonzalez

No Comments

A Spark application involves three concepts: Job, Stage, and Task. 1. A Job is bounded by an Action method. An Action method triggers a Job....


This Tencent product sent a cold shiver down my spine

January 12, 2024

by Robert Cole

No Comments

Last night when I was working overtime, my colleagues were sitting together eating and chatting. For me, I couldn't get a word in edgeways about...


Agora Tutorial | How to build an education platform in 15 minutes

January 12, 2024

by Mr. John Morton

No Comments

For educational institutions, there are two common options for building teaching platforms: a SaaS platform that can be used directly and a PaaS platform that is independently...


Big Data growth path — Hadoop word frequency statistics

January 12, 2024

by Shray Uppal

No Comments

Small knowledge, big challenge! This article is participating in the "Essentials for Programmers" creative activity. This article has participated in the "Digitalstar Project" to win...


[Lash learning] Shared notes: "bombing" Flink, Vol. 2 (Production Chapter)

January 12, 2024

by 楊佳慧

No Comments

By default, we have only one instance of JobManager per cluster, and if the JobManager crashes, our jobs will fail and we will not be...


Get started with ClickHouse quickly

January 12, 2024

by Chelsea Martin

No Comments

This article is based on a compilation shared with the public a few months ago, with some expansions. If you have a potential data analysis...


Metadata: data lineage analysis in practice

January 12, 2024

by Reece Iqbal

No Comments

Anyone who is new to big data or has worked with data warehouses has probably heard of data governance, lineage analysis of...


Hive must-know series

January 12, 2024

by Hilary Reed

No Comments

Note: Partitioned tables are usually divided into static partitioned tables and dynamic partitioned tables. The former can be partitioned statically when data needs to be...


Building a log system with Elasticsearch+Fluentd+Kafka

January 12, 2024

by Hayley Price

No Comments

ELK is gradually being replaced by EFK because of Logstash's large memory footprint and relatively poor flexibility. Elasticsearch+Fluentd+Kafka is the EFK that Kibana...


Hive friend recommendation

January 12, 2024

by Sarah McKay

No Comments

Requirement description: in recommendation service scenarios, friend recommendation features appear in some applications, such as QQ friend recommendation. So in earlier years, the algorithm...


Carefully compiled | A practical guide to Spring AOP, illustrated, with an AOP example

January 12, 2024

by Cameron Buchanan

No Comments

Why use AOP? During actual development, our application will be divided into many layers. Generally speaking, a Java Web application will have the following layers:...


Big Data (4): the YARN resource scheduling framework

January 12, 2024

by Debra Hill

No Comments

In MapReduce, many people may have this question: after an MR job is written, how do Map tasks and Reduce tasks execute in parallel on multiple nodes,...


Data skew? Spark 3.0 AQE tackles all kinds of stubborn cases

January 12, 2024

by Biju Acharya

No Comments

Spark 3.0 has been around for half a year now, and this major update focuses on performance optimizations and documentation. 46% of the optimizations are...


Using @Scheduled in Spring to create scheduled tasks

January 12, 2024

by Matthew Brennan

No Comments

Add an @EnableScheduling annotation to the Application startup class. Add a @Component annotation to the class that contains the scheduled task. 3,


Elasticsearch in log analysis: application and operations practice

January 12, 2024

by Inaaya Barman

No Comments

This talk was given by Zhao Hanqing, a senior engineer at Alibaba. Topics: Elasticsearch optimization experience; Elasticsearch operation and maintenance practice; Elasticsearch distribution


Practical tips | How to change host names in an existing Ambari cluster

January 12, 2024

by John Taylor

No Comments

The IP has been changed before


aPaaS product Dynamic Classroom: "low-code" development, live in 15 minutes

January 11, 2024

by Antonio Robertson PhD

No Comments

Sonnet has been committed to lowering the burden and barrier to entry for developers through low-code, making the development of real-time interactive scenarios more accessible and convenient.


[Spark kernel] Analyzing the Spark kernel

January 11, 2024

by Linda Benson

No Comments

The Spark kernel refers to the core operating mechanism of Spark, including the operating mechanism of Spark core components, Spark task scheduling mechanism, Spark memory...


Getting to know Kafka

January 11, 2024

by Leanne Russell

No Comments

Kafka is an essential component of today's big data systems. This article will give you a preliminary understanding of Kafka and help you understand the background...


Setting up the big data environment for user profiles: building real-time user profiles from scratch (IV)

January 11, 2024

by Alexis Lewis

No Comments

In this chapter, we formally start setting up the big data environment. The goal is to build a stable big data environment that can...


The code of life

Source code analysis of Spark job submission

January 10, 2024

by Arhaan Chowdhury

No Comments

Spark submits tasks in client mode or cluster mode. We will focus on cluster mode
