Source of the questions
Author: "Is a Pot of Porridge"
Link: www.nowcoder.com/discuss/745…
Source: Niuke.com
The interview questions
-
What do you understand about big data?
Definition of big data: the technologies and tools used mainly to process massive amounts of data.
History: big data took off around 2012, when it had just become popular and the domestic ecosystem was still immature. Pig and Storm, now in decline, were still popular at that time.
Government support: in 2017 and 2018 the government strongly backed its development with policy and funding support, and in recent years it has been elevated to a national strategy.
Current state: the ecosystem has gradually matured and entered a late "red ocean" phase with higher demands on real-time performance, which is why Spark and Flink have become very popular in the past two years. Individuals can still enter the industry, but the bar is gradually rising.
The future of big data
SQL is the future trend. With the rise of the Internet of Things there will be ever more data, and so more demand for big data, which makes it necessary to keep topping up one's technical skills.
-
What do you know about Hadoop?
Hadoop is the cornerstone of big data
Hadoop is divided into HDFS, MapReduce, and YARN. HDFS stores the data, MapReduce computes on it, and YARN schedules resources.
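To make the MapReduce part of this split concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases in plain Python. It only mimics the programming model; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration and are not Hadoop APIs.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order over an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


print(word_count(["big data", "Big Hadoop"]))
```

In a real cluster the map and reduce functions run on many machines and the intermediate pairs are written to disk, which is where the latency mentioned later comes from.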
In the big data stack as a whole, Hadoop is responsible for storage (mainly HDFS), while Spark or Flink handles data processing: Spark + Hadoop for offline projects and Flink + Hadoop for real-time projects.
In terms of version updates, Hadoop 3 fixed many shell script bugs compared with Hadoop 2 and added erasure coding, which cuts the storage footprint from 3 times the original data size (with triple replication) to about 1.5 times, leaving more room for storage.
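The 3x versus 1.5x figures can be checked with a little arithmetic. The sketch below assumes a Reed-Solomon layout with 6 data blocks and 3 parity blocks, which is an assumption on my part and not something stated in the interview; the helper names are hypothetical.

```python
def replication_overhead(replicas):
    """Raw bytes stored per byte of user data under plain replication."""
    return float(replicas)


def erasure_coding_overhead(data_blocks, parity_blocks):
    """Raw bytes stored per byte of user data under a (data, parity) erasure code."""
    return (data_blocks + parity_blocks) / data_blocks


print(replication_overhead(3))        # 3.0
print(erasure_coding_overhead(6, 3))  # 1.5
```

The trade-off is that reading or rebuilding erasure-coded data costs more CPU and network than reading a plain replica, so it tends to suit colder data.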
-
What do you think are the strengths and weaknesses of Hadoop?
Advantages:
(1) MapReduce is suited to storing and processing massive offline data where some latency can be tolerated.
(2) MapReduce is more stable than Spark and keeps executing when a fault occurs.
(3) Data storage is safer than with Spark.
Disadvantages:
(1) Data processing is 20 times slower than Spark.
(2) It does not support running multiple tasks at the same time.
(3) It is disk-based rather than memory-based, which also makes it slow.
-
Talk about high availability
High availability here means NameNode HA, which uses two NameNodes. When one NameNode fails, the other can keep working, so the stored data stays safe.
-
Describe the application scenarios and characteristics of the different databases you have used
(1) Hive: offline database for massive data
(2) HBase: real-time database for massive data
(3) ClickHouse: real-time database for massive data, with SQL support
(4) Elasticsearch: full-text search engine
(5) MySQL: traditional relational database (RDBMS)
-
Digging into my resume, in depth
-
SQL problem, retention
Blog.csdn.net/Captain_DUD…
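For readers unfamiliar with this kind of question, here is a minimal sketch of a next-day retention query, run through Python's built-in `sqlite3` so it is self-contained. It is not taken from the linked post or from the actual interview: the `logins` table, its columns, the sample rows, and the definition used (users active on a day who are also active the following day) are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER, login_date TEXT)")
# Hypothetical sample data: four users on day 1, two return on day 2, one on day 3.
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)",
    [
        (1, "2021-10-01"), (2, "2021-10-01"), (3, "2021-10-01"), (4, "2021-10-01"),
        (1, "2021-10-02"), (2, "2021-10-02"),
        (1, "2021-10-03"),
    ],
)

# Self-join each login to the same user's login on the following day, if any.
RETENTION_SQL = """
SELECT a.login_date AS day,
       COUNT(DISTINCT a.user_id) AS active_users,
       COUNT(DISTINCT b.user_id) AS retained_users,
       ROUND(1.0 * COUNT(DISTINCT b.user_id) / COUNT(DISTINCT a.user_id), 2) AS retention
FROM logins AS a
LEFT JOIN logins AS b
       ON b.user_id = a.user_id
      AND b.login_date = date(a.login_date, '+1 day')
GROUP BY a.login_date
ORDER BY a.login_date
"""

rows = conn.execute(RETENTION_SQL).fetchall()
for row in rows:
    print(row)
```

The `LEFT JOIN` keeps days with no returning users, so they show up with a retention of 0 instead of disappearing from the result. In Hive or ClickHouse the date arithmetic would use that engine's own date functions in place of SQLite's `date(..., '+1 day')`.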
- My questions for the interviewer: I learned that the bar at Xiaohongshu (Little Red Book) is very high and that there is not much headcount (HC)
Interview details
First interview, 60min
- What do you understand about big data?
- What do you know about Hadoop?
- What do you think are the strengths and weaknesses of Hadoop?
- Talk about high availability
- Describe the application scenarios and characteristics of the different databases you have used
- Digging into my resume, in depth
- SQL problem, retention
- My questions for the interviewer: I learned that the bar at Xiaohongshu (Little Red Book) is very high and that there is not much headcount (HC)
The interviewer was very nice, but the questions were very high-level, and I didn't know where to start.
The next day I received the email invitation for the second interview. I don't know who will be interviewing me in the second round. The team lead?
— — — — –
The second interview came a week later.
The second interview was really open-ended, and I can't remember the content at all.
10.11 Finally received the invitation for the third interview
Third interview, with the head of the data department, 30min
He was interested in one of my experiences, started talking about that part of my work, and that was it.
HR interview
I was asked a lot of questions like "What was a stressful time in your life?"
10.25 Letter of intent result: SP offer