This post is part of the 2022 Spring Recruitment series of interview experience reviews.
The interview lasted about 50 minutes, and the overall experience was good. The interviewer was late and explained that he had remembered the wrong time.
The interview questions follow 👇. Most of the projects I had prepared were about data warehouse development, but I was interviewing for a big data development position, so throughout the interview the interviewer repeatedly confirmed with me whether I was interviewing for application development or data warehouse development…
1. Self-introduction.
2. Briefly introduce the two projects, the project selection, and which part I was responsible for (the offline data warehouse part).
3. Questions around the project: the data model the project uses (I was asked about the difference between the star schema and the snowflake schema).
What is the difference between ClickHouse and HBase? What is the difference between Hive and HBase?
•Hive implements an offline big data warehouse: it maintains metadata that maps HDFS files to tables.
•HBase is a NoSQL database that uses a distributed in-memory layer on top and HDFS underneath to store and serve big data in real time.
What are the differences between Hive and MySQL?
How do you handle data skew in Hive? For details on solving data skew, see the post "Hive Data Skew Solution".
Describe the MapReduce process behind Hive.
There are three stages of MapReduce:
•Map phase: the parallel processing phase.
•Shuffle phase: everything from the data leaving the Mapper until just before the Reducer.
•Reduce phase: the summarizing and collating phase.
Eight steps of MapReduce:
1. Set the InputFormat class of the job (TextInputFormat by default), which reads the input as k1, v1 pairs.
2. Write a custom map function that takes the k1, v1 pairs from the InputFormat and emits k2, v2 pairs.
3. Partition: by default, k2 determines which reducer each map output record is sent to.
4. Sort: by default, records are sorted in dictionary order by k2.
5. Combine: this stage does not exist by default; it is an optimization that merges records in advance on the map side.
6. Group: the values of the same k2 are put into the same collection.
7. Write a custom reduce function that converts the grouped k2, v2 into k3, v3 output.
8. Set the OutputFormat class of the job (TextOutputFormat by default), which writes the results to plain text files.
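As a rough illustration of the eight steps, here is a minimal in-memory word count sketch. It deliberately uses no Hadoop classes, so it only mimics the flow of keys and values; the class name `MapReduceStepsSketch` and all method names are hypothetical, and the hash-based `partition` is an assumption about how k2 might be mapped to a reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory sketch of the MapReduce steps using word count.
// Not Hadoop code: every name here is hypothetical.
class MapReduceStepsSketch {

    // Step 2: map (k1 = line offset, v1 = line text) to (k2 = word, v2 = 1) pairs.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Step 3: partition, here assumed to be a hash of k2 modulo the reducer count.
    static int partition(String k2, int numReducers) {
        return (k2.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Step 7: reduce (k2, [v2, ...]) to (k3 = word, v3 = total count).
    static int reduce(String k2, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    static List<String> run(List<String> lines, int numReducers) {
        // Steps 4 and 6: one TreeMap per partition keeps k2 sorted and groups its values.
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new TreeMap<>());
        }

        // Step 1: stand-in for the InputFormat, each line becomes (offset, text).
        long offset = 0;
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(offset, line)) {
                int p = partition(kv.getKey(), numReducers);
                partitions.get(p)
                          .computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                          .add(kv.getValue());
            }
            offset += line.length() + 1;
        }

        // Step 5 (combine) is skipped, matching the default of having no combiner.
        // Step 8: stand-in for the OutputFormat, tab-separated text lines.
        List<String> output = new ArrayList<>();
        for (TreeMap<String, List<Integer>> part : partitions) {
            for (Map.Entry<String, List<Integer>> e : part.entrySet()) {
                output.add(e.getKey() + "\t" + reduce(e.getKey(), e.getValue()));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        for (String line : run(List.of("hello hive", "hello hbase"), 1)) {
            System.out.println(line);
        }
    }
}
```

With a single reducer, the two sample lines produce one output line per word, sorted by word, because the sort and group steps both operate on k2.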
HBase rowkey design principles (my answer was incomplete; I am only recording the question here, and the clown was me ~)
•Business rule: match the business needs, and make the prefix the most commonly queried field.
•Uniqueness rule: each rowkey represents exactly one record.
•Combination rule: combine the common query conditions into the rowkey.
•Hash rule: rowkeys should not be built as consecutive (sequential) values.
•Length rule: as long as the business requirements are met, the shorter the better.
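One possible way to combine these rules is sketched below. Everything in it is an assumption made for illustration: the fields (`userId`, an event timestamp), the 16 buckets, and the `RowkeySketch` class are hypothetical, and it builds a plain string rather than calling the HBase client API.

```java
// Hypothetical rowkey builder: hash-based bucket prefix + user id + timestamp.
// Illustrative only; field choice and bucket count are assumptions.
class RowkeySketch {
    static final int BUCKETS = 16; // assumed number of hash buckets

    static String buildRowkey(String userId, long eventTimeMillis) {
        // Hash rule: a bucket prefix derived from the user id spreads
        // otherwise consecutive keys across different key ranges.
        int bucket = (userId.hashCode() & Integer.MAX_VALUE) % BUCKETS;
        String prefix = (bucket < 10 ? "0" : "") + bucket;

        // Combination rule: the common query fields (user, time) form the key.
        // Uniqueness rule: assumes one record per user per millisecond.
        // Length rule: only the fields needed for lookups are included.
        return prefix + "_" + userId + "_" + eventTimeMillis;
    }

    public static void main(String[] args) {
        System.out.println(buildRowkey("u1", 5L));
    }
}
```

Because the bucket is derived from the user id, all rows of one user share the same prefix, so a prefix scan for that user is still possible when the reader recomputes the bucket.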
Last question: data flow for the project
What are the basic data types of Java?
Java has eight primitive data types: boolean, byte, short, int, long, char, float, and double.
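For reference, the bit widths of the numeric primitives and char can be read from the `SIZE` constants on their wrapper classes. The small sketch below collects them; the class name `PrimitiveSizesSketch` is made up for this example, and boolean is left out because `Boolean` has no `SIZE` constant.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Lists the bit width of each primitive that has a wrapper SIZE constant.
class PrimitiveSizesSketch {
    static Map<String, Integer> bitWidths() {
        Map<String, Integer> bits = new LinkedHashMap<>();
        bits.put("byte", Byte.SIZE);
        bits.put("short", Short.SIZE);
        bits.put("int", Integer.SIZE);
        bits.put("long", Long.SIZE);
        bits.put("char", Character.SIZE);
        bits.put("float", Float.SIZE);
        bits.put("double", Double.SIZE);
        return bits;
    }

    public static void main(String[] args) {
        bitWidths().forEach((type, bits) -> System.out.println(type + ": " + bits + " bits"));
    }
}
```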
Talk about Java polymorphism and inheritance.
Inheritance: a subclass can use the methods of its parent class directly and selectively extend them. Polymorphism: calling the same method produces different behavior depending on the object it is called on.
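A small sketch of both ideas, with made-up classes (`Animal`, `Dog`, `Cat`) chosen only for illustration: the subclasses inherit `describe()` unchanged, override what they need, and the same call behaves differently per object.

```java
// Hypothetical classes illustrating inheritance and polymorphism.
class PolymorphismSketch {
    static class Animal {
        String name() { return "animal"; }
        String speak() { return "..."; }
        // Inherited as-is by subclasses; calls whichever overrides exist.
        String describe() { return name() + " says " + speak(); }
    }

    static class Dog extends Animal {
        @Override String name() { return "dog"; }
        @Override String speak() { return "woof"; }
    }

    static class Cat extends Animal {
        // Selective extension: only speak() is overridden, name() is inherited.
        @Override String speak() { return "meow"; }
    }

    public static void main(String[] args) {
        Animal[] animals = { new Animal(), new Dog(), new Cat() };
        for (Animal a : animals) {
            // Same method call, different behavior per runtime type.
            System.out.println(a.describe());
        }
    }
}
```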
What is the difference between String, StringBuilder, and StringBuffer?
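String: the backing array (a char[] before Java 9) is private and final and is never exposed for modification, so the contents of a String cannot be changed.
StringBuffer: a mutable character sequence; thread-safe because its methods are synchronized, which makes it slower. StringBuilder: a mutable character sequence; faster, but not thread-safe.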
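The difference in mutability can be seen in a few lines. This is only a sketch with a hypothetical class name (`StringMutabilitySketch`); it does not try to demonstrate thread safety, which would need a concurrent test.

```java
// Shows that String operations return new objects while
// StringBuilder and StringBuffer are modified in place.
class StringMutabilitySketch {
    static String demo() {
        String s = "a";
        String t = s.concat("b");   // returns a new String; s is unchanged

        StringBuilder sb = new StringBuilder("a");
        sb.append("b");             // mutates sb in place, no synchronization

        StringBuffer buf = new StringBuffer("a");
        buf.append("b");            // mutates buf in place, method is synchronized

        return s + "," + t + "," + sb + "," + buf;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```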
Left join vs. right join; inner join vs. outer join.
What is the leftmost prefix principle? What is the leftmost matching principle?
As the name implies, leftmost first. When creating a multi-column index, put the column used most frequently in the WHERE clause on the far left, depending on business requirements. MySQL matches the index columns from left to right and stops at the first range condition. For example, with a = 1 and b = 2 and c > 3 and d = 4, an index on (a,b,c,d) cannot use d, while an index on (a,b,d,c) can use all four columns, and the order of a, b, and d in that index can be adjusted freely. Conditions using = and in can be written in any order: for a = 1 and b = 2 and c = 3 with an (a,b,c) index, the conditions may appear in any order, because MySQL's query optimizer rewrites them into a form the index can recognize.
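The leftmost-first idea can be approximated with a toy model: walk the index columns from left to right, keep going while a column has an equality condition, and stop after the first range condition or the first column with no condition. This is a simplified sketch, not how MySQL's optimizer is actually implemented, and `LeftmostPrefixSketch` and `usableColumns` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model of leftmost-prefix matching on a multi-column index.
// A simplification for illustration, not MySQL's real optimizer logic.
class LeftmostPrefixSketch {
    static List<String> usableColumns(List<String> indexColumns,
                                      Set<String> equalityColumns,
                                      Set<String> rangeColumns) {
        List<String> used = new ArrayList<>();
        for (String col : indexColumns) {
            if (equalityColumns.contains(col)) {
                used.add(col);          // equality: keep matching to the right
            } else if (rangeColumns.contains(col)) {
                used.add(col);          // range: usable, but matching stops here
                break;
            } else {
                break;                  // no condition on this column: stop
            }
        }
        return used;
    }

    public static void main(String[] args) {
        Set<String> eq = Set.of("a", "b", "d");
        Set<String> range = Set.of("c");
        System.out.println(usableColumns(List.of("a", "b", "c", "d"), eq, range));
        System.out.println(usableColumns(List.of("a", "b", "d", "c"), eq, range));
    }
}
```

In this model, the order in which the conditions are written does not matter, only the order of the columns in the index, which mirrors the point that = and in conditions can appear in any order.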
Something about InnoDB… I had not reviewed it at all, so I was completely wiped out on this one…
All in all, this was my first serious interview. My own assessment afterward: whether or not I failed, I am simply still too weak…