Introduction
Xiao Ming is a Java backend programmer with three years of work experience. Some time ago, he saw the following news:
The full text of the 14th Five-Year Plan and the outline of long-range objectives through 2035 has been released.
At first glance, Xiao Ming felt that this was big news for all Chinese people.
As a programmer, Xiao Ming wanted to know whether any part of the plan involved computer software.
After reading through the text, Xiao Ming found several keywords mentioned in it:
- Artificial intelligence (AI)
- Big data
- Cloud computing
- Blockchain
- Network security
Xiao Ming thought:
“These fields pay a lot better than just working on the Java backend.”
“But I’ve been working on the Java backend for a few years now. Can I still move into one of these promising fields?”
Xiao Ming did a quick search online and found that big data seemed to be the most realistic option: most big data frameworks are written in Java, and, crucially, some companies welcome Java backend developers who want to switch to big data.
However, Xiao Ming was still hesitant:
“I don’t know what big data work requires. Is knowing Java alone enough?”
Xiao Ming remembered that he had a distant cousin, Daming, who works in big data, so he asked Daming for his opinion.
Daming said:
“The transition from Java back end to big data is quite easy, but there are certain basic conditions.”
“Beyond programming with basic Java syntax, you should be familiar with both Java concurrency and the JVM, because most big data frameworks are written in Java or Scala, and high-concurrency scenarios and JVM tuning are an integral part of big data development. You also need to be familiar with SQL and Linux. SQL is widely used for data analysis, and I usually write more SQL than Java. Basically all big data frameworks are deployed on Linux, so its importance is obvious. Finally, and most importantly, you need big data thinking!”
Xiao Ming immediately asked:
“What is big data thinking?”
Daming did not answer. Instead, he handed Xiao Ming a test paper:
“Take a look. There are 66 questions. Once you fully understand all 66, you will have no problem switching to big data!”
The test paper
- Can you talk about your understanding of big data? What is big data?
- Do you know the characteristics of big data?
- What does big data have to do with cloud computing?
- What does big data have to do with artificial intelligence?
- Have you ever studied the history of big data? How did big data develop?
- What is the basic process of big data processing?
- What does big data development mainly do?
- Do you think data quality is important? From what angles can the quality of the data be measured?
- Do you know what types of big data frameworks there are?
- Why is it said that the code moves while the data stays put? In other words, why is moving computation more cost-effective than moving data?
- Do you know what in-memory computing is? What are its advantages over disk-based computing?
- Some say big data tuning is just hardware resource tuning. Do you agree with this statement?
- Do you know what batch processing and stream processing are? What about bounded data and unbounded data?
- Do you know what event time and processing time are?
- Do you know what ETL is?
- What are the benefits of DAG for big data processing?
- Have you heard of the Workflow design pattern?
- Do you know Google Dataflow?
- What is a distributed lock? How do you implement one?
- What is a distributed transaction? How do you implement one?
- Do you know the difference between a distributed lock and a distributed transaction?
- Do you know what the CAP theorem is?
- Do you know what BASE theory is?
- What are the metrics of distributed systems?
- Do you know what consistency models there are?
- Do you know what an SLA is?
- How do you estimate the QPS of a system?
- What do you think of the publish-subscribe model?
- What’s the difference between the publish-subscribe pattern and the observer pattern?
- What are the ways to partition data in a distributed system?
- Do you know consistent hashing?
- Why serialize data?
- How should we choose the data compression algorithm?
- How should we choose a serialization framework in a distributed system?
- What’s the difference between column storage and row storage?
- How should we choose the column storage format?
- Do you know what a data warehouse is?
- Do you know the difference between a data warehouse and a database?
- Have you heard of OLTP and OLAP? What’s the difference?
- How should a data warehouse be layered?
- How should the data warehouse be modeled?
- Do you know what a fact table and a dimension table are?
- Have you heard of business intelligence (BI)?
- Have you ever heard of MPP?
- Do you think the MPP architecture is suitable for data warehouses?
- What are the parallel computing models?
- Do you know NoSQL?
- What do you think of load balancing?
- Do you know what the load balancing algorithms are?
- How do you implement forwarding in a distributed system?
- Do you know why you need a big data resource scheduling framework?
- What are the technical difficulties of resource scheduling?
- Do you know multi-tenancy technology?
- Do you know the inverted index?
- What kind of data do you think is enterprise data?
- Do you know anything about data lakes? Why do you need a data lake?
- What’s the difference between a data warehouse, a data mart, and a data lake?
- Do you know the Lambda architecture?
- Do you know the Kappa architecture?
- How do you apply the Lambda architecture to a data lake?
- What challenges do enterprise data lakes face?
- Do you know RAID technology?
- Do you know why you need a workflow scheduling system?
- Do you know why message queues are needed?
- Have you heard of cloud native databases?
- What do you see as the future of databases?
Conclusion
Follow me: all of the questions above will be answered in future columns.