Lucene is one of the most widely used search libraries in the industry; Solr and Elasticsearch are both built on it. But data volumes keep growing, and at the scale of trillions of records even routine operations put enormous pressure on the system. How can we keep Lucene's efficient full-text retrieval while meeting the challenge of trillions of records, and at the same time overcome the single-purpose nature of big data stack components and the complexity of fitting them together? With that in mind, at QCon we will share the problems we encountered, and the solutions we arrived at, over years of taking on the trillion-record challenge with Lucene.
01 Speaker Introduction
Zheng Qihua, Technical Director of Xinshu Software
- Former senior engineer at FNST (Fujitsu Nanda) and project manager for Fujitsu system monitoring middleware products, with more than 10 years of software development and maintenance experience
- Fujitsu certified expert in middleware lifecycle management and job management
- Responsible for maintenance of Huawei RTOS (real-time embedded operating system), with extensive experience in the Linux kernel, system monitoring, and related areas
- Invited speaker at the 2020 Automotive Enterprise Digitization Seminar of the China Automotive Research Institute
02 Content Preview
Challenges and implementation at the scale of trillions of records
Trillion-scale challenge 1: data storage
How can unbalanced reads and writes be resolved and disks balanced automatically?
How can data be kept safe so that disk damage and deletions do not affect production?
How can the high cost of data storage and the heavy dependence on SSDs be addressed?
Trillion-scale challenge 2: retrieval performance
How can full-text retrieval over trillions of records respond within seconds?
Trillion-scale challenge 3: multidimensional statistics
How can I/O consumption be reduced and millions of records exported instantly?
Trillion-scale challenge 4: region search
How can the capability and accuracy of geographic location retrieval be improved?
Trillion-scale challenge 5: computing frameworks
How can Spark performance be improved to greatly reduce system response time?
See you in Beijing on May 29th!
QCon Global Software Development Conference (Beijing)