Why use the Lambda architecture?
It addresses three problems that big data brings:
- Accuracy (results must be correct)
- Latency (results must arrive fast)
- Throughput (large volumes must be handled)
For example, consider what happens when a system that records web page view data is scaled in the traditional way:
- Start with a traditional relational database
- Then add a publish/subscribe queue in front of it
- Then scale out by horizontal partitioning (sharding)
- Fault-tolerance issues begin to appear
- Data corruption starts to occur
The key problem is that, in terms of the AKF scale cube, scaling horizontally along the X axis alone is not enough; we also need functional decomposition along the Y axis. The Lambda architecture offers guidance on how to scale a data system in this way.
What is the Lambda architecture?
If we define a data system as follows:
query = function(all data)
Then a Lambda architecture is:
batch view = function(all data at the batch job's execution time)
realtime view = function(realtime view, new data)
query = function(batch view, realtime view)
In other words: Lambda architecture = read/write separation (batch layer + serving layer) + real-time processing layer.
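The three functions in the definition can be sketched in a few lines of Python. This is only a minimal in-memory illustration, assuming page view events are (url, timestamp) pairs and that the query is a per-URL view count. The names compute_batch_view, update_realtime_view and query are hypothetical and do not come from any particular framework.

```python
from collections import Counter


def compute_batch_view(master_dataset):
    """batch view = function(all data at the batch job's execution time)."""
    # Recomputed from scratch over the full dataset on every batch run.
    return Counter(url for url, _ts in master_dataset)


def update_realtime_view(realtime_view, new_events):
    """realtime view = function(realtime view, new data)."""
    # Incremental update: only the new events are processed.
    updated = Counter(realtime_view)
    updated.update(url for url, _ts in new_events)
    return updated


def query(batch_view, realtime_view, url):
    """query = function(batch view, realtime view)."""
    # Merge the precomputed batch result with the recent increments.
    return batch_view[url] + realtime_view[url]


# Hypothetical data: events already covered by the last batch run.
master_dataset = [("/home", 1), ("/about", 2), ("/home", 3)]
batch_view = compute_batch_view(master_dataset)

# Events that arrived after the batch run started.
realtime_view = Counter()
realtime_view = update_realtime_view(realtime_view, [("/home", 4)])
realtime_view = update_realtime_view(realtime_view, [("/about", 5), ("/home", 6)])

print(query(batch_view, realtime_view, "/home"))  # 4
```

In this sketch the batch view is rebuilt from all data while the realtime view only folds in new events, so a query sees both old and recent data. In a real system the realtime view would be discarded once a later batch run has absorbed the same events.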
This article was originally published by Silicon Valley’s IO