NoSQL advantages
NoSQL (Not Only SQL) is more than just a SQL database
- Massive scalability
- Read and write performance
- It complements a relational database (RDBMS)
NoSQL products
- Key-value: Redis/Codis
- Column storage: HBase
HBase is often used for data analysis
- Graph database: Neo4j
Neo4j is often used for knowledge graphs
- Document store: MongoDB
The document concept in MongoDB
Example: describing a person
- Relational database
- MongoDB
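To make the comparison concrete, here is a minimal sketch of a "person" as a single MongoDB-style document. The field names and values are invented for illustration:

```python
# A hypothetical "person" document: nested data that would need
# separate person/address/hobby tables in a relational database
# fits naturally into a single document.
person = {
    "_id": 1001,
    "name": "Alice",
    "address": {"city": "Beijing", "street": "Main St"},  # embedded sub-document
    "hobbies": ["reading", "cycling"],                    # embedded array
}

# In the relational model this read would need JOINs across tables;
# here one document fetch returns the whole object.
print(person["address"]["city"])  # nested access without a JOIN
```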
MongoDB features
- Scalable
- High performance
- Open Source NoSQL Database
- Written in C++
- Document-Oriented Storage
- Full Index Support
- Replication & High Availability
- Auto-Sharding
- Rich Querying
- Updates
- Map/Reduce
- GridFS stores binary files
MongoDB stability
How to prevent data loss
- Recovery log (journal)
- Write concern: write to a majority of nodes
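The "write to a majority of nodes" rule can be sketched in a few lines: an acknowledged write counts as durable only once a strict majority of replica set members have applied it. The function and member counts below are illustrative, not MongoDB internals:

```python
# Sketch of majority write acknowledgment: a write is considered safe
# only when more than half of the replica set members have applied it.
def is_majority_acked(acks: int, members: int) -> bool:
    """True when `acks` reach a strict majority of `members`."""
    return acks >= members // 2 + 1

# In a 3-member replica set, 2 acks form a majority; 1 does not.
print(is_majority_acked(2, 3))  # True
print(is_majority_acked(1, 3))  # False
```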
MongoDB high availability
How to achieve a 99.99% SLA for core business?
MongoDB Replica Set
- Multiple data redundancy
- Cross-switch Deployment
- Fast leader election (see the Raft protocol)
Architecture
Master/slave replication plus a high-availability failover scheme
Sharding
Architecture
1. The business side sees no concept of splitting databases or tables
No matter how large the data volume is, it looks like a single database and a single table to the business side
Relational databases such as MySQL expose the concept of splitting databases and tables to the application
For example, MongoDB splits 1 TB of data into 2 shards, each storing 500 GB
2. There can be many routers (mongos)
3. Shard metadata is stored on the config servers
4. Each shard is a replica set
5. The router first queries the config servers for the shard metadata, then accesses the matching replica set
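Steps 3-5 above can be sketched as a toy routing function: the router consults the config metadata to find which shard (replica set) owns a key, then talks to that shard. The chunk map and shard names below are invented for illustration:

```python
# Toy model of mongos routing: look up the owning shard for a key
# in config-server metadata. Ranges and shard names are made up.
CONFIG_CHUNKS = [             # (range_start, range_end, shard_name)
    (0,   500,  "shardA"),    # keys [0, 500)   live on shard A
    (500, 1000, "shardB"),    # keys [500, 1000) live on shard B
]

def route(key: int) -> str:
    """Return the replica set that owns `key`, as a router would."""
    for lo, hi, shard in CONFIG_CHUNKS:
        if lo <= key < hi:
            return shard
    raise KeyError(f"no chunk covers key {key}")

print(route(42))   # shardA
print(route(700))  # shardB
```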
In MongoDB, a collection corresponds to a relational table
Sharding rules
Range-based sharding
Range-based sharding is the more common choice
MySQL's B+ tree index is also range-based in nature
Hash-based sharding
Shards by taking a modulus of the hashed key
Java's HashMap is hash-based in the same sense
- Point queries are fast
- Range queries are not supported
A database that cannot serve range queries is rarely acceptable
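The trade-off between the two rules can be shown side by side. The shard count and range boundary below are illustrative, not MongoDB's real chunk-splitting logic:

```python
# Contrast of the two sharding rules with invented boundaries.
SHARDS = 2

def range_shard(key: int, boundary: int = 500) -> int:
    """Range-based: nearby keys land on the same shard,
    so a range scan touches few shards."""
    return 0 if key < boundary else 1

def hash_shard(key: int) -> int:
    """Hash-based (modulus): keys spread evenly across shards,
    but a range query must fan out to every shard."""
    return hash(key) % SHARDS

# Adjacent keys 498 and 499 stay together under range sharding...
print(range_shard(498), range_shard(499))  # 0 0
# ...but may land on different shards under hash sharding.
print(hash_shard(498), hash_shard(499))
```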
Sharded cluster architecture
1. All 3 config nodes and all routing nodes (mongos) are stateless
2. Each shard is a replica set (one primary, two secondaries), i.e. 3 machines per shard
Config servers and mongos can also be co-deployed on a shard's machines
Application scenarios
1. Location data: GPS or LBS
2. Fine for small and medium-sized companies; very large data volumes are a different story
3. Anything not related to transactions can use it
a. Transaction support is weak
b. MongoDB 4.0 already supports multi-document (cross-row) transactions
Pluggable storage engines
1. The earliest storage engine was MMAP, built on the operating system's memory mapping, which only supported table-level locking
Its drawback is low memory utilization
2. WiredTiger supports row-level (document-level) locking
Document design
_id uniquely identifies a document within a collection
If not specified, a 12-byte ObjectId is generated by default
which takes up a lot of space
It is usually replaced by a business primary key such as uid; keeping a separate auto-incrementing business key alongside the default _id would be redundant
Default _id generation rule (not recommended)
1. A collection is the table-level concept
2. a. One hex character represents 4 bits
b. One byte therefore prints as 2 hex characters
c. The 12-byte _id value prints as a 24-character string
3. Readability is poor and the string form takes up a lot of space
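The byte arithmetic above can be checked directly: 12 raw bytes rendered in hex yield a 24-character string. The random bytes below merely stand in for a real ObjectId's contents:

```python
# Why the default _id reads as a 24-character string: an ObjectId is
# 12 raw bytes, and each byte prints as 2 hex characters (4 bits each).
import os

raw = os.urandom(12)   # stand-in for a real ObjectId's 12 bytes
hex_id = raw.hex()

print(len(raw))     # 12 bytes
print(len(hex_id))  # 24 hex characters
```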
Recommended _id generation rule
1. Use uint64_t, i.e. an 8-byte long, as the _id
Besides ObjectId, _id values can also be plain numeric types (int, long, double)
Free schema
Schema-less means the schema is repeated: every document stores its own field names
How to deal with it
Field name choice: shorten field names to save space
But with shortened field names, how do you keep them readable?
MongoDB document size limit
MongoDB limits each document to 16 MB
Small-data scenario
Exam scores, personal profiles
Large-data scenario
Card-game data such as Legends of the Three Kingdoms (三国杀) and its generals
Referenced vs. embedded documents
- Reference association
- Embedded documents
For logging scenarios, referencing (Host & Log) is more effective
A machine's logs cannot be embedded in the host document; they can only be referenced
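A minimal sketch of the host-and-logs case: logs grow without bound, so embedding them would eventually hit the 16 MB document limit; instead each log references its host. The field names are invented:

```python
# Referencing instead of embedding: logs live in their own collection
# and point back at the host by _id, so the host document stays small.
host = {"_id": "host-1", "hostname": "web01"}           # one document per machine

logs = [                                                 # separate collection
    {"_id": 1, "host_id": "host-1", "msg": "boot"},      # reference, not embed
    {"_id": 2, "host_id": "host-1", "msg": "disk full"},
]

# Resolving the reference is an application-side lookup:
host_logs = [log for log in logs if log["host_id"] == host["_id"]]
print(len(host_logs))  # 2
```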
Locking mechanism
Pessimistic locking
Pessimistic concurrency control
- Uses write locks to keep resources from being accessed simultaneously
- Read and write operations are mutually exclusive
Scope of the pessimistic lock
1. Parse the command
2. Create the write command
3. Read the document from disk into memory
4. Execute the write command against the in-memory data
5. Modify the in-memory structure (the lock is held only in this step)
6. Return the result
Optimistic locking
Since MongoDB 3.0, WiredTiger uses an MVCC mechanism
Optimistic, lock-free concurrency control
The meaning of optimistic locking: assume conflicts are rare and detect them at write time
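The optimistic idea can be sketched as a compare-and-swap on a version field: apply the write only if no one else has modified the document since it was read, otherwise retry. The field names below are invented, not MongoDB API:

```python
# Minimal sketch of optimistic (lock-free) concurrency control:
# each document carries a version; an update applies only when the
# version is unchanged, otherwise the writer must re-read and retry.
doc = {"_id": 1, "balance": 100, "version": 1}

def cas_update(d: dict, expected_version: int, new_balance: int) -> bool:
    """Compare-and-swap: apply the write only if nobody got there first."""
    if d["version"] != expected_version:
        return False               # conflict detected: caller retries
    d["balance"] = new_balance
    d["version"] += 1
    return True

print(cas_update(doc, 1, 150))  # True  (first writer wins)
print(cas_update(doc, 1, 200))  # False (stale version, must retry)
```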
Compression algorithms
- Snappy
- Zlib (commonly used)
Pitfalls encountered and solutions
Mass data deletion and its solution
Background
Solution
Massive data fragmentation (holes) and its solution
Background
Deleting massive amounts of data causes fragmentation
After data is deleted, many holes are left behind that are not reclaimed immediately
MMAP was used in the early days
Hole (empty) data also gets loaded into memory
For example, with 64 GB of memory, a large share may be holes and only 10 GB valid data
Solution
Defragmentation scheme
- Online data compaction
- Shrink the database
Concrete shrink steps
Compression effect comparison
85 GB before compaction vs. 34 GB after: 51 GB of space saved, greatly improving performance
Outline of the steps above
1. On a slave, wipe the data directory (rm -rf *)
2. Restart the slave
3. The slave resynchronizes a fresh copy from the master, with no fragmentation at all
4. Run stepDown on the master to demote it to a slave
5. Wipe and restart the old master so it resynchronizes as a new slave
(fragmentation is not carried over during resynchronization)
Note:
a. With one master and two slaves, promote the most up-to-date slave to master
b. For data lost between master and slave during the switchover, the business system writes it back to MongoDB via a compensation mechanism