NoSQL advantages

NoSQL ("Not Only SQL") is more than just a database

  • Massive scalability
  • High read and write performance
  • It complements a relational database (RDBMS)

NoSQL products

  • Key-value: Redis/Codis

  • Column storage: HBase

HBase is often used for data analysis

  • Graph database: Neo4j

Neo4j is often used for knowledge graphs

  • Document: MongoDB

MongoDB concepts

Example: Describe people

  • Relational database

  • MongoDB

MongoDB features

  • Scalable
  • High performance
  • Open Source NoSQL Database
  • Written in C++
  • Document-Oriented Storage
  • Full Index Support
  • Replication & High Availability
  • Auto-Sharding
  • Rich Querying
  • Updates
  • Map/Reduce
  • GridFS stores binary files

MongoDB stability

How data loss is prevented

  • Recovery log (journal)

  • Write concern

Write to a majority of nodes
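The "write to a majority of nodes" rule can be sketched as a simple acknowledgement count. This is an illustrative model (the node counts are arbitrary), not MongoDB's implementation:

```python
def majority(n_nodes: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return n_nodes // 2 + 1

def write_acknowledged(acks: int, n_nodes: int) -> bool:
    """A majority write succeeds only once a majority of
    replica-set members have acknowledged it."""
    return acks >= majority(n_nodes)

# A 3-node replica set: 2 acknowledgements are enough, 1 is not.
print(majority(3))               # 2
print(write_acknowledged(2, 3))  # True
print(write_acknowledged(1, 3))  # False
```

A write acknowledged by a majority survives the loss of any minority of nodes, which is what makes it a defence against data loss.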


MongoDB high availability

How to achieve an SLA of 99.99% for core business?

MongoDB Replica Set

  • Multiple data redundancy
  • Cross-switch Deployment
  • Fast primary election (see the Raft protocol)

Architecture

Master/slave replication plus a high-availability solution

Sharding

Architecture


1. For the business side, there is no concept of splitting databases or tables

No matter how large the data set is, the application sees a single database and a single table

Relational databases such as MySQL, by contrast, do have the concept of splitting into multiple databases and tables

For example, MongoDB divides 1 TB of data into 2 shards, each storing 500 GB



2. There can be many routers (mongos)

3. Shard metadata is stored on the config servers

4. A shard is a replica set

5. The router first queries the config server for the shard information, and then accesses the corresponding replica set
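The routing flow above can be sketched as a lookup against a chunk map like the one a config server holds. The chunk boundaries and shard names below are invented for illustration:

```python
import bisect

# Hypothetical chunk map: each entry is (upper_bound_exclusive, shard_name).
CHUNKS = [(1000, "shardA"), (2000, "shardB"), (float("inf"), "shardC")]
BOUNDS = [upper for upper, _ in CHUNKS]

def route(shard_key: int) -> str:
    """What a router does conceptually: look up which chunk owns
    the key, then forward the query to that shard (replica set)."""
    idx = bisect.bisect_right(BOUNDS, shard_key)
    return CHUNKS[idx][1]

print(route(500))    # shardA
print(route(1500))   # shardB
print(route(99999))  # shardC
```

Because the routers only perform this lookup and hold no data themselves, many of them can run in parallel.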

A MongoDB collection corresponds to a table


Sharding rules

Range-based sharding

Range-based sharding is the more commonly used approach

MySQL's B+ tree index is also range-based in nature

Hash-based sharding

Shards by taking the hash of the key modulo the number of shards

Java's HashMap is hash-based in the same way

Point queries are fast

Range queries are not supported, and a database that cannot serve range queries is rarely acceptable
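Hash-based routing, and why range queries suffer under it, can be sketched as follows (the shard count, toy hash function, and keys are all invented for illustration):

```python
N_SHARDS = 4

def hash_shard(key: str) -> int:
    """Route a key by hashing it and taking the modulus, like
    bucket selection in a hash map. A tiny deterministic hash
    (sum of bytes) stands in for a real hash function."""
    return sum(key.encode()) % N_SHARDS

def shards_for_point(key: str) -> set:
    """A point query touches exactly one shard."""
    return {hash_shard(key)}

def shards_for_range(keys: list) -> set:
    """A range query scatters: adjacent keys hash to unrelated
    buckets, so many shards must be queried."""
    return {hash_shard(k) for k in keys}

point = shards_for_point("user:42")
scatter = shards_for_range([f"user:{i}" for i in range(100)])
print(len(point))    # 1
print(len(scatter))  # up to N_SHARDS shards must be queried
```

This is the trade-off in the notes: hashing balances load and makes point lookups cheap, but destroys key ordering, so a range scan degenerates into querying every shard.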

Sharded cluster architecture


1. All 3 config nodes and 3 routing nodes (mongos) are stateless

2. Each shard is a replica set (one primary, two secondaries), i.e. 3 machines per shard

The config servers and mongos can also be co-deployed on a shard's machines

Application scenarios


1. Location-based services (GPS/LBS)

2. Fine for small and medium-sized companies; very large data volumes are certainly a different matter

3. Anything that is not transaction-related can use it

a. Transaction support is weak

b. MongoDB 4.0 already supports transactions, including multi-document transactions

Pluggable storage engines

1. The earliest storage engine was MMAP, which relied on the operating system's built-in memory mapping and supported only table-level locking

Its disadvantage is low memory utilization

2. WiredTiger supports row-level locking

Document design


_id uniquely identifies a document within a collection

If it is not specified, a 12-byte ObjectId is generated by default

It takes up a lot of space

It is usually replaced by a business primary key such as a uid; an auto-incremented surrogate key adds little otherwise

Default _id generation rule (not recommended)


1. A collection is table-level

2. a. One hexadecimal character represents 4 bits

b. One byte is therefore two hexadecimal characters

c. The 12-byte _id value is thus a 24-character hexadecimal string

3. Readability is very poor and it takes up a lot of space
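The byte/hex arithmetic above can be checked by decoding a sample ObjectId. The hex value below is made up; the field split shown (4-byte timestamp, 5 random bytes, 3-byte counter) is the modern ObjectId layout:

```python
# A hypothetical 24-character ObjectId hex string (12 bytes).
oid_hex = "65f1a2b3c4d5e6f708192a3b"
assert len(oid_hex) == 24      # 2 hex characters per byte

raw = bytes.fromhex(oid_hex)
assert len(raw) == 12          # 12 bytes total

# ObjectId layout: 4-byte seconds-since-epoch timestamp,
# 5 bytes of random value, 3-byte incrementing counter.
timestamp = int.from_bytes(raw[0:4], "big")
random_part = raw[4:9]
counter = int.from_bytes(raw[9:12], "big")

print(timestamp)  # seconds since the Unix epoch
print(counter)
```

The 24-character hex encoding is exactly why the notes call the default _id bulky and unreadable compared with an 8-byte integer uid.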

Recommended _id generation rule


1) uint64_t, in practice a long type, is 8 bytes

Besides the default ObjectId, an _id can also be stored as an integer, long, or floating-point value

Schema-free

Schema-free effectively means a repeated schema: every document stores its own field names

How to deal with this

Field name selection

Shortened field names save space, but readability must still be ensured


MongoDB data volume limit

MongoDB limits each document to 16 MB

Small-data scenarios

Test scores, personal information

Large-data scenarios

For example, the card game Three Kingdoms Kill and its general (hero) data

References (normalized associations)

Embedded documents

For logging scenarios, referencing (separate Host and Log collections) is more effective

Machines and their logs cannot be embedded in one another; they can only be referenced
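The two modeling styles can be sketched as plain documents. All field names and values here are invented examples:

```python
# Embedded: a person document carries its address inline.
person_embedded = {
    "_id": "uid_1001",
    "name": "Alice",
    "address": {"city": "Beijing", "street": "Chang'an Ave"},
}

# Referenced: hosts and their logs live in separate collections,
# linked by host_id, because one host produces unbounded logs
# and a single document is capped at 16 MB.
host = {"_id": "host_7", "hostname": "web-01"}
log_entries = [
    {"_id": "log_1", "host_id": "host_7", "line": "GET /index 200"},
    {"_id": "log_2", "host_id": "host_7", "line": "GET /missing 404"},
]

# "Join" by reference in application code:
logs_for_host = [l for l in log_entries if l["host_id"] == host["_id"]]
print(len(logs_for_host))  # 2
```

Embedding suits small, bounded sub-objects read together with the parent; referencing suits unbounded one-to-many data such as logs.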


Locking mechanism

Pessimistic locking

Pessimistic lock concurrency control

  • Use write locks to protect resources from simultaneous access
  • Read and write operations are mutually exclusive

Scope of the pessimistic lock

1. Parse the command

2. Create a write command

3. Read the document from disk into memory

4. Execute the write command against the in-memory data

5. Modify the in-memory structure (the lock is held only in this step)

6. Return the result
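The point of step 5 (lock only the in-memory modification) can be sketched with a plain mutex: parsing and disk reads stay outside the critical section, and only the in-memory mutation is serialized. This is a conceptual model, not MongoDB internals:

```python
import threading

lock = threading.Lock()
store = {"counter": 0}  # stands in for the in-memory document

def write_command(delta: int) -> int:
    # Steps 1-4: parse, build the write command, read data into
    # memory; none of this holds the lock.
    planned = delta
    # Step 5: only the in-memory structure change is locked.
    with lock:
        store["counter"] += planned
        result = store["counter"]
    # Step 6: return the result outside the lock.
    return result

threads = [threading.Thread(target=write_command, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(store["counter"])  # 100
```

Keeping the lock scope to the single in-memory step is what makes pessimistic locking tolerable: the slow disk I/O never blocks other writers.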

Optimistic locking

MongoDB 3.0's WiredTiger engine uses an MVCC mechanism

Optimistic, lock-free concurrency control


Meaning of optimistic locking
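The idea can be sketched as a compare-and-swap on a document version: a writer commits only if nobody changed the document since it was read, otherwise it retries. This models the principle behind MVCC-style optimistic control, not WiredTiger's actual implementation:

```python
doc = {"value": 10, "version": 1}

def optimistic_update(read_version: int, new_value: int) -> bool:
    """Commit only if the document still has the version we read;
    otherwise the caller must re-read and retry."""
    if doc["version"] != read_version:
        return False  # someone else committed first: retry
    doc["value"] = new_value
    doc["version"] += 1
    return True

v = doc["version"]                       # both writers read version 1
assert optimistic_update(v, 20) is True  # first writer wins
assert optimistic_update(v, 30) is False # second writer must retry
print(doc)  # {'value': 20, 'version': 2}
```

No lock is ever held; conflicts are detected at commit time, which is why this approach shines when contention is low.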


Compression algorithm

Snappy


Zlib (common)


Pitfalls encountered and solutions

Mass data deletion and its solution

Background


The solution


Massive data fragmentation (holes) and solutions

Background


Massive data deletion causes memory fragmentation

After data is deleted, it leaves many holes that are not immediately reclaimed

MMAP was used early on

Empty (hole) data was loaded into memory along with valid data

For example, 64 GB of memory might hold a large amount of empty data and only 10 GB of valid data

The solution


Defragmentation options

  • Online data compaction

  • Shrinking the database

Specific data shrinkage steps


Compression effect comparison

85 GB before shrinking and 34 GB after: 51 GB of space saved, greatly improving performance

The above steps, in outline:

1. On a secondary, run rm -rf * to wipe its data directory

2. Restart the secondary

3. It resynchronizes a complete copy from the primary, with no fragmentation at all on the secondary

4. Run stepDown on the primary to demote it to a secondary

5. Wipe and restart the old primary so that it resynchronizes as a fresh secondary

(fragmentation left by deleted data is not carried over during resynchronization)

Note:

a. In a one-primary, two-secondary setup, the primary is demoted in favor of the freshly resynchronized secondary

b. Any writes lost between primary and secondary during the switchover are re-saved to MongoDB by a business-level compensation mechanism