Introduction

MMKV is an efficient, small, easy-to-use mobile key-value storage framework used in the WeChat application. It’s currently available on Android, iOS/macOS, Win32 and POSIX.

As a compact, easy-to-use, and powerful cross-platform K-V storage framework, MMKV has the following features:

  • Efficient:
    • mmap is used to map files directly into memory.
    • Protobuf is used to encode and decode key-values.
    • Multi-process concurrency is supported.
  • Easy to use: no manual synchronize calls or configuration are required; synchronization is fully automatic.
  • Compact:
    • A small number of files: only the codec utility classes and the mmap logic, with no redundant dependencies.
    • A small binary: under 30K, and smaller still once packed into an IPA.

For concrete performance numbers, the WeChat team provides a simple benchmark. In short, it beats Apple’s NSUserDefaults by a factor of about 100.

Note that the article you are reading is a rewritten 2.0 version. Not long ago, MMKV quietly released a major update, v1.1.0, and the original implementation is now completely unrecognizable 💔. The release notes give the reason:

We refactor the whole MMKV project and unify the cross-platform Core library. From now on, MMKV on iOS/macOS, Android, Win32 all share the same core logic code.

In addition, this article covers a large number of C++ implementations, so let me know if the description is wrong.

Preparation

Before we begin, we need to understand a few concepts. If you are already familiar with them, feel free to skip ahead.

mmap

mmap is a method of memory-mapping files: a file or other object is mapped into the process address space, establishing a correspondence between the file’s location on disk and a range of virtual addresses in the process. Once the mapping exists, the process can read and write that memory through pointers, and the system automatically writes dirty pages back to the file on disk, so file operations complete without calling system calls such as read and write. Conversely, changes the kernel makes to this region are directly visible in user space, which allows different processes to share a file.

In general, file reads and writes go through the page cache, which acts as a relay between the kernel and the application layer. A file operation therefore needs two copies of the data (disk to page cache, then page cache to an application buffer), whereas mmap maps the cached pages straight into user space and avoids the second copy. Of course, there are pros and cons. As the Apple documentation states, mmap is effective in the following scenarios:

  • You have a large file whose contents you want to access randomly one or more times.
  • You have a small file whose contents you want to read into memory all at once and access frequently. This technique is best for files that are no more than a few virtual memory pages in size.
  • You want to cache specific portions of a file in memory. File mapping eliminates the need to cache the data at all, which leaves more room in the system disk caches for other data.

Therefore, MMAP is most efficient when we need to access a small portion of a larger file at a high frequency.

In fact, MMKV is not alone: WeChat’s XLog, Meituan’s Logan logging tool, and SQLite also use mmap to improve file access efficiency in high-frequency update scenarios.
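To make the mechanism concrete, here is a minimal POSIX sketch. It is not MMKV’s code: the helper name writeThroughMapping and the file path used with it are assumptions for illustration. It writes to a file purely through a mapped pointer and then reads the bytes back with an ordinary file read.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper (not part of MMKV): stores `text` in the file at `path`
// through a shared mapping, then reads it back with pread().
// Returns the bytes read back, or an empty string on failure.
std::string writeThroughMapping(const char *path, const std::string &text) {
    const size_t pageSize = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    assert(!text.empty() && text.size() <= pageSize);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0) return {};
    // The file must be at least as large as the region we touch.
    if (ftruncate(fd, static_cast<off_t>(pageSize)) != 0) { close(fd); return {}; }

    void *ptr = mmap(nullptr, pageSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) { close(fd); return {}; }

    std::memcpy(ptr, text.data(), text.size()); // a plain memory write, no write() call
    msync(ptr, pageSize, MS_SYNC);              // optional: flush the dirty page right away
    munmap(ptr, pageSize);

    std::string readBack(text.size(), '\0');
    ssize_t n = pread(fd, &readBack[0], readBack.size(), 0);
    close(fd);
    return n == static_cast<ssize_t>(readBack.size()) ? readBack : std::string();
}
```

Calling writeThroughMapping("/tmp/mmap_sketch.bin", "hello") should return the same string, showing that the pointer write reached the file.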

Protocol Buffer

Protobuf is a method of serializing structured data. It is useful in developing programs to communicate with each other over a wire or for storing data. The method involves an interface description language that describes the structure of some data and a program that generates source code from that description for generating or parsing a stream of bytes that represents the structured data.

It was originally created to solve server-side compatibility issues between old and new protocol versions, hence the name “protocol buffer”, but it has since come to be used for data transfer, storage, and more.

MMKV weighs Protobuf’s performance and space efficiency and uses a simplified version of Protobuf as its serialization scheme. MMKV also extends Protobuf with incremental updates: a serialized K-V object is appended directly to the end of the mapped memory.

So how does Protobuf implement efficient coding?

  1. It uses Tag-Value (Tag-Length-Value) encoding, which reduces the use of separators and stores data more compactly.
  2. It uses base 128 varint (variable-length encoding), so the binary data is very compact and the pb payload is smaller. However, pb is not compressed to the limit: float and double are not compressed.
  3. Compared with JSON and XML, it drops symbols such as {, }, and :, which reduces the size a little more. Combined with varint, the result is smaller still after GZIP compression.
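The base 128 varint idea from point 2 can be sketched as follows. This is a generic illustration of the encoding rather than MMKV’s own implementation; the function names writeVarint32 and readVarint32 are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends `value` as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last one.
void writeVarint32(std::vector<uint8_t> &out, uint32_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<uint8_t>(value | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<uint8_t>(value));
}

// Decodes one varint starting at `pos` and advances `pos` past it.
uint32_t readVarint32(const std::vector<uint8_t> &in, size_t &pos) {
    uint32_t result = 0;
    for (int shift = 0; shift < 35; shift += 7) {
        assert(pos < in.size()); // truncated input is a caller error in this sketch
        const uint8_t byte = in[pos++];
        result |= static_cast<uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) break; // high bit clear: this was the last byte
    }
    return result;
}
```

Small values dominate typical key-value data, so most integers end up taking one or two bytes instead of a fixed four; the value 300, for instance, encodes to the two bytes 0xAC 0x02.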

CRC check

A cyclic redundancy check (CRC) is a hash function that generates a short, fixed-length check code from data such as network packets or computer files. It is mainly used to detect errors that may occur after data is transmitted or saved. The check value is computed before transmission or storage and appended to the data; the receiver then recomputes it to determine whether the data has changed.

CRC is also used to compute checksum values. Compared with MD5 or SHA1, CRC is more efficient but less secure. Because file systems and operating systems are not perfectly reliable, MMKV adds a CRC check to identify invalid data.

In WeChat’s production environment on iOS, data verification fails about 700,000 times a day.
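For readers who have not seen a CRC computed, here is a minimal bitwise CRC-32 sketch using the common reflected polynomial 0xEDB88320 and the same calling convention as zlib’s crc32 (pass 0 as the initial value). The name crc32Bitwise is an assumption for this example; real implementations use lookup tables or hardware instructions for speed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bit-at-a-time CRC-32. Slow but short, and easy to check by hand.
uint32_t crc32Bitwise(uint32_t crc, const uint8_t *buf, size_t len) {
    assert(buf != nullptr || len == 0);
    crc = ~crc; // pre-inversion
    for (size_t i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; ++bit) {
            // If the lowest bit is set, shift and XOR in the polynomial.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc; // post-inversion
}
```

The standard check input "123456789" yields 0xCBF43926, and flipping any single bit of the input changes the result, which is what makes the digest useful for spotting corrupted files.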

Initialization

In v1.1.0, the Tencent team rewrote the entire MMKV project to unify the cross-platform core library. Since then, MMKV has shared the same core logic on iOS/macOS, Android, and Win32. This improves maintainability and lets every platform benefit from the same improvements. It is also why multi-process access is now available on iOS/macOS.

In terms of code structure, MMKV is separate from MMKVCore: on Apple platforms, MMKV is an Objective-C wrapper layer built on top of MMKV Core.

The original implementation was largely migrated into MMKV Core and rewritten in C++. Platform-specific APIs and logic are isolated through separate file names and macros. Take MemoryFile as an example:

MemoryFile.h
MemoryFile.cpp
MemoryFile_Android.cpp
MemoryFile_OSX.cpp
MemoryFile_Win32.cpp

In this article we focus on Apple-related logic.

Class Initialize

The preparation of MMKV before use is divided into two stages:

  1. Initialize static variables such as g_instanceDic. This happens in the MMKV class’s +initialize, which runs automatically the first time the class is used.
  2. The user manually calls +initializeMMKV: to set g_basePath, the root directory of MMKV.

+ (void)initialize {
    if (self == MMKV.class) {
        g_instanceDic = [[NSMutableDictionary alloc] init];
        g_lock = new mmkv::ThreadLock();
        g_lock->initialize();

        mmkv::MMKV::minimalInit([self mmkvBasePath].UTF8String);

        /* Register startup notification */
    }
}

In class initialization, four things are done:

  • g_instanceDic: container for global MMKV instances. Key is generated by mixing multiple fields, described later.
  • g_lock: the thread lock that protects g_instanceDic;
  • minimalInit: initializes the global variables in MMKV Core using the default MMKV root directory (~/Documents/mmkv);
  • Registers for notifications related to the app lifecycle (main iOS app only).

I wondered why dispatch_once is not used here to guard against re-entry.

Looking at the history of this file, earlier versions did use dispatch_once for exactly that purpose. So is there something to be gained by writing it the current way?

We know that +initialize may be called multiple times, but how and by whom is often misunderstood.

Take MMKV as an example. Suppose we declare MMKVTest as a subclass of MMKV. If MMKVTest does not implement +initialize, or if it explicitly calls [super initialize] in its own +initialize, then MMKV’s +initialize will be called multiple times.

Because +initialize is a class method, re-entry can be avoided by checking which class is being initialized. That is exactly the purpose of the check self == MMKV.class on the first line.

MinimalInit

protect from some old code that don’t call initializeMMKV()

To ensure that relevant properties are initialized when they are accessed, global variables need to be prepared during class initialization.

void MMKV::minimalInit(MMKVPath_t defaultRootDir) {
    ThreadLock::ThreadOnce(&once_control, initialize);

    int device = 0, version = 0;
    GetAppleMachineInfo(device, version);
#    ifdef __aarch64__
    if ((device == iPhone && version >= 9) || (device == iPad && version >= 7)) {
        CRC32 = mmkv::armv8_crc32;
    }
#    endif

    g_rootDir = defaultRootDir;
    mkPath(g_rootDir);
}

This approach minimizes the number of things that must be done before the application is launched. The main three things are:

  1. Execute initialize to complete init for the global variable.
  2. Determine the CRC check algorithm;
  3. Generate MMKV root directory;

Initialize

ThreadLock::ThreadOnce uses pthread_once under the hood to ensure that initialize() is executed only once; g_rootDir is then used to create the corresponding directory. Let’s see what the private initialize function does:

void initialize() {
    g_instanceDic = new unordered_map<string, MMKV *>;
    g_instanceLock = new ThreadLock();
    g_instanceLock->initialize();

    mmkv::DEFAULT_MMAP_SIZE = mmkv::getPageSize();
}

g_instanceDic and g_instanceLock are also maintained inside the MMKV Core implementation. That makes the pair below look redundant, so why are they still kept on iOS/macOS? If you know, please tell me.

static NSMutableDictionary *g_instanceDic = nil;
static mmkv::ThreadLock *g_lock;

CRC32

This function computes the checksum digest of a file’s contents.

typedef uint32_t (*CRC32_Func_t)(uint32_t crc, const unsigned char *buf, size_t len);
extern CRC32_Func_t CRC32;

By default, the CRC32 function pointer points to:

static inline uint32_t _crc32Wrap(uint32_t crc, const unsigned char *buf, size_t len) {
    return static_cast<uint32_t>(::crc32(crc, buf, static_cast<uInt>(len)));
}

When the CPU architecture is aarch64, the pointer is switched to mmkv::armv8_crc32. The CRC32 instructions require the A10 chip, that is, the iPhone 7 or the sixth-generation iPad, so GetAppleMachineInfo is used to obtain the device type and hardware version before deciding.
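The pointer-swapping pattern can be sketched like this. Everything here is illustrative: selectCRC32 and its cpuHasCRC32 flag are hypothetical stand-ins for the device check, and the hardware path is only compiled when the compiler advertises the ARM CRC32 extension (__ARM_FEATURE_CRC32), in which case the __crc32b intrinsic from arm_acle.h is used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
#endif

using ChecksumFunc = uint32_t (*)(uint32_t crc, const uint8_t *buf, size_t len);

// Portable fallback: bitwise CRC-32 with reflected polynomial 0xEDB88320.
static uint32_t softwareCRC32(uint32_t crc, const uint8_t *buf, size_t len) {
    assert(buf != nullptr || len == 0);
    crc = ~crc;
    for (size_t i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

#if defined(__ARM_FEATURE_CRC32)
// Same checksum, one hardware instruction per byte.
static uint32_t hardwareCRC32(uint32_t crc, const uint8_t *buf, size_t len) {
    crc = ~crc;
    for (size_t i = 0; i < len; ++i) {
        crc = __crc32b(crc, buf[i]);
    }
    return ~crc;
}
#endif

// Hypothetical selector: `cpuHasCRC32` stands in for the device/version check.
ChecksumFunc selectCRC32(bool cpuHasCRC32) {
#if defined(__ARM_FEATURE_CRC32)
    if (cpuHasCRC32) return hardwareCRC32;
#else
    (void) cpuHasCRC32; // no hardware path in this build
#endif
    return softwareCRC32;
}
```

Because callers only ever go through the function pointer, the choice is made once at startup and the rest of the code does not need to know which implementation is active.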

As a final step, the size of the memory page is obtained for subsequent file access calculations and stored in DEFAULT_MMAP_SIZE.

Registering notifications

MMKV defines macros for each platform in Core/MMKVPredef.h; notifications are registered only in the main iOS app:

#if defined(MMKV_IOS) && !defined(MMKV_IOS_EXTENSION)
if ([[[NSBundle mainBundle] bundlePath] hasSuffix:@".appex"]) {
    g_isRunningInAppExtension = YES;
}

The g_isRunningInAppExtension static variable is added as a safeguard in case the MMKV_IOS_EXTENSION macro was not defined.

Two notification handlers are registered, didEnterBackground and didBecomeActive, to observe changes in UIApplicationState. At registration time, the current applicationState is also read and passed to:

void MMKV::setIsInBackground(bool isInBackground)

This updates g_isInBackground, so that file writes can be performed safely in the background.

InitializeMMKV

+initializeMMKV:logLevel: or one of its convenience variants must be called manually once before MMKV is used.

The method uses static BOOL g_hasCalledInitializeMMKV internally to prevent being called more than once:

if (g_hasCalledInitializeMMKV) {
    MMKVWarning("already called +initializeMMKV before, ignore this request");
    return [self mmkvBasePath];
}
g_hasCalledInitializeMMKV = YES;

The first argument of +initializeMMKV: is rootDir, which updates g_basePath; if it is empty, the default value is used. logLevel is then passed along to the initialization method provided by MMKV Core:

void MMKV::initializeMMKV(const MMKVPath_t &rootDir, MMKVLogLevel logLevel) {
    g_currentLogLevel = logLevel;

    ThreadLock::ThreadOnce(&once_control, initialize);

    g_rootDir = rootDir;
    mkPath(g_rootDir);
}

ThreadLock::ThreadOnce is also called to ensure that the MMKV Core is successfully initialized.
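The effect of sharing one once-control between both entry points can be sketched with pthread_once directly. The names below (coreInit, minimalInitSketch, fullInitSketch) are hypothetical stand-ins, not MMKV functions.

```cpp
#include <cassert>
#include <pthread.h>

static pthread_once_t g_onceControl = PTHREAD_ONCE_INIT;
static int g_initCount = 0; // how many times coreInit() really ran

// Stand-in for the one-time setup of global state.
static void coreInit() { ++g_initCount; }

// Two hypothetical entry points that share the same once-control,
// so whichever is called first performs the setup and the other is a no-op.
void minimalInitSketch() {
    const int rc = pthread_once(&g_onceControl, coreInit);
    assert(rc == 0);
    (void) rc;
}

void fullInitSketch() {
    const int rc = pthread_once(&g_onceControl, coreInit);
    assert(rc == 0);
    (void) rc;
}

int coreInitCount() { return g_initCount; }
```

However the calls are interleaved, including from different threads, coreInit runs exactly once, so code paths that never call the explicit initializer still see initialized globals.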

In v1.1, thanks to the unified underlying implementation, the iOS side can support multi-process access. This requires one extra control parameter, and the corresponding method is:

+initializeMMKV:groupDir:logLevel:. Internally it calls the method above, but also sets one extra global variable:

g_groupPath = [groupDir stringByAppendingPathComponent:@"mmkv"];

Instance Initialize

mmkvWithID

MMKV also provides several convenience methods for obtaining an instance. They all end up in the following private class method:

+ (instancetype)mmkvWithID:(NSString *)mmapID
                  cryptKey:(NSData *)cryptKey
              relativePath:(nullable NSString *)relativePath
                      mode:(MMKVMode)mode

Note that because relativePath and mode are mutually exclusive and cannot be set at the same time, this is considered private. Let’s find out.

First, g_hasCalledInitializeMMKV is checked to confirm that +initializeMMKV: has been executed, and mmapID is checked for validity.

After locking SCOPED_LOCK(g_lock), it’s time to deal with relativePath and mode:

if (mode == MMKVMultiProcess) {
    if (!relativePath) {
        relativePath = g_groupPath;
    }
    if (!relativePath) {
        MMKVError("Getting a multi-process MMKV [%@] without setting groupDir makes no sense", mmapID);
        MMKV_ASSERT(0);
    }
}

g_groupPath exists to serve multi-process mode; in single-process mode its value is naturally nil, so there is no conflict. What this logic does is also clear: in multi-process mode, relativePath falls back to g_groupPath when it is nil, and the result is guaranteed not to be empty.

Why can’t it be empty? The MMKVError message states the reason plainly.

Initialize the MMKV instance

NSString *kvKey = [MMKV mmapKeyWithMMapID:mmapID relativePath:relativePath];
MMKV *kv = [g_instanceDic objectForKey:kvKey];
if (kv == nil) {
    kv = [[MMKV alloc] initWithMMapID:mmapID cryptKey:cryptKey relativePath:relativePath mode:mode];
    if (!kv->m_mmkv) {
        return nil;
    }
    kv->m_mmapKey = kvKey;
    [g_instanceDic setObject:kv forKey:kvKey];
}

First, a kvKey is generated from mmapID and relativePath to identify the MMKV instance, which is eventually stored in g_instanceDic. If relativePath is a valid string, the key is the MD5 value of the combination of relativePath and mmapID.

Next, the instance is looked up by key. If none exists, a new MMKV instance is initialized and saved to g_instanceDic.

Each instance also stores its key in m_mmapKey and uses it to remove itself from g_instanceDic when it is closed.
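The lookup-or-create flow can be sketched in plain C++. This is a simplified illustration under stated assumptions: StoreSketch and storeWithID are hypothetical names, and the key is a plain concatenation of relativePath and mmapID rather than the MD5 digest described above.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an MMKV instance; it only remembers its cache key.
struct StoreSketch {
    std::string key;
};

// Returns the cached instance for (mmapID, relativePath), creating it on first use.
// Assumption: the key is "relativePath/mmapID" instead of an MD5 digest.
std::shared_ptr<StoreSketch> storeWithID(const std::string &mmapID,
                                         const std::string &relativePath = "") {
    static std::mutex lock;
    static std::unordered_map<std::string, std::shared_ptr<StoreSketch>> instances;

    assert(!mmapID.empty());
    std::lock_guard<std::mutex> guard(lock); // plays the role of SCOPED_LOCK(g_lock)

    const std::string key = relativePath.empty() ? mmapID : relativePath + "/" + mmapID;
    auto it = instances.find(key);
    if (it != instances.end()) {
        return it->second; // already created: hand back the same instance
    }
    auto store = std::make_shared<StoreSketch>();
    store->key = key;
    instances.emplace(key, store);
    return store;
}
```

Requesting the same ID twice returns the same object, while the same ID under a different relativePath yields a separate instance, which is why the path has to be part of the key.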

initWithMMapID

The m_mmkv instance is obtained through MMKV Core’s mmkv::MMKV::mmkvWithID method; mmapID, cryptKey, and relativePath are converted to C strings before being passed in.

The MMKV Core constructor follows the same logic as the earlier iOS implementation, only written in C++. Besides assigning default values to its member variables, it finally calls loadFromFile() to load the MMKV file and run the CRC check. We won’t post the full implementation of the constructor, but let’s take a look at the declaration:

#ifndef MMKV_ANDROID
    MMKV(const std::string &mmapID, MMKVMode mode, std::string *cryptKey, MMKVPath_t *relativePath);
    std::string m_mmapKey;
#else // defined(MMKV_ANDROID)
    MMKV(const std::string &mmapID, int size, MMKVMode mode, std::string *cryptKey, MMKVPath_t *relativePath);

    MMKV(const std::string &mmapID, int ashmemFD, int ashmemMetaFd, std::string *cryptKey = nullptr);
#endif

Data Structure

In this section, we give a brief introduction to the main data structures and some of the variables used in MMKV. A basic understanding of them lets us focus on the core logic when we walk through the implementation.

Let’s start with the instance variables of MMKV:

mmkv::MMKVMap m_dic; // the K-V pairs currently mapped into memory
std::string m_mmapID;
MMKVPath_t m_path;    // mmkv path
MMKVPath_t m_crcPath; // crc file path

mmkv::MemoryFile *m_file; // mmap information for the real data file, including the file descriptor, etc.
size_t m_actualSize; // memory currently occupied by the K-V data
mmkv::CodedOutputData *m_output; // the remaining space in the mapping

bool m_needLoadFromFile; // whether m_file needs to be reloaded
bool m_hasFullWriteback; // whether a full write-back is needed, e.g. m_file failed to load or memory is abnormal

uint32_t m_crcDigest;
mmkv::MemoryFile *m_metaFile; // mmap information for the CRC file, including the file descriptor, etc.
mmkv::MMKVMetaInfo *m_metaInfo; // the digest, size, etc. stored in the CRC file

mmkv::AESCrypt *m_crypter; // encryptor, which recalculates the encrypted value after the file content is updated

mmkv::ThreadLock *m_lock; // K-V file lock
mmkv::FileLock *m_fileLock; // CRC file lock
mmkv::InterProcessLock *m_sharedProcessLock; // read lock
mmkv::InterProcessLock *m_exclusiveProcessLock; // write lock

The above variables are initialized when the MMKV constructor is called.

MMKVMap

First is MMKVMap, whose definition differs between Apple and other platforms. On Apple platforms the key is an NSString and the value is an MMBuffer; in addition, the KeyHasher and KeyEqualer functors must be supplied, because unordered_map is a C++ template and cannot hash or compare NSString keys on its own.

struct KeyHasher {
    size_t operator()(NSString *key) const { return key.hash; }
};

struct KeyEqualer {
    bool operator()(NSString *left, NSString *right) const { /* left isEqual right */ }
};

#ifdef MMKV_APPLE
using MMKVMap = std::unordered_map<NSString *, mmkv::MMBuffer, KeyHasher, KeyEqualer>;
#else
using MMKVMap = std::unordered_map<std::string, mmkv::MMBuffer>;
#endif

Note that the values stored in m_dic are MMBuffer objects, not the real data types. They are encoded and decoded only when accessed through the accessor methods.

MMKVKey_t

#ifdef MMKV_APPLE
    using MMKVKey_t = NSString *__unsafe_unretained;
    static bool isKeyEmpty(MMKVKey_t key) { return key.length <= 0; }
#else
    using MMKVKey_t = const std::string &;
    static bool isKeyEmpty(MMKVKey_t key) { return key.empty(); }
#endif

Note that MMKV.cpp is the only file in the MMKVCore source that uses MRC for memory management. The rest of the C++ files use ARC.

s.requires_arc = ['Core/MemoryFile.cpp', ...]

MMKV.cpp does not appear in this list; the reason will be explained along with the code later.

MMKVPath_t

using MMKVPath_t = std::string;

MemoryFile

class MemoryFile {
    MMKVPath_t m_name;
    MMKVFileHandle_t m_fd; // file descriptor (platform-specific)
#ifdef MMKV_WIN32
    HANDLE m_fileMapping;
#endif
    void *m_ptr; // start address of the mmap memory
    size_t m_size; // file size, rounded to a whole number of memory pages

    bool mmap();
    void doCleanMemoryCache(bool forceClean);
public:
#ifndef MMKV_ANDROID
    explicit MemoryFile(const MMKVPath_t &path);
#else
    MemoryFile(const MMKVPath_t &path, size_t size, FileType fileType);
    explicit MemoryFile(MMKVFileHandle_t ashmemFD);

    const FileType m_fileType;
#endif // MMKV_ANDROID

    /* methods ... */
};

MMKV is efficient because of mmap. MemoryFile encapsulates mmap, munmap, msync, and so on.

The non-Android constructor needs only the file path; the remaining members are populated by reloadFromFile(). FileType is defined as:

enum FileType : bool { MMFILE_TYPE_FILE = false, MMFILE_TYPE_ASHMEM = true };

MMFILE_TYPE_ASHMEM refers to anonymous shared memory (ASharedMemory), which is unique to Android and is also based on mmap.

reloadFromFile

void MemoryFile::reloadFromFile() {
#    ifdef MMKV_ANDROID
    if (m_fileType == MMFILE_TYPE_ASHMEM) {
        return;
    }
#    endif
    if (isFileValid()) {
        MMKVWarning("calling reloadFromFile while the cache [%s] is still valid", m_name.c_str());
        MMKV_ASSERT(0);
        clearMemoryCache();
    }

    m_fd = open(m_name.c_str(), O_RDWR | O_CREAT | O_CLOEXEC, S_IRWXU);
    if (m_fd < 0) {
        MMKVError("fail to open:%s, %s", m_name.c_str(), strerror(errno));
    } else {
        FileLock fileLock(m_fd);
        InterProcessLock lock(&fileLock, ExclusiveLockType);
        SCOPED_LOCK(&lock);

        mmkv::getFileSize(m_fd, m_size);
        // round up to (n * pagesize)
        if (m_size < DEFAULT_MMAP_SIZE || (m_size % DEFAULT_MMAP_SIZE != 0)) {
            size_t roundSize = ((m_size / DEFAULT_MMAP_SIZE) + 1) * DEFAULT_MMAP_SIZE;
            truncate(roundSize);
        } else {
            auto ret = mmap();
            if (!ret) {
                doCleanMemoryCache(true);
            }
        }
#    ifdef MMKV_IOS
        tryResetFileProtection(m_name);
#    endif
    }
}

The first step checks m_fileType: if it is MMFILE_TYPE_ASHMEM, the function returns immediately, because in that case the memory mapping is established via ASharedMemory_create.

Next, determine if fd refers to valid memory:

#ifndef MMKV_WIN32
    bool isFileValid() { return m_fd >= 0 && m_size > 0 && m_ptr; }
#else
    bool isFileValid() { return m_fd != INVALID_HANDLE_VALUE && m_size > 0 && m_fileMapping && m_ptr; }
#endif

If it is valid, MemoryFile::clearMemoryCache() is executed: it first calls munmap(m_ptr, m_size) to clear the memory cache, then closes the file with close(m_fd) and resets m_fd and m_size.

Before mmap is called, the file size is checked to ensure that the mapped region is an integer multiple of the memory page size DEFAULT_MMAP_SIZE, which reduces memory fragmentation.
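The rounding rule used in the code can be isolated into a few lines. The function name roundUpToPage is an assumption for this sketch; the arithmetic mirrors the condition shown in reloadFromFile.

```cpp
#include <cassert>
#include <cstddef>

// Rounds `size` up to a whole number of pages, with a minimum of one page.
// Sizes that are already a non-zero multiple of `pageSize` are returned unchanged.
size_t roundUpToPage(size_t size, size_t pageSize) {
    assert(pageSize > 0);
    if (size < pageSize || size % pageSize != 0) {
        return (size / pageSize + 1) * pageSize;
    }
    return size;
}
```

With a 4096-byte page, an empty file becomes 4096 bytes, 4097 bytes becomes 8192, and 8192 stays as it is.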

Finally, the file protection attributes are adjusted on iOS. As mentioned in the notification registration section, this ensures that the app can safely access the file in the background without being killed by the system ⚠️.

truncate

truncate resizes the file, rounds the size to a whole number of pages, and re-establishes the mapping.

bool MemoryFile::truncate(size_t size) {
    if (m_fd < 0) {
        return false;
    }
    if (size == m_size) {
        return true;
    }
#    ifdef MMKV_ANDROID
    // ...
#    endif // MMKV_ANDROID

    auto oldSize = m_size;
    m_size = size;
    // round up to (n * pagesize)
    if (m_size < DEFAULT_MMAP_SIZE || (m_size % DEFAULT_MMAP_SIZE != 0)) {
        m_size = ((m_size / DEFAULT_MMAP_SIZE) + 1) * DEFAULT_MMAP_SIZE;
    }

    if (::ftruncate(m_fd, static_cast<off_t>(m_size)) != 0) {
        MMKVError("fail to truncate [%s] to size %zu, %s", m_name.c_str(), m_size, strerror(errno));
        m_size = oldSize;
        return false;
    }
    if (m_size > oldSize) {
        if (!zeroFillFile(m_fd, oldSize, m_size - oldSize)) {
            MMKVError("fail to zeroFile [%s] to size %zu, %s", m_name.c_str(), m_size, strerror(errno));
            m_size = oldSize;
            return false;
        }
    }

    if (m_ptr) {
        if (munmap(m_ptr, oldSize) != 0) {
            MMKVError("fail to munmap [%s], %s", m_name.c_str(), strerror(errno));
        }
    }
    auto ret = mmap();
    if (!ret) {
        doCleanMemoryCache(true);
    }
    return ret;
}

To guarantee that the size is correct, it is rounded up to (n * pagesize) once more before the file is truncated. The work then happens in two steps:

ftruncate + lseek

Expand or shrink the file and move the file offset to the end of the previous capacity. ftruncate does not move the file offset, so lseek is used, and the newly added space is filled with ‘\0’.

munmap + mmap

Since the existing mapping covers oldSize bytes and m_size has now changed, the file must be unmapped and mapped again to rebind the file-memory relationship.
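The two steps can be sketched with plain POSIX calls. This is a simplified illustration, not MMKV’s code: MappingSketch, resizeMapping, and demoGrowKeepsData are hypothetical names, sizes are assumed to be page multiples already, and the explicit zero fill is skipped because POSIX ftruncate already reads back zeros for the extended range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct MappingSketch {
    int fd = -1;
    void *ptr = nullptr;
    size_t size = 0;
};

// Step 1: resize the file. Step 2: drop the old mapping and map the new size.
bool resizeMapping(MappingSketch &m, size_t newSize) {
    assert(m.fd >= 0 && newSize > 0);
    if (newSize == m.size) return true;
    if (ftruncate(m.fd, static_cast<off_t>(newSize)) != 0) return false;
    if (m.ptr) munmap(m.ptr, m.size); // the old mapping only covers the old size
    void *p = mmap(nullptr, newSize, PROT_READ | PROT_WRITE, MAP_SHARED, m.fd, 0);
    if (p == MAP_FAILED) {
        m.ptr = nullptr;
        m.size = 0;
        return false;
    }
    m.ptr = p;
    m.size = newSize;
    return true;
}

// Grows a one-page mapping to two pages and checks that existing data
// survives while the new tail reads as zero.
bool demoGrowKeepsData(const char *path) {
    MappingSketch m;
    m.fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (m.fd < 0) return false;
    const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    bool ok = resizeMapping(m, page);
    if (ok) {
        static_cast<uint8_t *>(m.ptr)[0] = 42;
        ok = resizeMapping(m, 2 * page);
    }
    if (ok) {
        const uint8_t *bytes = static_cast<const uint8_t *>(m.ptr);
        ok = bytes[0] == 42 && bytes[2 * page - 1] == 0;
    }
    if (m.ptr) munmap(m.ptr, m.size);
    close(m.fd);
    return ok;
}
```

The data written before the resize is still visible afterwards because both mappings are shared views of the same file.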

MMBuffer

class MMBuffer {
private:
    void *ptr;
    size_t size;
    MMBufferCopyFlag isNoCopy;
#ifdef MMKV_APPLE
    NSData *m_data = nil;
#endif

public:
    explicit MMBuffer(size_t length = 0);
    MMBuffer(void *source, size_t length, MMBufferCopyFlag flag = MMBufferCopy);
#ifdef MMKV_APPLE
    explicit MMBuffer(NSData *data, MMBufferCopyFlag flag = MMBufferCopy);
#endif
    // Data read-write methods...
};

It represents a contiguous block of memory, referenced through NSData on Apple platforms and through ptr + size on other platforms.

Whether data is being written to or read from a file, inside MMKV it always passes through an MMBuffer as an intermediate form.

CodedOutputData

class CodedOutputData {
    uint8_t *const m_ptr;
    size_t m_size;
    size_t m_position;

public:
    CodedOutputData(void *ptr, size_t len);
    size_t spaceLeft();
    uint8_t *curWritePointer();
    void seek(size_t addedSize);
    void writeRawByte(uint8_t value);
    /// Other basic data types are written...
};

CodedInputData

class CodedInputData {
    uint8_t *const m_ptr;
    size_t m_size;
    size_t m_position;

    int8_t readRawByte();

public:
    CodedInputData(const void *oData, size_t length);
    bool isAtEnd() { return m_position == m_size; };
    /// Other basic data types read...
};

CodedInputData and CodedOutputData mainly convert between real data types and MMBuffer. The relationship is as follows:

MMBuffer -> Input -> Real data -> Output -> MMBuffer

CodedInputData reads binary data from an MMBuffer and converts it to the real data type.

CodedOutputData converts the real data type to binary data and writes it into an MMBuffer.

As you can see, they act as a bridge between real data and MMBuffer.
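A stripped-down pair of cursor classes shows the idea. OutputSketch and InputSketch are hypothetical names that only loosely follow the declarations above, and the only value type handled is a little-endian fixed 32-bit integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Writes values into a caller-owned buffer while tracking the write position.
class OutputSketch {
    uint8_t *const m_ptr;
    size_t m_size;
    size_t m_position = 0;

public:
    OutputSketch(void *ptr, size_t len) : m_ptr(static_cast<uint8_t *>(ptr)), m_size(len) {}
    size_t spaceLeft() const { return m_size - m_position; }
    void writeRawByte(uint8_t value) {
        assert(m_position < m_size); // out of space is a caller error in this sketch
        m_ptr[m_position++] = value;
    }
    void writeFixed32(uint32_t value) {
        for (int shift = 0; shift < 32; shift += 8) {
            writeRawByte(static_cast<uint8_t>(value >> shift)); // lowest byte first
        }
    }
};

// Reads values back from a buffer in the same order they were written.
class InputSketch {
    const uint8_t *const m_ptr;
    size_t m_size;
    size_t m_position = 0;

public:
    InputSketch(const void *data, size_t len) : m_ptr(static_cast<const uint8_t *>(data)), m_size(len) {}
    bool isAtEnd() const { return m_position == m_size; }
    uint8_t readRawByte() {
        assert(m_position < m_size);
        return m_ptr[m_position++];
    }
    uint32_t readFixed32() {
        uint32_t value = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            value |= static_cast<uint32_t>(readRawByte()) << shift;
        }
        return value;
    }
};
```

Writing 0x11223344 stores the bytes 44 33 22 11 in the buffer, and reading with InputSketch returns the original value, so the raw bytes in between are all that needs to be stored.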

InterProcessLock

MMKV uses file locks to coordinate file access across processes, with exclusive locks as write locks and shared locks as read locks. Rather than calling the system flock directly, MMKV wraps it in FileLock; both the read lock and the write lock are InterProcessLock instances, which are essentially FileLock underneath.

class InterProcessLock {
    FileLock *m_fileLock;
    LockType m_lockType;

public:
    InterProcessLock(FileLock *fileLock, LockType lockType)
        : m_fileLock(fileLock), m_lockType(lockType), m_enable(true) {
        MMKV_ASSERT(m_fileLock);
    }

    bool m_enable;

    void lock() {
        if (m_enable) {
            m_fileLock->lock(m_lockType);
        }
    }

    bool try_lock() {
        if (m_enable) {
            return m_fileLock->try_lock(m_lockType);
        }
        return false;
    }

    void unlock() {
        if (m_enable) {
            m_fileLock->unlock(m_lockType);
        }
    }
};
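To show what shared and exclusive mean at the system level, here is a small sketch built directly on flock(). It is not MMKV’s FileLock, which adds more bookkeeping; ScopedFlock, tryLockExclusive, and demoSharedBlocksExclusive are names invented for this example.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

// RAII wrapper: takes a shared (read) or exclusive (write) flock on construction
// and releases it on destruction.
class ScopedFlock {
    int m_fd;
    bool m_locked;

public:
    ScopedFlock(int fd, bool exclusive) : m_fd(fd), m_locked(false) {
        assert(fd >= 0);
        m_locked = flock(fd, exclusive ? LOCK_EX : LOCK_SH) == 0;
    }
    ~ScopedFlock() {
        if (m_locked) flock(m_fd, LOCK_UN);
    }
    ScopedFlock(const ScopedFlock &) = delete;
    ScopedFlock &operator=(const ScopedFlock &) = delete;
    bool locked() const { return m_locked; }
};

// Non-blocking attempt to take the write lock.
bool tryLockExclusive(int fd) {
    return flock(fd, LOCK_EX | LOCK_NB) == 0;
}

// Opens the same file twice (two independent open file descriptions, as two
// processes would have) and checks that a held read lock blocks the writer.
bool demoSharedBlocksExclusive(const char *path) {
    int reader = open(path, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    int writer = open(path, O_RDWR);
    if (reader < 0 || writer < 0) {
        if (reader >= 0) close(reader);
        if (writer >= 0) close(writer);
        return false;
    }
    bool blockedWhileShared = false;
    {
        ScopedFlock readLock(reader, false);
        blockedWhileShared = readLock.locked() && !tryLockExclusive(writer);
    } // read lock released here
    const bool freeAfterRelease = tryLockExclusive(writer);
    if (freeAfterRelease) flock(writer, LOCK_UN);
    close(reader);
    close(writer);
    return blockedWhileShared && freeAfterRelease;
}
```

Several holders can share the read lock at once, but the write lock is granted only when nobody else holds any lock on the file.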

MMKV.h also declares the variable m_isInterProcess, which switches the locks on or off. For an MMKV that supports multiple processes, m_isInterProcess reflects the mode used by the current instance, MMKVMode:

enum MMKVMode : uint32_t {
    MMKV_SINGLE_PROCESS = 0x1,
    MMKV_MULTI_PROCESS = 0x2,
#ifdef MMKV_ANDROID
    CONTEXT_MODE_MULTI_PROCESS = 0x4, // in case someone mistakenly pass Context.MODE_MULTI_PROCESS
    MMKV_ASHMEM = 0x8,
#endif
};

For more information on locking, check out this article: Flock File Locking.

Due to the length of this article, many of the (actually very important) locking details have been left out of the description and will be covered separately later.

LoadData

This section describes how MMKV reads data from files, handles abnormal data, and uses CRC to verify file integrity.

loadFromFile() is executed on first initialization, after a data anomaly, on a memory warning, and after data is cleared, in order to refresh the in-memory data and keep it accurate. Loading m_file takes three main steps:

  1. Check the validity of the CRC file, m_file, and initialize AESCrypter.
  2. Check the validity of the internal data in the file;
  3. Load data into memory.

File validity

When the MMKV constructor runs, m_metaFile is already a memory mapping of the local CRC file, while m_metaInfo, which records the CRC check value for the data currently in memory, still holds its default (empty) values.

struct MMKVMetaInfo {
    uint32_t m_crcDigest = 0;
    uint32_t m_version = MMKVVersionSequence;
    uint32_t m_sequence = 0; // full write-back count
    unsigned char m_vector[AES_KEY_LEN] = {};
    uint32_t m_actualSize = 0;

    // confirmed info: it's been synced to file
    struct {
        uint32_t lastActualSize = 0;
        uint32_t lastCRCDigest = 0;
        uint32_t __reserved__[16] = {};
    } m_lastConfirmedMetaInfo;

    void write(void *ptr) {
        MMKV_ASSERT(ptr);
        memcpy(ptr, this, sizeof(MMKVMetaInfo));
    }

    void writeCRCAndActualSizeOnly(void *ptr) {
        MMKV_ASSERT(ptr);
        auto other = (MMKVMetaInfo *) ptr;
        other->m_crcDigest = m_crcDigest;
        other->m_actualSize = m_actualSize;
    }

    void read(const void *ptr) {
        MMKV_ASSERT(ptr);
        memcpy(this, ptr, sizeof(MMKVMetaInfo));
    }
};

Therefore, MMKV loads the CRC check value into m_metaInfo before loading m_file. Before doing so, it confirms that the CRC file has been mapped:

if (m_metaFile->isFileValid()) {
    m_metaInfo->read(m_metaFile->getMemory());
}

Note that m_version describes the state of the currently cached content, with an initial value of MMKVVersionSequence. The possible values are:

enum MMKVVersion : uint32_t {
    MMKVVersionDefault = 0,
    // records the number of full write-backs
    MMKVVersionSequence = 1,
    // stores the random IV used for encryption
    MMKVVersionRandomIV = 2,
    // stores the actual size together with the CRC checksum, to reduce file corruption
    MMKVVersionActualSize = 3,
};

AESCrypter

if (m_crypter) {
    if (m_metaInfo->m_version >= MMKVVersionRandomIV) {
        m_crypter->resetIV(m_metaInfo->m_vector, sizeof(m_metaInfo->m_vector));
    }
}

If the user passed an AES key when initializing MMKV, the AES state is initialized here through resetIV.

AES is a block cipher with several modes of operation. MMKV uses CFB-128 mode, which requires both a key and an IV to encrypt data.

For an introduction to AES, see the Wiki. Here is just a brief introduction to the role of the IV.

IV stands for initialization vector. Different IVs produce different ciphertexts for the same input, and encryption and decryption must use the same IV. Although the IV looks like a second key, it serves an additional purpose: it varies the ciphertext even when the key stays the same. The IV is 16 bytes long; if it is longer or shorter, a library may pad or truncate it. Since the block size is 16 bytes, you can generally assume that a 16-byte IV is required.

So metaInfo->m_vector records the AES IV, whose length AES_KEY_LEN is 16.

Next, m_file is checked with isFileValid. If it passes, loading moves on to the next stage; otherwise reloadFromFile is attempted.

Data validity

The entire data validation is done in checkDataValid, first reading m_actualSize.

readActualSize

size_t MMKV::readActualSize() {
    MMKV_ASSERT(m_file->getMemory());
    MMKV_ASSERT(m_metaFile->isFileValid());

    uint32_t actualSize = 0;
    memcpy(&actualSize, m_file->getMemory(), Fixed32Size);

    if (m_metaInfo->m_version >= MMKVVersionActualSize) {
        if (m_metaInfo->m_actualSize != actualSize) {
            MMKVWarning("[%s] actual size %u, meta actual size %u", ...);
        }
        return m_metaInfo->m_actualSize;
    } else {
        return actualSize;
    }
}

If m_metaInfo records m_actualSize, that value takes priority; otherwise the value recorded in the file is used. Here, actualSize is read from the fixed-length Fixed32Size header at the start of m_file.

constexpr uint32_t LittleEdian32Size = 4;

constexpr uint32_t pbFixed32Size() {
    return LittleEdian32Size;
}

constexpr uint32_t Fixed32Size = pbFixed32Size();

Second, verify that the current file has enough space. As mentioned earlier, unused space is filled with \0 by default, so the file size has to be compared with the real data size.

void MMKV::checkDataValid(bool &loadFromFile, bool &needFullWriteback) {
    // try auto recover from last confirmed location
    auto fileSize = m_file->getFileSize();
    auto checkLastConfirmedInfo = [&] { ... };

    m_actualSize = readActualSize();

    if (m_actualSize < fileSize && (m_actualSize + Fixed32Size) <= fileSize) {
        if (checkFileCRCValid(m_actualSize, m_metaInfo->m_crcDigest)) {
            loadFromFile = true; /// The data is correct and the free space is sufficient
        } else {
            checkLastConfirmedInfo();

            if (!loadFromFile) {
                // ⚠️ Handler 3: data is abnormal
            }
        }
    } else {
        checkLastConfirmedInfo();

        if (!loadFromFile) {
            // ⚠️ Handler 4: insufficient space
        }
    }
}

If space is sufficient, the CRC digest of the real data in m_file is computed and compared with the m_crcDigest stored in m_metaInfo.

bool MMKV::checkFileCRCValid(size_t actualSize, uint32_t crcDigest) {
    auto ptr = (uint8_t *) m_file->getMemory();
    if (ptr) {
        m_crcDigest = (uint32_t) CRC32(0, (const uint8_t *) ptr + Fixed32Size, (uint32_t) actualSize);

        if (m_crcDigest == crcDigest) {
            return true;
        }
        MMKVError("check crc [%s] fail, crc32:%u, m_crcDigest:%u", ...);
    }
    return false;
}

For more on the error-detection capability of CRC, see an encyclopedia entry on the topic.

Once validation succeeds, the contents of m_file are loaded.
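
The two checks, enough room in the file and a matching CRC, can be sketched on their own. The code below uses a plain bitwise CRC-32 (reflected polynomial 0xEDB88320), which is the same checksum family as zlib's crc32; crc32Sketch and isPayloadValid are names invented for this example, and the file is modeled as a byte vector with a 4-byte header followed by the payload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kFixed32Size = 4; // header that precedes the payload

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
uint32_t crc32Sketch(const uint8_t *data, std::size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            // shift right; xor the polynomial in when the dropped bit was 1
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Mirrors the shape of the check in checkDataValid: the payload must fit
// behind the header, and its CRC must equal the recorded digest.
bool isPayloadValid(const std::vector<uint8_t> &file, std::size_t actualSize, uint32_t expectedCrc) {
    const std::size_t fileSize = file.size();
    if (!(actualSize < fileSize && actualSize + kFixed32Size <= fileSize)) {
        return false; // not enough space, the recorded size cannot be right
    }
    return crc32Sketch(file.data() + kFixed32Size, actualSize) == expectedCrc;
}
```

A single flipped bit in the payload changes the digest, so isPayloadValid returns false for it, and a recorded size larger than the file is rejected before any bytes are read.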

checkLastConfirmedInfo

If the data is abnormal or there is not enough space, checkLastConfirmedInfo is called to re-determine loadFromFile. checkLastConfirmedInfo is a C++ lambda declared inside checkDataValid. Its logic is as follows:

if (m_metaInfo->m_version >= MMKVVersionActualSize) {
    // downgrade & upgrade support
    uint32_t oldStyleActualSize = 0;
    memcpy(&oldStyleActualSize, m_file->getMemory(), Fixed32Size);
    if (oldStyleActualSize != m_actualSize) {
        MMKVWarning("oldStyleActualSize not equal to meta actual size", ...);
        if (oldStyleActualSize < fileSize && (oldStyleActualSize + Fixed32Size) <= fileSize) {
            if (checkFileCRCValid(oldStyleActualSize, m_metaInfo->m_crcDigest)) {
                // ⚠️ Handler 1
                MMKVInfo("looks like [%s] been downgrade & upgrade again", ...);
                loadFromFile = true;
                writeActualSize(oldStyleActualSize, m_metaInfo->m_crcDigest, nullptr, KeepSequence);
                return;
            }
        } else {
            MMKVWarning("oldStyleActualSize greater than file size", ...);
        }
    }

    auto lastActualSize = m_metaInfo->m_lastConfirmedMetaInfo.lastActualSize;
    if (lastActualSize < fileSize && (lastActualSize + Fixed32Size) <= fileSize) {
        auto lastCRCDigest = m_metaInfo->m_lastConfirmedMetaInfo.lastCRCDigest;
        if (checkFileCRCValid(lastActualSize, lastCRCDigest)) {
            // ⚠️ Handler 2
            loadFromFile = true;
            writeActualSize(lastActualSize, lastCRCDigest, nullptr, KeepSequence);
        } else {
            MMKVError("check lastActualSize, lastActualCRC error", ...);
        }
    } else {
        MMKVError("check lastActualSize, file size error", ...);
    }
}

m_lastConfirmedMetaInfo in MMKVMetaInfo records the last confirmed metaInfo, and it only holds data when m_version is at least MMKVVersionActualSize. Hence the precondition of the check is m_version >= MMKVVersionActualSize.

The check offers two chances to restore a correct metaInfo:

Handler 1

oldStyleActualSize is the content size recorded in the header of m_file. If it differs from m_metaInfo->m_actualSize, try to rebuild metaInfo from oldStyleActualSize: once the CRC check passes, set loadFromFile to true and call writeActualSize to restore metaInfo.

Handler 2

The last lifeline is m_metaInfo->m_lastConfirmedMetaInfo.lastActualSize. The same check as in Handler 1 is repeated with it and the matching lastCRCDigest.
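
The fallback order can be summarized as "try each candidate size in turn and keep the first one that fits the file and passes the CRC check". Below is a rough sketch of that idea; Candidate, CrcCheck and recoverSketch are hypothetical names, and the real CRC computation is replaced by an injected callback so that the example stays small.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

constexpr uint32_t kFixed32Size = 4;

// One (size, crc) pair that might describe the valid part of the file.
struct Candidate {
    uint32_t size;
    uint32_t crc;
};

// Injected stand-in for checkFileCRCValid(size, crc).
using CrcCheck = std::function<bool(uint32_t size, uint32_t crc)>;

// Walks the candidates in priority order, e.g.
//   1. old-style size from the file header + current meta CRC   (Handler 1)
//   2. last confirmed size + last confirmed CRC                 (Handler 2)
// and stores the first one that fits the file and passes the CRC check.
bool recoverSketch(const std::vector<Candidate> &candidates, uint32_t fileSize,
                   const CrcCheck &crcOk, Candidate &chosen) {
    for (const auto &c : candidates) {
        bool fits = c.size < fileSize && c.size + kFixed32Size <= fileSize;
        if (fits && crcOk(c.size, c.crc)) {
            chosen = c;
            return true;
        }
    }
    return false; // nothing recoverable, the caller falls back to Handler 3 / 4
}
```

In the real code a successful candidate is also written back with writeActualSize and KeepSequence; the sketch leaves that step out.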

writeActualSize

Used to update m_metaInfo, including actualSize, crcDigest, the IV, and lastConfirmedMetaInfo.

bool MMKV::writeActualSize(size_t size, uint32_t crcDigest, const void *iv, bool increaseSequence) {
   // backward compatibility
   oldStyleWriteActualSize(size);

   if (!m_metaFile->isFileValid()) {
       return false;
   }

   bool needsFullWrite = false;
   m_actualSize = size;
   m_metaInfo->m_actualSize = static_cast<uint32_t>(size);
   m_crcDigest = crcDigest;
   m_metaInfo->m_crcDigest = crcDigest;
   if (m_metaInfo->m_version < MMKVVersionSequence) {
       m_metaInfo->m_version = MMKVVersionSequence;
       needsFullWrite = true;
   }
   if (unlikely(iv)) {
       memcpy(m_metaInfo->m_vector, iv, sizeof(m_metaInfo->m_vector));
       if (m_metaInfo->m_version < MMKVVersionRandomIV) {
           m_metaInfo->m_version = MMKVVersionRandomIV;
       }
       needsFullWrite = true;
   }
   if (unlikely(increaseSequence)) {
       m_metaInfo->m_sequence++;
       m_metaInfo->m_lastConfirmedMetaInfo.lastActualSize = static_cast<uint32_t>(size);
       m_metaInfo->m_lastConfirmedMetaInfo.lastCRCDigest = crcDigest;
       if (m_metaInfo->m_version < MMKVVersionActualSize) {
           m_metaInfo->m_version = MMKVVersionActualSize;
       }
       needsFullWrite = true;
   }
#ifdef MMKV_IOS
   return protectFromBackgroundWriting(m_metaFile->getMemory(), sizeof(MMKVMetaInfo), ^{
     if (unlikely(needsFullWrite)) {
         m_metaInfo->write(m_metaFile->getMemory());
     } else {
         m_metaInfo->writeCRCAndActualSizeOnly(m_metaFile->getMemory());
     }
   });
#else
   ...
#endif
}

Leaving the first three parameters aside, look at the last one, increaseSequence. Its type is:

enum : bool {
    KeepSequence = false,
    IncreaseSequence = true
};

It controls whether m_sequence, the file's full write-back count, is increased, and with it needsFullWrite. needsFullWrite acts as a dirty bit: whenever m_version is updated, a new IV is supplied, or the sequence is increased, it is set so that the whole metaInfo is written back, not only the CRC and the actual size.

The concept of write-back is described later.

checkDataValid

That concludes the main data verification flow. Back in checkDataValid: if the data state is still wrong after checkLastConfirmedInfo, loadFromFile remains false.

Handler 3 (marked in the code above)

auto strategic = onMMKVCRCCheckFail(m_mmapID);
if (strategic == OnErrorRecover) {
    loadFromFile = true;
    needFullWriteback = true;
}
MMKVInfo("recover strategic for [%s] is %d", m_mmapID.c_str(), strategic);

Handler 4

auto strategic = onMMKVFileLengthError(m_mmapID);
if (strategic == OnErrorRecover) {
    // make sure we don't over read the file
    m_actualSize = fileSize - Fixed32Size;
    loadFromFile = true;
    needFullWriteback = true;
}
MMKVInfo("recover strategic for [%s] is %d", m_mmapID.c_str(), strategic);

For exception handling, MMKV lets us change the strategy through a callback. There are two strategies:

enum MMKVRecoverStrategic : int {
    OnErrorDiscard = 0,
    OnErrorRecover,
};

By default, MMKV discards the current data and clears both the file and metaInfo. This can be changed through g_errorHandler:

static MMKVRecoverStrategic onMMKVCRCCheckFail(const string &mmapID) {
    if (g_errorHandler) {
        return g_errorHandler(mmapID, MMKVErrorType::MMKVCRCCheckFail);
    }
    return OnErrorDiscard;
}

static MMKVRecoverStrategic onMMKVFileLengthError(const string &mmapID) {
    if (g_errorHandler) {
        return g_errorHandler(mmapID, MMKVErrorType::MMKVFileLength);
    }
    return OnErrorDiscard;
}

The data processing

Once validity has been verified, the subsequent operations depend on the values of loadFromFile and needFullWriteback. A simplified loadFromFile:

void MMKV::loadFromFile() {
    /// 1. File validity
    /// 2. Data validity
    ...
    bool loadFromFile = false, needFullWriteback = false;
    checkDataValid(loadFromFile, needFullWriteback);
    ...
    auto ptr = (uint8_t *) m_file->getMemory();

    if (loadFromFile && m_actualSize > 0) {
        MMKVInfo("loading [%s] with crc %u sequence %u version", ...);
        // loading
    } else {
        // file not valid or empty, discard everything
        SCOPED_LOCK(m_exclusiveProcessLock);

        m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
        if (m_actualSize > 0) {
            writeActualSize(0, 0, nullptr, IncreaseSequence);
            sync(MMKV_SYNC);
        } else {
            writeActualSize(0, 0, nullptr, KeepSequence);
        }
    }
}

Let's start with exception handling.

If the verification fails or the file is empty, writeActualSize is called directly to reset metaInfo.

If the file is abnormal (m_actualSize > 0), IncreaseSequence is passed to set the dirty bit, so that m_file is reloaded the next time it is checked.

Loading

When loadFromFile is true and the file content is not empty, the data is read from memory into an MMBuffer and, if a crypter is set, decrypted with AES. m_dic is then cleared, the buffer is decoded into it, and m_output is prepared for writing.

// loading
MMBuffer inputBuffer(ptr + Fixed32Size, m_actualSize, MMBufferNoCopy);
if (m_crypter) {
    decryptBuffer(*m_crypter, inputBuffer);
}
clearDictionary(m_dic);
if (needFullWriteback) {
    MiniPBCoder::greedyDecodeMap(m_dic, inputBuffer);
} else {
    MiniPBCoder::decodeMap(m_dic, inputBuffer);
}
m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
m_output->seek(m_actualSize);
if (needFullWriteback) {
    fullWriteback();
}

After the data has been decoded into m_dic, a CodedOutputData object is created to record the mapped memory pointer and the file size, and seek moves the write position to the end of the existing data (m_actualSize).

Finally, when needFullWriteback is true, the data is written back to the file through fullWriteback.

There are two decoding policies, greedy mode and normal mode:

void MiniPBCoder::decodeMap(MMKVMap &dic, const MMBuffer &oData, size_t size) {
    MiniPBCoder oCoder(&oData);
    oCoder.decodeOneMap(dic, size, false);
}

void MiniPBCoder::greedyDecodeMap(MMKVMap &dic, const MMBuffer &oData, size_t size) {
    MiniPBCoder oCoder(&oData);
    oCoder.decodeOneMap(dic, size, true);
}

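Both convert the buffer into key-value pairs and store them in m_dic; the difference is how a decoding failure is handled. In greedy mode the pairs are written straight into m_dic, so everything decoded before an exception is kept. In normal mode the pairs go into a temporary map that replaces m_dic only if decoding completes.

As shown in the data verification above, needFullWriteback is set to true only when verification fails and the recovery strategy is OnErrorRecover. That is, when the data is abnormal or the space is insufficient, the greedy strategy is used to salvage as much data as possible into memory first.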

void MiniPBCoder::decodeOneMap(MMKVMap &dic, size_t size, bool greedy) {
    auto block = [size, this](MMKVMap &dictionary) {
        if (size == 0) {
            [[maybe_unused]] auto length = m_inputData->readInt32();
        }
        while (!m_inputData->isAtEnd()) {
            const auto &key = m_inputData->readString();
            if (key.length > 0) {
                auto value = m_inputData->readData();
                if (value.length() > 0) {
                    dictionary[key] = move(value);
                    [key retain];
                } else {
                    auto itr = dictionary.find(key);
                    if (itr != dictionary.end()) {
                        dictionary.erase(itr);
                        [itr->first release];
                    }
                }
            }
        }
    };

    if (greedy) {
        try {
            block(dic);
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    } else {
        try {
            MMKVMap tmpDic;
            block(tmpDic);
            dic.swap(tmpDic);
            for (auto &pair : tmpDic) {
                [pair.first release];
            }
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
}
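
The practical difference between the two modes is easiest to see on damaged input. Here is a minimal, platform-neutral sketch using std::map; Dict, Record, decodeInto and decodeSketch are invented for this example, and a record with an empty key is used as a stand-in for "the reader throws on corrupted bytes" (the real decoder simply skips empty keys and throws from the reader itself).

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Dict = std::map<std::string, std::string>;
using Record = std::pair<std::string, std::string>;

// Replays records in order; a later record for the same key wins and an
// empty value deletes the key. An empty key simulates corrupted input.
void decodeInto(Dict &dict, const std::vector<Record> &records) {
    for (const auto &r : records) {
        if (r.first.empty()) {
            throw std::runtime_error("corrupted record");
        }
        if (r.second.empty()) {
            dict.erase(r.first);
        } else {
            dict[r.first] = r.second;
        }
    }
}

void decodeSketch(Dict &dict, const std::vector<Record> &records, bool greedy) {
    try {
        if (greedy) {
            // write straight into the target: keeps whatever was decoded
            // before the failure
            decodeInto(dict, records);
        } else {
            // all or nothing: the target is replaced only on full success
            Dict tmp;
            decodeInto(tmp, records);
            dict.swap(tmp);
        }
    } catch (const std::exception &) {
        // the original logs the error with MMKVError and carries on
    }
}
```

With a corrupted third record, greedy mode ends up with the first two pairs while normal mode leaves the dictionary empty. That matches the use of greedy mode only on the recovery path, where partial data is better than none.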

fullWriteback

Write-back is a caching strategy. The wiki describes the concept as follows:

The contents of a cache block are written to memory only when the block needs to be replaced. To reduce memory writes, a dirty bit indicates whether the block has been updated since it was loaded. If a cache block has never been written to before it is replaced, the write-back can be skipped.

MMKV's write-back operation serializes the in-memory data m_dic and writes it back to the file.

bool MMKV::fullWriteback() {
    ...
    auto allData = MiniPBCoder::encodeDataWithObject(m_dic);
    SCOPED_LOCK(m_exclusiveProcessLock);
    if (allData.length() > 0) {
        auto fileSize = m_file->getFileSize();
        if (allData.length() + Fixed32Size <= fileSize) {
            return doFullWriteBack(std::move(allData));
        } else {
            // ensureMemorySize will extend file & full rewrite, no need to write back again
            return ensureMemorySize(allData.length() + Fixed32Size - fileSize);
        }
    }
    return false;
}

Several states are checked before the operation:

  • m_hasFullWriteback is true: return true directly;
  • m_needLoadFromFile is true: return true directly;
  • isFileValid() is false: return false directly;
  • m_dic.empty() is true: call clearAll() and return true.

If m_dic is empty, the data may be abnormal: temporary data and the memory cache are cleared, the related flags are reset, and the file is reloaded.

void MMKV::clearAll() {
    MMKVInfo("cleaning all key-values from [%s]", m_mmapID.c_str());
    SCOPED_LOCK(m_lock);
    SCOPED_LOCK(m_exclusiveProcessLock);

    if (m_needLoadFromFile) {
        m_file->reloadFromFile();
    }

    m_file->truncate(DEFAULT_MMAP_SIZE);
    auto ptr = m_file->getMemory();
    if (ptr) {
        memset(ptr, 0, m_file->getFileSize());
    }
    m_file->msync(MMKV_SYNC);

    unsigned char newIV[AES_KEY_LEN];
    AESCrypt::fillRandomIV(newIV);
    if (m_crypter) {
        m_crypter->resetIV(newIV, sizeof(newIV));
    }
    writeActualSize(0, 0, newIV, IncreaseSequence);
    m_metaFile->msync(MMKV_SYNC);

    clearMemoryCache();
    loadFromFile();
}

After these checks pass, m_dic is serialized by MiniPBCoder into binary data. Before writing, the current file size is checked against the data to be written; if the file is too small, it is expanded first.

doFullWriteBack

First, when a crypter is set, a random AES IV is generated and used to encrypt allData. Then the MMBuffer is written to m_file through CodedOutputData, and finally the CRC checksum is updated.

bool MMKV::doFullWriteBack(MMBuffer &&allData) {
#ifdef MMKV_IOS
    unsigned char oldIV[AES_KEY_LEN];
    unsigned char newIV[AES_KEY_LEN];
    if (m_crypter) {
        memcpy(oldIV, m_crypter->m_vector, sizeof(oldIV));
#else
    unsigned char newIV[AES_KEY_LEN];
    if (m_crypter) {
#endif
        AESCrypt::fillRandomIV(newIV);
        m_crypter->resetIV(newIV, sizeof(newIV));
        auto ptr = allData.getPtr();
        m_crypter->encrypt(ptr, ptr, allData.length());
    }

    auto ptr = (uint8_t *) m_file->getMemory();
    delete m_output;
    m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
#ifdef MMKV_IOS
    auto ret = protectFromBackgroundWriting(m_output->curWritePointer(), allData.length(), ^{
      m_output->writeRawData(allData); // note: don't write size of data
    });
    if (!ret) {
        // revert everything
        if (m_crypter) {
            m_crypter->resetIV(oldIV);
        }
        delete m_output;
        m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
        m_output->seek(m_actualSize);
        return false;
    }
#else
    m_output->writeRawData(allData); // note: don't write size of data
#endif

    m_actualSize = allData.length();
    if (m_crypter) {
        recaculateCRCDigestWithIV(newIV);
    } else {
        recaculateCRCDigestWithIV(nullptr);
    }
    m_hasFullWriteback = true;
    // make sure lastConfirmedMetaInfo is saved
    sync(MMKV_SYNC);
    return true;
}

recaculateCRCDigestWithIV

void MMKV::recaculateCRCDigestWithIV(const void *iv) {
    auto ptr = (const uint8_t *) m_file->getMemory();
    if (ptr) {
        m_crcDigest = 0;
        m_crcDigest = (uint32_t) CRC32(0, ptr + Fixed32Size, (uint32_t) m_actualSize);
        writeActualSize(m_actualSize, m_crcDigest, iv, IncreaseSequence);
    }
}

Note that the CRC digest is recomputed from scratch only during a full write-back. The writeActualSize call here updates m_metaInfo and increments m_sequence, but m_actualSize itself is passed through unchanged.

ensureMemorySize

Besides the full write-back case, the file is also expanded when appended data would exceed fileSize. The expansion policy doubles fileSize repeatedly until it is larger than the space required. Finally, truncate adjusts the size to an integer multiple of DEFAULT_MMAP_SIZE.
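
The sizing arithmetic is simple enough to try out in isolation. The helpers below are a sketch under my own assumptions: futureUsage, grownFileSize and roundUpToPage are invented names, the "future usage" estimate follows the formula quoted in the core logic (average item size times max(8, (count + 1) / 2)), and the final adjustment is assumed to round up to a whole number of pages.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Estimated extra room: average item size * max(8, half of the item count).
std::size_t futureUsage(std::size_t lenNeeded, std::size_t itemCount) {
    std::size_t avgItemSize = lenNeeded / std::max<std::size_t>(1, itemCount);
    return avgItemSize * std::max<std::size_t>(8, (itemCount + 1) / 2);
}

// Doubles fileSize until it is strictly larger than the required bytes.
std::size_t grownFileSize(std::size_t fileSize, std::size_t required) {
    assert(fileSize > 0);
    while (required >= fileSize) {
        fileSize *= 2;
    }
    return fileSize;
}

// Rounds size up to a multiple of pageSize (assumed behavior of the final
// truncate step, with pageSize playing the role of DEFAULT_MMAP_SIZE).
std::size_t roundUpToPage(std::size_t size, std::size_t pageSize) {
    assert(pageSize > 0);
    return (size + pageSize - 1) / pageSize * pageSize;
}
```

For example, a 4096-byte file that needs 20000 bytes grows to 32768 bytes in three doublings, whereas a request for 4095 bytes leaves it untouched. Doubling keeps the number of expansions, and therefore of full rewrites, logarithmic in the final size.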

The core logic is as follows:

constexpr size_t ItemSizeHolderSize = 4;
if (m_dic.empty()) {
    newSize += ItemSizeHolderSize;
}
if (newSize >= m_output->spaceLeft() || m_dic.empty()) {
    auto fileSize = m_file->getFileSize();
    MMBuffer data = MiniPBCoder::encodeDataWithObject(m_dic);
    size_t lenNeeded = data.length() + Fixed32Size + newSize;
    size_t avgItemSize = lenNeeded / std::max<size_t> (1, m_dic.size());
    size_t futureUsage = avgItemSize * std::max<size_t> (8, (m_dic.size() + 1) / 2);
    // space needed >= current file size || space needed plus future usage (about 1.5x) >= current file size
    if (lenNeeded >= fileSize || (lenNeeded + futureUsage) >= fileSize) {
        size_t oldSize = fileSize;
        do {
            fileSize *= 2;
        } while (lenNeeded + futureUsage >= fileSize);

        if (!m_file->truncate(fileSize)) {
            return false;
        }

        if (!isFileValid()) {
            MMKVWarning("[%s] file not valid", m_mmapID.c_str());
            return false;
        }
    }
    return doFullWriteBack(std::move(data));
}

Finally, a simplified version of the data flow:

Setter

On iOS, the setters are provided directly by the C++ API.

bool set(bool value, MMKVKey_t key);
...
// avoid unexpected type conversion (pointer to bool, etc)
template <typename T>
bool set(T value, MMKVKey_t key) = delete;
bool set(NSObject<NSCoding> *__unsafe_unretained obj, MMKVKey_t key);

Let’s take a bool as an example:

bool MMKV::set(bool value, MMKVKey_t key) {
    if (isKeyEmpty(key)) {
        return false;
    }
    size_t size = pbBoolSize();
    MMBuffer data(size);
    CodedOutputData output(data.getPtr(), size);
    output.writeBool(value);

    return setDataForKey(std::move(data), key);
}

The value is written into an MMBuffer through CodedOutputData and then passed to setDataForKey. The same goes for the other data types.

setDataForKey

This is the core method for updating key-values; every data update goes through it. It does three things:

  1. Check the data to see whether the cache needs to be refreshed and the file reloaded;
  2. Append the buffer data to the file;
  3. Update m_dic.

bool MMKV::setDataForKey(MMBuffer &&data, MMKVKey_t key) {
    if (data.length() == 0 || isKeyEmpty(key)) {
        return false;
    }
    SCOPED_LOCK(m_lock);
    SCOPED_LOCK(m_exclusiveProcessLock);
    checkLoadData();

    auto ret = appendDataWithKey(data, key);
    if (ret) {
        m_dic[key] = std::move(data);
        m_hasFullWriteback = false;
#ifdef MMKV_APPLE
        [key retain];
#endif
    }
    return ret;
}

In the whole MMKV.cpp file, the line [key retain] appears in this method, which is why MMKV.cpp uses MRC. For more on retain, please look it up 🤔.

checkLoadData

The first step checks whether m_needLoadFromFile is true. If so, loadFromFile is executed.

The checks that follow guard against the file being changed by other processes. A single-process instance does not need them and returns directly.

void MMKV::checkLoadData() {
    if (m_needLoadFromFile) {
        SCOPED_LOCK(m_sharedProcessLock);

        m_needLoadFromFile = false;
        loadFromFile();
        return;
    }
    if (!m_isInterProcess) { // single process
        return;
    }

    if (!m_metaFile->isFileValid()) {
        return;
    }
    // TODO: atomic lock m_metaFile?
    MMKVMetaInfo metaInfo;
    metaInfo.read(m_metaFile->getMemory());
    if (m_metaInfo->m_sequence != metaInfo.m_sequence) {
        MMKVInfo("[%s] oldSeq %u, newSeq %u", ...);
        SCOPED_LOCK(m_sharedProcessLock);
        clearMemoryCache();
        loadFromFile();
        notifyContentChanged();
    } else if (m_metaInfo->m_crcDigest != metaInfo.m_crcDigest) {
        MMKVDebug("[%s] oldCrc %u, newCrc %u, new actualSize", ...);
        SCOPED_LOCK(m_sharedProcessLock);
        size_t fileSize = m_file->getActualFileSize();
        if (m_file->getFileSize() != fileSize) {
            MMKVInfo("file size has changed [%s] from %zu to %zu", ...);
            clearMemoryCache();
            loadFromFile();
        } else {
            partialLoadFromFile();
        }
        notifyContentChanged();
    }
}

To detect changes made by other processes, the metaInfo stored in the meta (CRC) file is first read and compared with m_metaInfo in the current memory. metaInfo is updated in writeActualSize; a failed file read, insufficient space, or a failed CRC check all trigger metaInfo changes. The cases are handled as follows:

  1. m_sequence differs: the dirty bit is set, meaning the file was fully rewritten or reset, so m_file must be reloaded.
  2. m_crcDigest differs and the file size differs: the file has been expanded, so m_file must be reloaded.
  3. m_crcDigest differs and the file size is the same: since the sequence is unchanged, the data has most likely only been appended, and partialLoadFromFile updates the related in-memory data.
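
The three cases boil down to a small decision function. The sketch below restates the branches of checkLoadData as a pure function so that they can be tested on their own; ReloadAction, MetaSnapshot and decideReload are invented names and nothing here touches real files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class ReloadAction { None, FullReload, PartialLoad };

// The two meta fields that checkLoadData compares.
struct MetaSnapshot {
    uint32_t sequence;
    uint32_t crcDigest;
};

// inMemory: what this process last saw; onDisk: what the meta file says now.
// mappedSize: size of the current mapping; actualFileSize: size on disk now.
ReloadAction decideReload(const MetaSnapshot &inMemory, const MetaSnapshot &onDisk,
                          std::size_t mappedSize, std::size_t actualFileSize) {
    if (inMemory.sequence != onDisk.sequence) {
        return ReloadAction::FullReload; // case 1: sequence changed
    }
    if (inMemory.crcDigest != onDisk.crcDigest) {
        if (mappedSize != actualFileSize) {
            return ReloadAction::FullReload; // case 2: file was resized
        }
        return ReloadAction::PartialLoad; // case 3: same size, new content
    }
    return ReloadAction::None; // nothing changed
}
```

Checking the sequence first matters: when the sequence differs, the CRC comparison is never reached, so a full rewrite by another process always leads to a full reload.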

appendData

The official description:

Standard Protobuf does not provide incremental updates; every write must be a full write. Since the main usage scenario is frequent write updates, we need incremental updates: serialize the incremental kv object and append it directly to the end of the memory. The same key will then have both old and new data, with the newest data at the end. When MMKV is first opened at program start, constantly replacing earlier values with values read later guarantees that the data is up to date.

bool MMKV::appendDataWithKey(const MMBuffer &data, MMKVKey_t key) {
#ifdef MMKV_APPLE
    auto keyData = [key dataUsingEncoding:NSUTF8StringEncoding];
    size_t keyLength = keyData.length;
#else
    size_t keyLength = key.length();
#endif
    // size needed to encode the key
    size_t size = keyLength + pbRawVarint32Size((int32_t) keyLength);
    // size needed to encode the value
    size += data.length() + pbRawVarint32Size((int32_t) data.length());

    SCOPED_LOCK(m_exclusiveProcessLock);

    bool hasEnoughSize = ensureMemorySize(size);
    if (!hasEnoughSize || !isFileValid()) {
        return false;
    }

#ifdef MMKV_IOS
    auto ret = protectFromBackgroundWriting(m_output->curWritePointer(), size, ^{
      m_output->writeData(MMBuffer(keyData, MMBufferNoCopy));
      m_output->writeData(data); // note: write size of data
    });
    if (!ret) {
        return false;
    }
#else
    ...
    m_output->writeData(data);
#endif
    ...
    // Encrypt data to update m_actualSize and crcDigest
    return true;
}

The append logic is relatively simple: the key and data MMBuffers are encoded with protobuf and written to m_file. The problem with appending directly to the end of m_file is that the file grows rapidly and its size becomes uncontrollable. Therefore the remaining file space has to be checked on every write.
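
The append-then-replay idea can be shown without any file at all. In this sketch the log is a vector of key-value records; Log, appendRecord and replay are names I made up, and an empty value is treated as a deletion, in the same spirit as the decoder shown earlier.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Log = std::vector<std::pair<std::string, std::string>>;

// An update never rewrites old records, it only adds one at the end.
void appendRecord(Log &log, const std::string &key, const std::string &value) {
    log.emplace_back(key, value);
}

// Rebuilds the dictionary by reading the log front to back; later records
// overwrite earlier ones, and an empty value removes the key.
std::map<std::string, std::string> replay(const Log &log) {
    std::map<std::string, std::string> dict;
    for (const auto &record : log) {
        if (record.second.empty()) {
            dict.erase(record.first);
        } else {
            dict[record.first] = record.second;
        }
    }
    return dict;
}
```

After four appends (set a, set b, set a again, delete b) the log holds four records but the replayed dictionary holds a single key. The gap between the two is exactly the garbage that a full write-back removes, which is why every append has to check the remaining space first.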

Set Object

Now let's look at how an NSObject in Objective-C is stored.

bool MMKV::set(NSObject<NSCoding> *__unsafe_unretained obj, MMKVKey_t key) {
    if (isKeyEmpty(key)) {
        return false;
    }
    if (!obj) {
        removeValueForKey(key);
        return true;
    }
    MMBuffer data;
    if (MiniPBCoder::isCompatibleObject(obj)) {
        data = MiniPBCoder::encodeDataWithObject(obj);
    } else {
        /*if ([object conformsToProtocol:@protocol(NSCoding)])*/ {
            auto tmp = [NSKeyedArchiver archivedDataWithRootObject:obj];
            if (tmp.length > 0) {
                data = MMBuffer(tmp);
            }
        }
    }

    return setDataForKey(std::move(data), key);
}

MiniPBCoder only supports basic data types plus NSString, NSData, and NSDate for Objective-C:

bool MiniPBCoder::isCompatibleObject(NSObject *obj) {
    if ([obj isKindOfClass:[NSString class]]) {
        return true;
    }
    if ([obj isKindOfClass:[NSData class]]) {
        return true;
    }
    if ([obj isKindOfClass:[NSDate class]]) {
        return true;
    }

    return false;
}
Copy the code

Any other NSObject is archived into NSData by NSKeyedArchiver through the NSCoding protocol and then stored.

Getter

bool getBool(MMKVKey_t key, bool defaultValue = false);
...
#ifdef MMKV_APPLE
    NSObject *getObject(MMKVKey_t key, Class cls);
#else  // !defined(MMKV_APPLE)
    mmkv::MMBuffer getBytes(MMKVKey_t key);
    bool getVector(MMKVKey_t key, std::vector<std::string> &result);
#endif // MMKV_APPLE

Take the bool as an example:

bool MMKV::getBool(MMKVKey_t key, bool defaultValue) {
    if (isKeyEmpty(key)) {
        return defaultValue;
    }
    SCOPED_LOCK(m_lock);
    auto &data = getDataForKey(key);
    if (data.length() > 0) {
        try {
            CodedInputData input(data.getPtr(), data.length());
            return input.readBool();
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
    return defaultValue;
}

Reading is simpler: getDataForKey returns the MMBuffer, and CodedInputData decodes it to get the bool.

getDataForKey

const MMBuffer &MMKV::getDataForKey(MMKVKey_t key) {
    checkLoadData();
    auto itr = m_dic.find(key);
    if (itr != m_dic.end()) {
        return itr->second;
    }
    static MMBuffer nan;
    return nan;
}

Get Object

NSObject *MMKV::getObject(MMKVKey_t key, Class cls) {
    if (isKeyEmpty(key) || !cls) {
        return nil;
    }
    SCOPED_LOCK(m_lock);
    auto &data = getDataForKey(key);
    if (data.length() > 0) {
        if (MiniPBCoder::isCompatibleClass(cls)) {
            try {
                auto result = MiniPBCoder::decodeObject(data, cls);
                return result;
            } catch (std::exception &exception) {
                MMKVError("%s", exception.what());
            }
        } else {
            if ([cls conformsToProtocol:@protocol(NSCoding)]) {
                auto tmp = [NSData dataWithBytesNoCopy:data.getPtr() length:data.length() freeWhenDone:NO];
                return [NSKeyedUnarchiver unarchiveObjectWithData:tmp];
            }
        }
    }
    return nil;
}

This one is also fairly simple, so I won't expand on it.

conclusion

Better to wrongly kill a thousand than to let one slip away.

That was my first impression after reading the core logic of MMKV. Why?

MMKV is a framework for multi-process reading and writing. Careful readers will notice that every method performs many exception checks before its real logic runs, and that the protection and fault tolerance around dirty data are rather convoluted. It is hard to grasp what a method means unless you have read all of it. By comparison, the CocoaLumberjack code is very friendly: the role of each key field, the core logic, and some of the underlying principles are all commented in detail.

This article skips MiniPB's codec logic and the read/write lock protection and focuses on the core logic of reading and writing files. Whenever MMKV meets an exception, it sets various flags and then reloads. The whole framework keeps adding protection around loadFromFile: file locks, CRC checks, and dirty-data write-back.

If you have read this far, you will have seen that this article follows the call chain downward layer by layer, trying to keep the context of each method as cohesive as possible. I hope it helps you understand the core logic of MMKV.