Preface

App performance optimization is a never-ending road. This article introduces MMKV, a lightweight storage framework open-sourced by Tencent that improves the efficiency of local storage

The project currently uses SharedPreferences (SP) for lightweight storage. Although SP has excellent compatibility, it has long been criticized for its poor performance, and there have been several SP-related ANRs in production

There are many SP optimization schemes on the Internet. The author’s approach was to hook SP creation in the Application and replace it with a custom SP to improve performance. However, SDK 28 blocks reflective access to QueuedWork.getHandler, so this approach no longer works there

Therefore, an alternative lightweight storage solution is needed, and MMKV is one such framework

1. Integration and testing

Here is the basic usage; check out the Wiki for more details

Adding the dependency

In the build.gradle file of the App module, add:

dependencies {
    implementation 'com.tencent:mmkv:1.0.22'
    // replace "1.0.22" with any available version
}

Initialization

String dir = getFilesDir().getAbsolutePath() + "/mmkv_2";
String rootDir = MMKV.initialize(dir);
Log.i("MMKV", "mmkv root: " + rootDir);

Getting an instance

// Get the default global instance
MMKV kv = MMKV.defaultMMKV();

// Create a separate instance identified by its own ID
MMKV kv = MMKV.mmkvWithID("MyID");

// Create an instance that supports multi-process access
MMKV kv = MMKV.mmkvWithID("MyID", MMKV.MULTI_PROCESS_MODE);

CRUD

// Add/update data
kv.encode(key, value);
// Query data
int tmp = kv.decodeInt(key);
// Delete data
kv.removeValueForKey(key);

SP migration

private void testImportSharedPreferences() {
    MMKV mmkv = MMKV.mmkvWithID("myData");
    SharedPreferences old_man = getSharedPreferences("myData", MODE_PRIVATE);
    // Migrate the old data
    mmkv.importFromSharedPreferences(old_man);
    // Clear the old data
    old_man.edit().clear().commit();
    ......
}

Test data

Below are the results of synchronously writing and reading 1000 entries with MMKV, SharedPreferences, and SQLite

// MMKV
MMKV: MMKV write int: loop[1000]: 12 ms
MMKV: MMKV read int: loop[1000]: 3 ms

MMKV: MMKV write String: loop[1000]: 7 ms
MMKV: MMKV read String: loop[1000]: 4 ms

// SharedPreferences
MMKV: SharedPreferences write int: loop[1000]: 119 ms
MMKV: SharedPreferences read int: loop[1000]: 3 ms

MMKV: SharedPreferences write String: loop[1000]: 187 ms
MMKV: SharedPreferences read String: loop[1000]: 2 ms

// SQLite
MMKV: sqlite write int: loop[1000]: 101 ms
MMKV: sqlite read int: loop[1000]: 136 ms

MMKV: sqlite write String: loop[1000]: 29 ms
MMKV: sqlite read String: loop[1000]: 93 ms

The results show that MMKV has a large write-performance advantage over both SP and SQLite, while its read times are on par with SP. The official benchmark results are as follows

See the wiki for more detailed performance testing

Having seen how MMKV is used and how it performs, I became very curious about how it is implemented. Next, let’s see how it achieves this performance, focusing on MMKV’s basic operations

  • Initialization
  • Instantiation
  • encode
  • decode
  • Inter-process read/write synchronization

Let’s start with the initialization process

2. Initialization

public class MMKV implements SharedPreferences, SharedPreferences.Editor {
    
    // call on program start
    public static String initialize(Context context) {
        String root = context.getFilesDir().getAbsolutePath() + "/mmkv";
        return initialize(root, null);
    }

    static private String rootDir = null;

    public static String initialize(String rootDir, LibLoader loader) {
        ......
        // Save the root directory
        MMKV.rootDir = rootDir;
        // Native-layer initialization
        jniInitialize(MMKV.rootDir);
        return rootDir;
    }
    
    private static native void jniInitialize(String rootDir);
    
}

MMKV is initialized by passing the root directory to the Native layer via jniInitialize. Let’s look at the initialization operation of Native

// native-bridge.cpp
namespace mmkv {
    
MMKV_JNI void jniInitialize(JNIEnv *env, jobject obj, jstring rootDir) {
    if (!rootDir) {
        return;
    }
    const char *kstr = env->GetStringUTFChars(rootDir, nullptr);
    if (kstr) {
        MMKV::initializeMMKV(kstr);
        env->ReleaseStringUTFChars(rootDir, kstr);
    }
}
    
}

// MMKV.cpp

static unordered_map<std::string, MMKV *> *g_instanceDic;
static ThreadLock g_instanceLock;
static std::string g_rootDir;

void initialize() {
    // 1.1 Create an unordered_map, similar to Java's HashMap
    g_instanceDic = new unordered_map<std::string, MMKV *>;
    // 1.2 Initialize the thread lock
    g_instanceLock = ThreadLock();
    ......
}

void MMKV::initializeMMKV(const std::string &rootDir) {
    // 1. pthread_once (built on a thread mutex and condition variable) ensures that
    //    initialize is executed only once in a process
    //    https://blog.csdn.net/zhangxiao93/article/details/51910043
    static pthread_once_t once_control = PTHREAD_ONCE_INIT;
    pthread_once(&once_control, initialize);
    // 2. Save the root directory to the global variable
    g_rootDir = rootDir;
    // Copy the string
    char *path = strdup(g_rootDir.c_str());
    if (path) {
        // 3. Create the target directory from the path
        mkPath(path);
        // Release the copy
        free(path);
    }
}

You can see that the main tasks in initializeMMKV are to initialize the data and create the root directory

  • pthread_once: similar to a Java singleton, it guarantees that the initialize function is executed only once in the process
    • Creates g_instanceDic, the hash table that caches MMKV objects
    • Creates the thread lock g_instanceLock
  • mkPath: creates the directory hierarchy described by a path string
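The once-only initialization described above can be sketched with pthread_once directly. This is a minimal illustration rather than MMKV’s own code: g_cache, g_initCalls, initializeOnce and ensureInitialized are hypothetical names, and an int-valued map stands in for the instance cache:

```cpp
#include <pthread.h>
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the globals created during initialization.
static std::unordered_map<std::string, int> *g_cache = nullptr;
static int g_initCalls = 0;

// Runs at most once per process, however many callers race to trigger it.
static void initializeOnce() {
    g_cache = new std::unordered_map<std::string, int>();
    ++g_initCalls;
}

// Safe to call repeatedly; only the first call performs the initialization.
// Returns how many times initializeOnce has actually run.
int ensureInitialized() {
    static pthread_once_t onceControl = PTHREAD_ONCE_INIT;
    int rc = pthread_once(&onceControl, initializeOnce);
    assert(rc == 0);
    (void) rc;
    return g_initCalls;
}
```

Calling ensureInitialized() from several places (or threads) leaves g_initCalls at 1, which is the property initializeMMKV relies on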

Let’s take a look at the directory creation process

Directory creation

// MmapedFile.cpp
bool mkPath(char *path) {
    // struct stat describes the attributes of a file
    struct stat sb = {};
    bool done = false;
    // Point to the start of the string
    char *slash = path;

    while (!done) {
        // Skip over any leading "/" characters
        slash += strspn(slash, "/");
        // Move to the next "/"
        slash += strcspn(slash, "/");

        done = (*slash == '\0');
        *slash = '\0';

        if (stat(path, &sb) != 0) {
            // C has no equivalent of mkdirs, so the directories are created one level at a time
            if (errno != ENOENT || mkdir(path, 0777) != 0) {
                MMKVWarning("%s : %s", path, strerror(errno));
                return false;
            }
        }
        // If the path exists but is not a directory, it is invalid
        else if (!S_ISDIR(sb.st_mode)) {
            MMKVWarning("%s : %s", path, strerror(ENOTDIR));
            return false;
        }

        *slash = '/';
    }
    return true;
}

The above is a common code for creating file paths at the Native layer. The logic is clear
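To make the strspn/strcspn walk easier to follow, here is a small sketch that performs the same traversal but only records each prefix instead of calling stat and mkdir. The function name pathPrefixes is an assumption for illustration, and unlike mkPath it works on its own copy of the path:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Returns every directory prefix that a mkPath-style loop would visit,
// in order, without touching the filesystem.
std::vector<std::string> pathPrefixes(const std::string &path) {
    std::vector<std::string> prefixes;
    // Mutable, NUL-terminated copy of the path (the original uses strdup).
    std::vector<char> buf(path.begin(), path.end());
    buf.push_back('\0');

    char *slash = buf.data();
    bool done = false;
    while (!done) {
        slash += std::strspn(slash, "/");   // skip a run of '/'
        slash += std::strcspn(slash, "/");  // advance to the next '/' or the end
        done = (*slash == '\0');
        *slash = '\0';                      // temporarily terminate the prefix
        prefixes.emplace_back(buf.data());
        if (!done) {
            *slash = '/';                   // restore the separator and continue
        }
    }
    return prefixes;
}
```

For "/a/b/c" this yields "/a", "/a/b" and "/a/b/c", which is exactly the order in which the directories have to be created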

Ok, with the file directory created, the initialization of the Native layer is finished. Let’s look at the MMKV instance building process

3. Instantiation

public class MMKV implements SharedPreferences, SharedPreferences.Editor {

    @Nullable
    public static MMKV mmkvWithID(String mmapID, int mode, String cryptKey, String relativePath) {
        ......
        // Ask the native layer for the instance handle
        long handle = getMMKVWithID(mmapID, mode, cryptKey, relativePath);
        if (handle == 0) {
            return null;
        }
        // Build a Java shell object
        return new MMKV(handle);
    }

    private native static long getMMKVWithID(String mmapID, int mode, String cryptKey, String relativePath);

    // jni
    private long nativeHandle;

    private MMKV(long handle) {
        nativeHandle = handle;
    }
}

You can see that the main logic of building an MMKV instance is implemented by the getMMKVWithID method. What does it do internally?

// native-bridge.cpp
namespace mmkv {

MMKV_JNI jlong getMMKVWithID(
    JNIEnv *env, jobject, jstring mmapID, jint mode, jstring cryptKey, jstring relativePath) {
    MMKV *kv = nullptr;
    if (!mmapID) {
        return (jlong) kv;
    }
    // Obtain the independent storage id
    string str = jstring2string(env, mmapID);

    bool done = false;
    // An encryption key was passed in
    if (cryptKey) {
        string crypt = jstring2string(env, cryptKey);
        if (crypt.length() > 0) {
            // A relative path was passed in
            if (relativePath) {
                string path = jstring2string(env, relativePath);
                kv = MMKV::mmkvWithID(str, DEFAULT_MMAP_SIZE, (MMKVMode) mode, &crypt, &path);
            } else {
                kv = MMKV::mmkvWithID(str, DEFAULT_MMAP_SIZE, (MMKVMode) mode, &crypt, nullptr);
            }
            done = true;
        }
    }
    ......
    // Return the native pointer to Java as a handle
    return (jlong) kv;
}

}

You can see that the MMKV object is finally obtained by the MMKV::mmkvWithID function

// MMKV.cpp
MMKV *MMKV::mmkvWithID(
    const std::string &mmapID, int size, MMKVMode mode, string *cryptKey, string *relativePath) {

    if (mmapID.empty()) {
        return nullptr;
    }
    SCOPEDLOCK(g_instanceLock);
    // 1. Combine mmapID and relativePath into the key of the final mmap file path
    auto mmapKey = mmapedKVKey(mmapID, relativePath);
    // 2. Look the key up in the instance cache and return the cached instance on a hit
    auto itr = g_instanceDic->find(mmapKey);
    if (itr != g_instanceDic->end()) {
        MMKV *kv = itr->second;
        return kv;
    }
    // 3. Create the backing file
    if (relativePath) {
        // Resolve the path of the mmap file for this mmapID
        auto filePath = mappedKVPathWithID(mmapID, mode, relativePath);
        // If it does not exist, create the file
        if (!isFileExist(filePath)) {
            if (!createFile(filePath)) {
                return nullptr;
            }
        }
        ......
    }
    // 4. Create the instance object
    auto kv = new MMKV(mmapID, size, mode, cryptKey, relativePath);
    // 5. Cache the instance under mmapKey
    (*g_instanceDic)[mmapKey] = kv;
    return kv;
}
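The lookup-then-create flow of mmkvWithID boils down to a lock-protected instance cache. The sketch below shows just that pattern; Store, storeWithID, g_instances and g_instancesLock are hypothetical names, and std::mutex is used here in place of MMKV’s ThreadLock:

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an MMKV instance.
struct Store {
    std::string id;
    explicit Store(std::string storeId) : id(std::move(storeId)) {}
};

static std::unordered_map<std::string, Store *> g_instances;
static std::mutex g_instancesLock;

// Returns the cached instance for `id`, creating and caching it on first use.
Store *storeWithID(const std::string &id) {
    if (id.empty()) {
        return nullptr;
    }
    // Lookup and insertion happen under one lock, so two threads asking for
    // the same id can never create two instances.
    std::lock_guard<std::mutex> guard(g_instancesLock);
    auto it = g_instances.find(id);
    if (it != g_instances.end()) {
        return it->second;
    }
    Store *store = new Store(id);
    g_instances[id] = store;
    return store;
}
```

Because the instances are cached for the lifetime of the process, repeated calls with the same ID are cheap and always return the same pointer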

The mmkvWithID function implementation process is very clear, here we focus on the instance object creation process

// MMKV.cpp
MMKV::MMKV(
    const std::string &mmapID, int size, MMKVMode mode, string *cryptKey, string *relativePath)
    // Path of the data file
    : m_path(mappedKVPathWithID(m_mmapID, mode, relativePath))
    // Path of the .crc file
    , m_crcPath(crcPathWithID(m_mmapID, mode, relativePath))
    // 1. Map the file summary information into memory, 4 KB in size
    , m_metaFile(m_crcPath, DEFAULT_MMAP_SIZE, (mode & MMKV_ASHMEM) ? MMAP_ASHMEM : MMAP_FILE)
    ......
    , m_sharedProcessLock(&m_fileLock, SharedLockType)
    ......
    , m_isAshmem((mode & MMKV_ASHMEM) != 0) {
    ......
    // Check whether Ashmem (anonymous shared memory) is used for cross-process sharing
    if (m_isAshmem) {
        // Create the shared memory file
        m_ashmemFile = new MmapedFile(m_mmapID, static_cast<size_t>(size), MMAP_ASHMEM);
        m_fd = m_ashmemFile->getFd();
    } else {
        m_ashmemFile = nullptr;
    }
    // Create an AES encryption/decryption engine based on cryptKey
    if (cryptKey && cryptKey->length() > 0) {
        m_crypter = new AESCrypt((const unsigned char *) cryptKey->data(), cryptKey->length());
    }
    ......
    // sensitive zone
    {
        SCOPEDLOCK(m_sharedProcessLock);
        // 2. Load the data from the file
        loadFromFile();
    }
}

MMKV’s constructor reveals a lot of interesting information. For example, MMKV supports Ashmem shared memory: when we do not want to write files to disk but still want to share data across processes, we can use the MMAP_ASHMEM mode provided by MMKV

Here, though, let’s focus on two key points

  • m_metaFile: the mapping of the file summary
  • loadFromFile: the loading of the data

Let’s first look at the mapping of the file summary information

1) Mapping of file summaries

// MmapedFile.cpp
MmapedFile::MmapedFile(const std::string &path, size_t size, bool fileType)
    : m_name(path), m_fd(-1), m_segmentPtr(nullptr), m_segmentSize(0), m_fileType(fileType) {
    // File used for memory mapping
    if (m_fileType == MMAP_FILE) {
        // 1. Open the file
        m_fd = open(m_name.c_str(), O_RDWR | O_CREAT, S_IRWXU);
        if (m_fd < 0) {
            MMKVError("fail to open:%s, %s", m_name.c_str(), strerror(errno));
        } else {
            // 2. Create the file lock
            FileLock fileLock(m_fd);
            InterProcessLock lock(&fileLock, ExclusiveLockType);
            SCOPEDLOCK(lock);
            // Obtain the stat information of the file
            struct stat st = {};
            if (fstat(m_fd, &st) != -1) {
                // Get the file size
                m_segmentSize = static_cast<size_t>(st.st_size);
            }
            // 3. Check whether the file is smaller than one memory page, which is usually 4 KB
            if (m_segmentSize < DEFAULT_MMAP_SIZE) {
                m_segmentSize = static_cast<size_t>(DEFAULT_MMAP_SIZE);
                // 3.1 ftruncate grows the file to the size of one memory page
                // 3.2 zeroFillFile fills the newly added part with zeros
                if (ftruncate(m_fd, m_segmentSize) != 0 || !zeroFillFile(m_fd, 0, m_segmentSize)) {
                    // The file expansion failed, remove the file
                    close(m_fd);
                    m_fd = -1;
                    removeFile(m_name);
                    return;
                }
            }
            // 4. Map the file with mmap and get the start address of the mapping
            m_segmentPtr = (char *) mmap(nullptr, m_segmentSize, PROT_READ | PROT_WRITE, MAP_SHARED, m_fd, 0);
            if (m_segmentPtr == MAP_FAILED) {
                MMKVError("fail to mmap [%s], %s", m_name.c_str(), strerror(errno));
                close(m_fd);
                m_fd = -1;
                m_segmentPtr = nullptr;
            }
        }
    }
    // File used for shared memory
    else {
        ......
    }
}

The constructor of MmapedFile does the following

  • Opens the specified file
  • Creates the file lock
  • Pads the file to a minimum size of 4 KB
    • The first 4 KB is used to record the total data size
  • Maps the file into memory with mmap

Once the MmapedFile constructor has run, we have the start address of the mapped memory. From then on, the Linux kernel is responsible for synchronizing the data in that memory to the file

mmap is a much more elegant way to persist data than SP’s approach. Even if the process dies unexpectedly, the Linux kernel still writes the modified mapped memory back to the file, which improves the reliability of data writes
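The open, ftruncate, mmap sequence described above can be tried out in isolation. The sketch below is an assumption-laden illustration, not MMKV’s code: writeThroughMmap is a hypothetical helper, the system page size is used as the mapping size, and the result is read back with pread to show that a plain memory write reaches the file:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Writes `text` into a file through a shared mapping, then reads it back with
// an ordinary pread(). Returns the text read back, or "" on any failure.
std::string writeThroughMmap(const std::string &path, const std::string &text) {
    const size_t mapSize = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    if (text.size() >= mapSize) {
        return "";
    }
    int fd = open(path.c_str(), O_RDWR | O_CREAT, S_IRWXU);
    if (fd < 0) {
        return "";
    }
    // Size the file to one page so the whole mapping is backed by the file.
    if (ftruncate(fd, static_cast<off_t>(mapSize)) != 0) {
        close(fd);
        return "";
    }
    void *mem = mmap(nullptr, mapSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) {
        close(fd);
        return "";
    }
    // A plain memory write: no write() system call is involved.
    std::memcpy(mem, text.data(), text.size());
    munmap(mem, mapSize);

    std::string readBack(text.size(), '\0');
    ssize_t n = pread(fd, &readBack[0], readBack.size(), 0);
    close(fd);
    unlink(path.c_str());
    return n == static_cast<ssize_t>(text.size()) ? readBack : "";
}
```

For example, writeThroughMmap("/tmp/mmkv_mmap_sketch.bin", "hello mmkv") returns "hello mmkv" on a typical Linux system. MAP_SHARED is what makes the write visible to the file; with MAP_PRIVATE the change would stay in the process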

Let’s see the data load

2) Data loading

// MMKV.cpp
void MMKV::loadFromFile() {
    ...... // Code related to anonymous shared memory is omitted
    // If the summary file has already been mapped
    if (m_metaFile.isFileValid()) {
        // Read the summary data
        m_metaInfo.read(m_metaFile.getMemory());
    }
    // Open the data file and get its file descriptor
    m_fd = open(m_path.c_str(), O_RDWR | O_CREAT, S_IRWXU);
    if (m_fd < 0) {
        MMKVError("fail to open:%s, %s", m_path.c_str(), strerror(errno));
    } else {
        // 1. Get the file size
        m_size = 0;
        struct stat st = {0};
        if (fstat(m_fd, &st) != -1) {
            m_size = static_cast<size_t>(st.st_size);
        }
        // 1.1 Round the file size up to a whole multiple of the memory page size
        if (m_size < DEFAULT_MMAP_SIZE || (m_size % DEFAULT_MMAP_SIZE != 0)) {
            ......
        }
        // 2. Map the file and get the start address of the mapping
        m_ptr = (char *) mmap(nullptr, m_size, PROT_READ | PROT_WRITE, MAP_SHARED, m_fd, 0);
        if (m_ptr == MAP_FAILED) {
            ......
        } else {
            // 3. Read the actual data size from the file header
            memcpy(&m_actualSize, m_ptr, Fixed32Size);
            ......
            bool loadFromFile = false, needFullWriteback = false;
            if (m_actualSize > 0) {
                // 4. Verify the length of the file
                if (m_actualSize < m_size && m_actualSize + Fixed32Size <= m_size) {
                    // 5. Verify that the CRC is correct
                    if (checkFileCRCValid()) {
                        loadFromFile = true;
                    } else {
                        // Callback: the CRC check failed
                        auto strategic = mmkv::onMMKVCRCCheckFail(m_mmapID);
                        if (strategic == OnErrorRecover) {
                            loadFromFile = true;
                            needFullWriteback = true;
                        }
                    }
                } else {
                    // Callback: the file length is abnormal
                    auto strategic = mmkv::onMMKVFileLengthError(m_mmapID);
                    if (strategic == OnErrorRecover) {
                        writeAcutalSize(m_size - Fixed32Size);
                        loadFromFile = true;
                        needFullWriteback = true;
                    }
                }
            }
            // 6. The data needs to be loaded from the file
            if (loadFromFile) {
                ......
                // Wrap the mapped data in an input buffer without copying it
                MMBuffer inputBuffer(m_ptr + Fixed32Size, m_actualSize, MMBufferNoCopy);
                if (m_crypter) {
                    // Decrypt the data in the input buffer
                    decryptBuffer(*m_crypter, inputBuffer);
                }
                // Read the data from the input buffer into m_dic
                m_dic.clear();
                MiniPBCoder::decodeMap(m_dic, inputBuffer);
                // Position the output right after the existing data
                m_output = new CodedOutputData(m_ptr + Fixed32Size + m_actualSize,
                                               m_size - Fixed32Size - m_actualSize);
                // Write everything back to eliminate duplicate data
                if (needFullWriteback) {
                    fullWriteback();
                }
            }
            // 7. There is no data in the file, or the verification failed
            else {
                SCOPEDLOCK(m_exclusiveProcessLock);
                // Empty the file
                if (m_actualSize > 0) {
                    writeAcutalSize(0);
                }
                m_output = new CodedOutputData(m_ptr + Fixed32Size, m_size - Fixed32Size);
                // Recalculate the CRC
                recaculateCRCDigest();
            }
            ......
        }
    }
    ......
    m_needLoadFromFile = false;
}

Ok, you can see that loadFromFile reads the data from the file into the m_dic cache if the length and CRC checks pass (or the error callback asks for recovery); otherwise it empties the file

  • Therefore, if someone tampers with the file, the CRC check fails and the stored data is discarded
  • After the data is read into m_dic, m_dic is written back to the file in full when needFullWriteback is set (in the code above, on the recovery paths); the purpose of the rewrite is to eliminate duplicate data
    • Why duplicate data occurs is analyzed later, in the encode operation
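The length check followed by the CRC check can be sketched as below. The article does not say which CRC variant MMKV uses, so a standard CRC-32 (the one zlib computes) is assumed here; crc32Of and shouldLoad are hypothetical names, and the recovery callbacks are left out:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
uint32_t crc32Of(const std::string &bytes) {
    uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : bytes) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Accept the payload only if its length fits inside the file and the
// checksum stored alongside it matches the recomputed one.
bool shouldLoad(const std::string &payload, size_t fileSize, uint32_t storedCrc) {
    if (payload.empty() || payload.size() >= fileSize) {
        return false;
    }
    return crc32Of(payload) == storedCrc;
}
```

Any change to the payload bytes that is not accompanied by a matching change of the stored checksum makes shouldLoad return false, which corresponds to the branch where the file is emptied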

3) Review

At this point the MMKV instance is built, and with the in-memory cache m_dic we can query the data more efficiently

In terms of the end result, it behaves like SP: all the data in the file is loaded into a hash table on first load. However, MMKV may need an extra full write-back step, so when the amount of data is large, instance construction can be slowed down to some extent

E/TAG: create MMKV instance time is 4 ms
E/TAG: create SharedPreferences instance time is 1 ms

So MMKV is indeed somewhat slower at instance construction. However, because m_dic is written into mmap-backed memory, the actual flush to disk is scheduled by the Linux kernel through the file’s page cache, so the cost is not unacceptable

4. encode

encode is the process of adding and updating data. encodeString is taken as an example here

public class MMKV implements SharedPreferences, SharedPreferences.Editor {

    public boolean encode(String key, String value) {
        return encodeString(nativeHandle, key, value);
    }
    
    private native boolean encodeString(long handle, String key, String value);

}

Look at the implementation of the Native layer

// native-bridge.cpp
namespace mmkv {

MMKV_JNI jboolean encodeString(JNIEnv *env, jobject, jlong handle, jstring oKey, jstring oValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        // If value is not null
        if (oValue) {
            // Store the data through the setStringForKey function
            string value = jstring2string(env, oValue);
            return (jboolean) kv->setStringForKey(value, key);
        }
        // If value is null, remove the value corresponding to the key
        else {
            kv->removeValueForKey(key);
            return (jboolean) true;
        }
    }
    return (jboolean) false;
}

}

So the part to focus on here is setStringForKey

// MMKV.cpp
bool MMKV::setStringForKey(const std::string &value, const std::string &key) {
    if (key.empty()) {
        return false;
    }
    // 1. Encode the data as a ProtocolBuffer
    auto data = MiniPBCoder::encodeDataWithObject(value);
    // 2. Update the key-value pair
    return setDataForKey(std::move(data), key);
}

There are two main steps

  • Data encoding
  • Update key-value pairs

1) Data encoding

MMKV uses ProtocolBuffer encoding, which is not introduced here. Please check Google’s official documentation for details

// MiniPBCoder.cpp
MMBuffer MiniPBCoder::getEncodeData(const string &str) {
    // 1. Create the list of encode items
    m_encodeItems = new vector<PBEncodeItem>();
    // 2. Prepare the encode item for the string and get its index
    size_t index = prepareObjectForEncode(str);
    PBEncodeItem *oItem = (index < m_encodeItems->size()) ? &(*m_encodeItems)[index] : nullptr;
    if (oItem && oItem->compiledSize > 0) {
        // 3. Allocate a buffer of the encoded size
        m_outputBuffer = new MMBuffer(oItem->compiledSize);
        // Create the output object that writes into the buffer
        m_outputData = new CodedOutputData(m_outputBuffer->getPtr(), m_outputBuffer->length());
        // Perform the ProtocolBuffer encoding and write the result to the buffer
        writeRootObject();
    }
    // Call the move constructor to create and return a new instance
    return move(*m_outputBuffer);
}

size_t MiniPBCoder::prepareObjectForEncode(const string &str) {
    // 2.1 Create a PBEncodeItem describing the entry to encode and add it to the list
    m_encodeItems->push_back(PBEncodeItem());
    // 2.2 Get the item that was just added
    PBEncodeItem *encodeItem = &(m_encodeItems->back());
    // 2.3 Record its index in the list
    size_t index = m_encodeItems->size() - 1;
    {
        // 2.4 Set the item type to String
        encodeItem->type = PBEncodeItemType_String;
        // 2.5 Fill in the data to encode
        encodeItem->value.strValue = &str;
        // 2.6 Fill in the data size
        encodeItem->valueSize = static_cast<int32_t>(str.size());
    }
    // 2.7 Compute the size after encoding
    encodeItem->compiledSize = pbRawVarint32Size(encodeItem->valueSize) + encodeItem->valueSize;
    return index;
}

As you can see, the size of the encoded data is determined before encoding and saved to encodeItem->compiledSize. Let’s look at the process of encoding the data and writing it to the buffer

// MiniPBCoder.cpp
void MiniPBCoder::writeRootObject() {
    for (size_t index = 0, total = m_encodeItems->size(); index < total; index++) {
        PBEncodeItem *encodeItem = &(*m_encodeItems)[index];
        switch (encodeItem->type) {
            // We mainly care about encoding a String
            case PBEncodeItemType_String: {
                m_outputData->writeString(*(encodeItem->value.strValue));
                break;
            }
            ......
        }
    }
}

// CodedOutputData.cpp
void CodedOutputData::writeString(const string &value) {
    size_t numberOfBytes = value.size();
    ......
    // 1. Write the length of the string as a varint32
    this->writeRawVarint32((int32_t) numberOfBytes);
    // 2. Copy the bytes of the string into the buffer
    memcpy(m_ptr + m_position, ((uint8_t *) value.data()), numberOfBytes);
    // Update the value of position
    m_position += numberOfBytes;
}

You can see that writeString of CodedOutputData encodes the string in the protocol buffer format: a varint length followed by the raw bytes

Here m_ptr is the address of the memory buffer created above, so by the time writeString finishes, the data has already been written to the buffer
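The "varint length, then raw bytes" layout, and the size calculation that precedes it, can be reproduced in a few lines. This is a standalone sketch that writes into a std::vector instead of MMKV’s buffer; varint32Size, writeVarint32, encodeLengthPrefixed and decodeLengthPrefixed are hypothetical names standing in for pbRawVarint32Size, writeRawVarint32 and the string paths shown above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Bytes needed to store a 32-bit value as a base-128 varint (7 payload bits per byte).
size_t varint32Size(uint32_t value) {
    size_t size = 1;
    while (value >= 0x80u) {
        value >>= 7;
        ++size;
    }
    return size;
}

// Appends the varint: low 7 bits first, high bit set on every byte except the last.
void writeVarint32(std::vector<uint8_t> &out, uint32_t value) {
    while (value >= 0x80u) {
        out.push_back(static_cast<uint8_t>((value & 0x7Fu) | 0x80u));
        value >>= 7;
    }
    out.push_back(static_cast<uint8_t>(value));
}

// Encodes a string as [varint length][raw bytes]; the size is known up front.
std::vector<uint8_t> encodeLengthPrefixed(const std::string &str) {
    const uint32_t length = static_cast<uint32_t>(str.size());
    std::vector<uint8_t> out;
    out.reserve(varint32Size(length) + str.size());
    writeVarint32(out, length);
    out.insert(out.end(), str.begin(), str.end());
    return out;
}

// Reads one length-prefixed string back; returns "" if the input is malformed.
std::string decodeLengthPrefixed(const std::vector<uint8_t> &in) {
    uint32_t length = 0;
    size_t pos = 0;
    bool terminated = false;
    for (int shift = 0; pos < in.size() && shift <= 28; shift += 7) {
        const uint8_t byte = in[pos++];
        length |= static_cast<uint32_t>(byte & 0x7Fu) << shift;
        if ((byte & 0x80u) == 0) {
            terminated = true;
            break;
        }
    }
    if (!terminated || length > in.size() - pos) {
        return "";
    }
    auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
    return std::string(first, first + static_cast<std::ptrdiff_t>(length));
}
```

Short strings pay only one byte of overhead: "hi" encodes to the three bytes 0x02 'h' 'i', while a 300-byte string gets the two-byte prefix 0xAC 0x02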

Now that you have the encoded data buffer, let’s look at updating the key-value pair

2) Update of key/value pairs

// MMKV.cpp
bool MMKV::setStringForKey(const std::string &value, const std::string &key) {
    // Encode the data and obtain the buffer that stores it
    auto data = MiniPBCoder::encodeDataWithObject(value);
    // Update the key-value pair
    return setDataForKey(std::move(data), key);
}

bool MMKV::setDataForKey(MMBuffer &&data, const std::string &key) {
    ......
    // Append the key-value pair to the memory of the mapped file
    auto ret = appendDataWithKey(data, key);
    // The write succeeded, update the hash table
    if (ret) {
        m_dic[key] = std::move(data);
        m_hasFullWriteback = false;
    }
    return ret;
}

bool MMKV::appendDataWithKey(const MMBuffer &data, const std::string &key) {
    // 1. Compute the size of the data to append
    size_t keyLength = key.length();
    size_t size = keyLength + pbRawVarint32Size((int32_t) keyLength);
    size += data.length() + pbRawVarint32Size((int32_t) data.length());
    SCOPEDLOCK(m_exclusiveProcessLock);
    // 2. Make sure there is enough space, reorganizing or expanding the file if necessary
    bool hasEnoughSize = ensureMemorySize(size);
    if (!hasEnoughSize || !isFileValid()) {
        return false;
    }
    // 3. Update the total data size in the file header
    writeAcutalSize(m_actualSize + size);
    // 4. Write the key and the encoded value to the memory of the mapped file
    m_output->writeString(key);
    m_output->writeData(data);
    // 5. Get the start address of the data that was just written
    auto ptr = (uint8_t *) m_ptr + Fixed32Size + m_actualSize - size;
    if (m_crypter) {
        // Encrypt the newly written region
        m_crypter->encrypt(ptr, ptr, size);
    }
    // 6. Update the CRC
    updateCRCDigest(ptr, size, KeepSequence);
    return true;
}

Ok, as you can see, updating a key-value pair is a little more involved: the key-value data is first written to the file-mapped memory, and the hash table is updated only after the write succeeds

The comments in the code above cover the process of writing to the file mapping. Next, let’s see how ensureMemorySize reorganizes and expands the data

Reorganization and expansion of data

// MMKV.cpp
bool MMKV::ensureMemorySize(size_t newSize) {
    ......
    // Compute the size of the new key-value pair
    constexpr size_t ItemSizeHolderSize = 4;
    if (m_dic.empty()) {
        newSize += ItemSizeHolderSize;
    }
    // The data is rewritten when:
    // 1. The file has less free space than the new key-value pair needs
    // 2. The hash table is empty
    if (newSize >= m_output->spaceLeft() || m_dic.empty()) {
        // Compute the required space
        static const int offset = pbFixed32Size(0);
        MMBuffer data = MiniPBCoder::encodeDataWithObject(m_dic);
        size_t lenNeeded = data.length() + offset + newSize;
        if (m_isAshmem) {
            ......
        } else {
            // Compute the average size of each key-value pair
            size_t avgItemSize = lenNeeded / std::max<size_t>(1, m_dic.size());
            // Estimate the space that future writes will use
            size_t futureUsage = avgItemSize * std::max<size_t>(8, (m_dic.size() + 1) / 2);
            // The file is expanded when:
            // 1. Required space >= total size of the current file
            // 2. 1.5 times the required space >= total size of the current file
            if (lenNeeded >= m_size || (lenNeeded + futureUsage) >= m_size) {
                // Double the capacity until it is large enough
                size_t oldSize = m_size;
                do {
                    m_size *= 2;
                } while (lenNeeded + futureUsage >= m_size);
                ......
            }
        }
        ......
        // Write back the reorganized data
        writeAcutalSize(data.length());
        ......
    }
    return true;
}

From the above code we can see that

  • When the data is rewritten
    • The remaining file space is smaller than the new key-value pair
    • The hash table is empty
  • When the file is expanded
    • When the required space plus the estimated future usage (roughly 1.5 times the required space) reaches the total size of the current file, the file is doubled repeatedly until it is large enough
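The expansion rule is easiest to check with concrete numbers. The function below isolates the arithmetic from the listing above into a pure function; grownFileSize is a hypothetical name, and its parameters correspond to m_size, lenNeeded and m_dic.size():

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Returns the file size after applying the doubling rule: grow while the
// required space plus the estimated future usage does not fit.
size_t grownFileSize(size_t fileSize, size_t lenNeeded, size_t itemCount) {
    assert(fileSize > 0);
    // Average encoded size of one key-value pair.
    const size_t avgItemSize = lenNeeded / std::max<size_t>(1, itemCount);
    // Reserve room for at least 8 more items, or half the current item count.
    const size_t futureUsage = avgItemSize * std::max<size_t>(8, (itemCount + 1) / 2);
    if (lenNeeded >= fileSize || lenNeeded + futureUsage >= fileSize) {
        do {
            fileSize *= 2;
        } while (lenNeeded + futureUsage >= fileSize);
    }
    return fileSize;
}
```

With a 4096-byte file, 5000 bytes needed and 10 items, the estimate is 5000 + 500 * 8 = 9000 bytes, so the file is doubled twice to 16384 bytes. With only 100 bytes needed, the estimate is 180 bytes and the file stays at 4096 bytes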

3) Review

That completes the encode process. Let’s review it as a whole

  • Encode the value with ProtocolBuffer
  • Append the key and the encoded value, in ProtocolBuffer format, to the end of the file-mapped memory
    • If the file does not have enough space
      • Determine whether to expand the file
      • Write the data back
    • In other words, the file is written by appending
  • Encrypt the region occupied by this key-value pair
  • Update the CRC value
  • Store the key and the ProtocolBuffer buffer of the value in the hash table m_dic
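Append-only writing is also the reason duplicate keys appear in the file and why a full write-back removes them. The sketch below models this with a deliberately simplified record format, which is an assumption for illustration: one length byte instead of a varint, no header, no encryption and no CRC. appendRecord, replay and compact are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Appends one [keyLen][key][valueLen][value] record to the end of the log.
void appendRecord(std::vector<uint8_t> &log, const std::string &key, const std::string &value) {
    assert(key.size() < 256 && value.size() < 256);
    log.push_back(static_cast<uint8_t>(key.size()));
    log.insert(log.end(), key.begin(), key.end());
    log.push_back(static_cast<uint8_t>(value.size()));
    log.insert(log.end(), value.begin(), value.end());
}

// Replays the log front to back; a later record for a key overwrites the earlier one.
std::unordered_map<std::string, std::string> replay(const std::vector<uint8_t> &log) {
    std::unordered_map<std::string, std::string> dict;
    size_t pos = 0;
    while (pos < log.size()) {
        const size_t keyLen = log[pos++];
        if (pos + keyLen >= log.size()) {
            break;  // truncated record: no room for the value length byte
        }
        auto keyBegin = log.begin() + static_cast<std::ptrdiff_t>(pos);
        std::string key(keyBegin, keyBegin + static_cast<std::ptrdiff_t>(keyLen));
        pos += keyLen;
        const size_t valueLen = log[pos++];
        if (pos + valueLen > log.size()) {
            break;  // truncated value
        }
        auto valueBegin = log.begin() + static_cast<std::ptrdiff_t>(pos);
        dict[key] = std::string(valueBegin, valueBegin + static_cast<std::ptrdiff_t>(valueLen));
        pos += valueLen;
    }
    return dict;
}

// Full write-back: rebuild the log from the map so that each key appears once.
std::vector<uint8_t> compact(const std::unordered_map<std::string, std::string> &dict) {
    std::vector<uint8_t> log;
    for (const auto &entry : dict) {
        appendRecord(log, entry.first, entry.second);
    }
    return log;
}
```

Writing a=1, b=2 and then a=3 produces three records (12 bytes in this format). Replaying them yields a=3 and b=2, and compacting that map shrinks the log to two records (8 bytes)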

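From the analysis of encode, we know how an MMKV file is laid out: a 4-byte header holding the actual data size, followed by key-value records appended one after another, each consisting of a length-prefixed key and a length-prefixed value

Next, let’s see how decode works

5. decode

decodeString is again used as the example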

// native-bridge.cpp
MMKV_JNI jstring
decodeString(JNIEnv *env, jobject obj, jlong handle, jstring oKey, jstring oDefaultValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        // Fetch the value through getStringForKey
        string value;
        bool hasValue = kv->getStringForKey(key, value);
        if (hasValue) {
            return string2jstring(env, value);
        }
    }
    return oDefaultValue;
}

// MMKV.cpp
bool MMKV::getStringForKey(const std::string &key, std::string &result) {
    if (key.empty()) {
        return false;
    }
    SCOPEDLOCK(m_lock);
    // 1. Fetch the ProtocolBuffer buffer for the key from the memory cache
    auto &data = getDataForKey(key);
    if (data.length() > 0) {
        // 2. Decode the ProtocolBuffer data into the result
        result = MiniPBCoder::decodeString(data);
        return true;
    }
    return false;
}

const MMBuffer &MMKV::getDataForKey(const std::string &key) {
    auto itr = m_dic.find(key);
    if (itr != m_dic.end()) {
        return itr->second;
    }
    static MMBuffer nan(0);
    return nan;
}

The decode process is relatively simple. First, the ProtocolBuffer buffer for the key is fetched from the memory cache; then that buffer is decoded to obtain the actual value

Thoughts

One question that might come up is why m_dic does not store the raw key and value data directly, which would make queries faster

  • Indeed, queries would be faster because there would be no ProtocolBuffer decoding step

As the test results show, MMKV’s read performance is slightly lower than that of SharedPreferences. Here is my own thinking

  • m_dic also plays an important role in data reorganization. The rewrite relies on m_dic to write the data into the mmap file region, which is already time-consuming. If m_dic stored raw values, every value would have to be ProtocolBuffer-encoded again during the rewrite, and with a large data set the cost of that could not be ignored

Since m_dic exists to make the data rewrite convenient, could an additional memory cache be added that is dedicated to storing the original values?

  • MMKV is a lightweight storage framework, so the improvement in query time would not justify the extra memory. I think this is a reasonable trade-off that Tencent arrived at after weighing the various factors

6. Process read and write synchronization

When it comes to inter-process read/write synchronization, Linux semaphores naturally come to mind. However, that approach has drawbacks: if the process holding the lock dies unexpectedly, the semaphore is not released, and when multiple processes are competing, a blocked process may never be woken up, which is dangerous

MMKV uses file locks to synchronize between processes

  • LOCK_SH(shared lock): Multiple processes can use the same lock, often used as a read shared lock
  • LOCK_EX(exclusive lock): Allowed to be used by only one process at a time. Often used as a write lock
  • LOCK_UN: releases the lock
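The three flags above belong to the flock system call, and their interaction can be observed even inside a single process, because on Linux each open() of the file gets its own lock state. The sketch below is illustrative only; sharedBlocksExclusive is a hypothetical helper, and LOCK_NB is added so that a refused lock returns immediately instead of blocking:

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

// Returns true if an exclusive lock is refused while a shared lock is held
// through another open file description, and is granted after the release.
bool sharedBlocksExclusive(const char *path) {
    int reader = open(path, O_RDWR | O_CREAT, S_IRWXU);
    int writer = open(path, O_RDWR | O_CREAT, S_IRWXU);
    if (reader < 0 || writer < 0) {
        if (reader >= 0) close(reader);
        if (writer >= 0) close(writer);
        return false;
    }
    // Take the shared (read) lock.
    bool ok = flock(reader, LOCK_SH) == 0;
    // The exclusive (write) lock must be refused while the shared lock is held.
    ok = ok && flock(writer, LOCK_EX | LOCK_NB) == -1 && errno == EWOULDBLOCK;
    // Release the shared lock; now the exclusive lock can be taken.
    ok = ok && flock(reader, LOCK_UN) == 0;
    ok = ok && flock(writer, LOCK_EX | LOCK_NB) == 0;

    flock(writer, LOCK_UN);
    close(reader);
    close(writer);
    unlink(path);
    return ok;
}
```

Because the kernel drops the lock when the file descriptor is closed, including when the owning process dies, a crashed process cannot leave the lock held forever, which is the advantage over semaphores mentioned above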

Next, let’s look at MMKV’s lock and unlock operations

1) Shared file lock

MMKV::MMKV(
    const std::string &mmapID, int size, MMKVMode mode, string *cryptKey, string *relativePath)
    // Create the file lock
    : m_fileLock(m_metaFile.getFd())
    // Describes the shared lock
    , m_sharedProcessLock(&m_fileLock, SharedLockType)
    // Describes the exclusive lock
    , m_exclusiveProcessLock(&m_fileLock, ExclusiveLockType)
    // Whether cross-process access is enabled
    , m_isInterProcess((mode & MMKV_MULTI_PROCESS) != 0 || (mode & CONTEXT_MODE_MULTI_PROCESS) != 0)
    , m_isAshmem((mode & MMKV_ASHMEM) != 0) {
    ......
    // Enable or disable the shared and exclusive locks depending on whether cross-process access is used
    m_sharedProcessLock.m_enable = m_isInterProcess;
    m_exclusiveProcessLock.m_enable = m_isInterProcess;
    // sensitive zone
    {
        // Take the shared file lock
        SCOPEDLOCK(m_sharedProcessLock);
        loadFromFile();
    }
}

You can see that in the constructor we analyzed earlier, MMKV initializes the file lock and creates the shared and exclusive locks. Both are enabled only for cross-process operation, and the shared lock is taken when the data is read from the file

2) Exclusive file lock

bool MMKV::fullWriteback() {
    ......
    auto allData = MiniPBCoder::encodeDataWithObject(m_dic);
    // Take the exclusive lock
    SCOPEDLOCK(m_exclusiveProcessLock);
    if (allData.length() > 0) {
        if (allData.length() + Fixed32Size <= m_size) {
            if (m_crypter) {
                m_crypter->reset();
                auto ptr = (unsigned char *) allData.getPtr();
                m_crypter->encrypt(ptr, ptr, allData.length());
            }
            writeAcutalSize(allData.length());
            delete m_output;
            m_output = new CodedOutputData(m_ptr + Fixed32Size, m_size - Fixed32Size);
            m_output->writeRawData(allData); // note: don't write size of data
            recaculateCRCDigest();
            m_hasFullWriteback = true;
            return true;
        } else {
            // ensureMemorySize will extend file & full rewrite, no need to write back again
            return ensureMemorySize(allData.length() + Fixed32Size - m_size);
        }
    }
    return false;
}

An exclusive lock is enabled in a function that writes back data

3) Read and write performance

The performance of synchronized reads and writes is as follows

As you can see, reads and writes remain efficient under inter-process synchronization

That covers cross-process synchronization. MMKV’s file locking is not as simple as it looks, though. A file lock is a state lock: no matter how many times it is locked, a single unlock releases it completely, so lock and unlock calls cannot be paired across nested function calls. MMKV therefore implements a counter internally to make the lock reentrant. See the wiki for more details
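The counter idea can be sketched as follows. This only illustrates the nesting problem and is not MMKV’s implementation, which according to the text also has to handle shared and exclusive modes; RawLock and CountingLock are hypothetical types, and RawLock merely simulates a lock whose single unlock releases everything:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical raw lock with file-lock-like semantics: it is either held or
// not, and one unlock releases it no matter how often lock was called.
struct RawLock {
    bool held = false;
    int lockCalls = 0;
    int unlockCalls = 0;
    void lock() { held = true; ++lockCalls; }
    void unlock() { held = false; ++unlockCalls; }
};

// Counts nested acquisitions so that only the outermost lock/unlock pair
// reaches the raw lock.
class CountingLock {
public:
    explicit CountingLock(RawLock &raw) : m_raw(raw) {}

    void lock() {
        if (m_depth++ == 0) {
            m_raw.lock();
        }
    }

    void unlock() {
        assert(m_depth > 0);
        if (--m_depth == 0) {
            m_raw.unlock();
        }
    }

    size_t depth() const { return m_depth; }

private:
    RawLock &m_raw;
    size_t m_depth = 0;
};
```

With this wrapper, a function that takes the lock can safely call another function that takes it again: the inner unlock only decrements the counter, and the raw lock is released when the outermost caller unlocks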

Conclusion

After the analysis above, we have an overall picture of MMKV. Its characteristics are summarized below

Item | Rating | Description
Correctness | Excellent | Supports multi-process safety, and relies on mmap so that the operating system writes the data back correctly
Time overhead | Excellent | Implemented with mmap, which reduces the copying of data from user space to kernel space
Space overhead | Medium | Stores data with Protocol Buffers, so the same data takes less space than XML or JSON. Data is appended to the end of the file, and key merging is triggered only when a threshold is reached, so until then multiple copies of the same key can exist
Security | Medium | Uses a CRC check to detect abnormal data caused by an unstable file system or operating system
Development cost | Excellent | Simple to use
Compatibility | Excellent | Compatible with all Android versions

MMKV is slightly slower than SP in some scenarios: on first instantiation the data may be rewritten to eliminate duplicates, and every query involves a ProtocolBuffer decode. Even so, its far higher write speed, the data synchronization that mmap and the Linux kernel guarantee, and the smaller local storage footprint that ProtocolBuffer encoding brings are all strong points

In the process of analyzing MMKV code, I learned a lot from it. Thanks to Tencent for their contribution to the open source community

References

  • Github.com/Tencent/MMK…
  • Developers.google.com/protocol-bu…
  • Time.geekbang.org/column/arti…
  • www.cnblogs.com/kex1n/p/710…