MMKV source code analysis

Initialization: MMKV.initialize(this)

Initialization is the link between the Java API and the native core: it fixes the root directory and hands control to the C++ implementation.

public static String initialize(Context context) {
    // Get the root path
    String root = context.getFilesDir().getAbsolutePath() + "/mmkv";
    MMKVLogLevel logLevel = MMKVLogLevel.LevelInfo;
    return initialize(root, (MMKV.LibLoader) null, logLevel); // 1 -->
}

// 1 -->
public static String initialize(String rootDir, MMKV.LibLoader loader, MMKVLogLevel logLevel) {
    // Load the required .so files
    //...
    // Initialize the C++ implementation through JNI
    jniInitialize(MMKV.rootDir, logLevel2Int(logLevel));
    return rootDir;
}

From this point on, the library artifact imported as a dependency no longer shows the implementation, so it is better to download the MMKV open source code and read it directly.

If you are not familiar with how JNI methods are dispatched, don't worry; I will walk through it step by step.

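You can see that a JNI method, jniInitialize, is exposed here. How do you locate its native implementation? Use the IDE's global search (the Shift shortcut) and search for initializeMMKV.

The search returns two methods. Open them and you will find the C++ implementation, which is the code we are after.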

void MMKV::initializeMMKV(const MMKVPath_t &rootDir, MMKVLogLevel logLevel) {
    // ThreadOnce indicates that the initialization process takes place only once
    ThreadLock::ThreadOnce(&once_control, initialize);

    g_rootDir = rootDir;
    // Set the destination path and create the hierarchy if it does not exist.
    mkPath(g_rootDir);
}
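The run-once behavior noted in the comment above can be sketched in standard C++. This is a minimal illustration that uses a function-local static instead of MMKV's ThreadLock::ThreadOnce; the names initializeStore, initCount and storeRoot are hypothetical and do not come from the library.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the library's globals; not MMKV's real code.
static int g_initCount = 0;
static std::string g_root;

// One-time setup; counts its own invocations so the effect can be observed.
static bool runOneTimeSetup() {
    ++g_initCount;
    return true;
}

// Calling this repeatedly runs runOneTimeSetup only on the first call;
// the root directory is simply overwritten on every call.
void initializeStore(const std::string &rootDir) {
    assert(!rootDir.empty());
    // C++11 guarantees that a function-local static is initialized exactly once.
    static const bool initialized = runOneTimeSetup();
    (void) initialized;
    g_root = rootDir;
}

int initCount() { return g_initCount; }
const std::string &storeRoot() { return g_root; }
```

Calling initializeStore twice with different directories leaves initCount() at 1 while storeRoot() reflects the last directory passed in, which mirrors the split between the once-only part and the plain assignment in initializeMMKV.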

Getting an instance: MMKV.defaultMMKV()

public static MMKV defaultMMKV() {
    // Multi-process mode can be selected here instead
    // The key question is what this call returns
    long handle = getDefaultMMKV(SINGLE_PROCESS_MODE, null); // *** -->
    return new MMKV(handle); // 1 -->
}

// 1 -->
// The handle is a variable of type long
private MMKV(long handle) {
    nativeHandle = handle;
}

The line marked *** looks puzzling at first. From the declaration we can tell that the value is a long, and we can guess that it is used to locate the corresponding MMKV instance. Following the call further shows that it invokes mmkvWithID(), where DEFAULT_MMAP_ID is mmkv.default.

MMKV *MMKV::mmkvWithID(const string &mmapID, int size, MMKVMode mode, string *cryptKey, string *relativePath) {
    //...
    auto mmapKey = mmapedKVKey(mmapID, relativePath); // 1 -->
    auto itr = g_instanceDic->find(mmapKey);
    if (itr != g_instanceDic->end()) {
        MMKV *kv = itr->second; // 2 -->
        return kv;
    }
    if (relativePath) {
        if (!isFileExist(*relativePath)) {
            if (!mkPath(*relativePath)) {
                return nullptr;
            }
        }
    }
    auto kv = new MMKV(mmapID, size, mode, cryptKey, relativePath); // 3 -->
    (*g_instanceDic)[mmapKey] = kv;
    return kv;
}

There are two core parts in this code:

  1. Computing the mmapKey: the key is derived from mmapID combined with relativePath. See the implementation of mmapedKVKey for the exact rule.
  2. Creating the MMKV instance: this corresponds to comments 2 and 3. Instances are kept in a map, g_instanceDic. If the key is already present, the cached instance is returned; otherwise a new instance is created and stored in the map.
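The lookup-or-create pattern described in these two points can be sketched as follows. Store, makeKey and storeWithID are hypothetical names, and the key rule (joining the relative path and the ID with a slash) is an assumption made only for illustration; the real mmapedKVKey may combine the two values differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a store instance; not MMKV's real class.
struct Store {
    std::string key;
    explicit Store(std::string k) : key(std::move(k)) {}
};

// Assumption: the key simply joins the relative path and the ID.
std::string makeKey(const std::string &id, const std::string *relativePath) {
    if (relativePath && !relativePath->empty()) {
        return *relativePath + "/" + id;
    }
    return id;
}

// Return the cached instance for this key, creating it on first use.
Store *storeWithID(const std::string &id, const std::string *relativePath = nullptr) {
    assert(!id.empty());
    static std::unordered_map<std::string, std::unique_ptr<Store>> instances;
    const std::string key = makeKey(id, relativePath);
    auto it = instances.find(key);
    if (it != instances.end()) {
        return it->second.get(); // already created: reuse it
    }
    auto inserted = instances.emplace(key, std::unique_ptr<Store>(new Store(key)));
    return inserted.first->second.get();
}
```

Two calls with the same ID return the same pointer, while the same ID under a different relative path yields a separate instance. Note that this sketch is not thread-safe; a real registry would guard the map with a lock.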

Internal structure of MMKV

MMKV::MMKV(const string &mmapID, int size, MMKVMode mode, string *cryptKey, string *relativePath)
    : m_mmapID(mmapedKVKey(mmapID, relativePath)) // historically Android mistakenly use mmapKey as mmapID
    , m_path(mappedKVPathWithID(m_mmapID, mode, relativePath)) // 1 -->
    , m_crcPath(crcPathWithID(m_mmapID, mode, relativePath))
    , m_dic(nullptr)
    , m_dicCrypt(nullptr)
    , m_file(new MemoryFile(m_path, size, (mode & MMKV_ASHMEM) ? MMFILE_TYPE_ASHMEM : MMFILE_TYPE_FILE))
    , m_metaFile(new MemoryFile(m_crcPath, DEFAULT_MMAP_SIZE, m_file->m_fileType))
    , m_metaInfo(new MMKVMetaInfo())
    , m_crypter(nullptr)
    , m_lock(new ThreadLock())
    , m_fileLock(new FileLock(m_metaFile->getFd(), (mode & MMKV_ASHMEM)))
    , m_sharedProcessLock(new InterProcessLock(m_fileLock, SharedLockType))
    , m_exclusiveProcessLock(new InterProcessLock(m_fileLock, ExclusiveLockType))
    , m_isInterProcess((mode & MMKV_MULTI_PROCESS) != 0 || (mode & CONTEXT_MODE_MULTI_PROCESS) != 0) {
    m_actualSize = 0;
    m_output = nullptr;

    if (cryptKey && cryptKey->length() > 0) {
        m_dicCrypt = new MMKVMapCrypt();
        m_crypter = new AESCrypt(cryptKey->data(), cryptKey->length());
    } else {
        m_dic = new MMKVMap();
    }
    //... Some assignment operations

    // sensitive zone
    {
        SCOPED_LOCK(m_sharedProcessLock);
        loadFromFile();
    }
}

Just like SharedPreferences, MMKV ultimately has to read and write a file. So the question is: if both do file I/O, why can MMKV reportedly be a hundred times faster than mature alternatives? I can think of several possible explanations:

  1. Less robust handling of bad data. If you write a simplified FastJson, you will find that it can process data very quickly, precisely because it skips the edge cases. For a relatively mature product this is usually not the explanation.
  2. Data processing is done at the native layer. To some extent this is a reasonable explanation for the gap between the two: MMKV's implementation is dispatched almost entirely through JNI, and the speed difference between C/C++ and Java speaks for itself.
  3. A better file access scheme. This is a guess at this stage, because we have not read the relevant code yet. Both SharedPreferences and MMKV have to read and write data, and the data ultimately lives in a local file, so a cheaper way to read and write that file is bound to be a key breakthrough.
  4. Anything else is left for you to think about.

Back to business: loadFromFile()

The guesses above mention file reading and writing. MMKV cannot avoid file I/O either; the point to think about is how it does the I/O more efficiently.

void MMKV::loadFromFile() {
    // Reload the file if it is not valid
    if (!m_file->isFileValid()) {
        m_file->reloadFromFile();
    }
    // If the file is still invalid, report an error
    if (!m_file->isFileValid()) {
        MMKVError("file [%s] not valid", m_path.c_str());
    } else {
        // Reaching this point means the file is valid, but the data still needs to be verified
        // error checking
        bool loadFromFile = false, needFullWriteback = false;
        checkDataValid(loadFromFile, needFullWriteback);
        auto ptr = (uint8_t *) m_file->getMemory();
        // loading
        if (loadFromFile && m_actualSize > 0) {
            MMBuffer inputBuffer(ptr + Fixed32Size, m_actualSize, MMBufferNoCopy);
            if (m_crypter) {
                clearDictionary(m_dicCrypt);
            } else {
                clearDictionary(m_dic);
            }
            // 1 -->
            if (needFullWriteback) {
                if (m_crypter) {
                    MiniPBCoder::greedyDecodeMap(*m_dicCrypt, inputBuffer, m_crypter); // 2 -->
                } else {
                    MiniPBCoder::greedyDecodeMap(*m_dic, inputBuffer); // 2 -->
                }
            } else {
                // 1 -->
                if (m_crypter) {
                    MiniPBCoder::decodeMap(*m_dicCrypt, inputBuffer, m_crypter); // 2 -->
                } else {
                    MiniPBCoder::decodeMap(*m_dic, inputBuffer); // 2 -->
                }
            }
            m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
            m_output->seek(m_actualSize); // Move the write position past the existing data
            if (needFullWriteback) {
                fullWriteback();
            }
        } else {
            // Discard data if it is invalid or empty.
            //...
        }
    }

    m_needLoadFromFile = false;
}

In this snippet I have marked comments 1 and 2, which I consider the most important. They correspond to two decisions:

  1. Choosing the write-back strategy: this is a distinctive design point. The variable name needFullWriteback already tells you that there is a choice between a partial write-back and a full write-back. This is the first reason MMKV's overall performance can beat SharedPreferences.
  2. Choosing the file format: this is really a question of parsing cost. The argument comes from the MMKV principle document: protobuf was chosen as MMKV's format because it does well in both speed and size.
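One reason a protobuf-style encoding stays small is its variable-length integer (varint) rule: small numbers take one byte, and each byte carries 7 payload bits plus a continuation flag. The sketch below shows the general protobuf rule, not MMKV's MiniPBCoder; writeVarint and readVarint are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append value as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last.
void writeVarint(std::vector<uint8_t> &out, uint32_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<uint8_t>(value));
}

// Decode the varint that starts at pos and advance pos past it.
uint32_t readVarint(const std::vector<uint8_t> &in, std::size_t &pos) {
    uint32_t result = 0;
    int shift = 0;
    while (true) {
        assert(pos < in.size() && shift < 35); // a 32-bit varint has at most 5 bytes
        const uint8_t byte = in[pos++];
        result |= static_cast<uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            break; // high bit clear: this was the last byte
        }
        shift += 7;
    }
    return result;
}
```

With this rule the value 1 occupies a single byte and 300 occupies two (0xAC 0x02), whereas a fixed-width 32-bit field would always take four.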

Updating data: kv.encodeXXX("string", XXX)

Only one of these methods is analyzed here as an example.

MMKV_JNI jboolean encodeBool(JNIEnv *env, jobject, jlong handle, jstring oKey, jboolean value) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle); // 1 -->
    if (kv && oKey) {
        string key = jstring2string(env, oKey); // Convert the Java string key to a native string
        return (jboolean) kv->set((bool) value, key); // 2 -->
    }
    return (jboolean) false;
}
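The reinterpret_cast in encodeBool is the second half of a common JNI idiom: the native side hands the address of a C++ object to Java as an integer, and every later call casts that integer back to a pointer. A minimal sketch of the round trip without any JNI involved; Counter, createHandle, increment and destroyHandle are hypothetical names, and int64_t stands in for jlong.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical native object that the managed side wants to refer to.
struct Counter {
    int value = 0;
};

// Expose the object's address as an integer handle.
int64_t createHandle() {
    Counter *c = new Counter();
    return static_cast<int64_t>(reinterpret_cast<intptr_t>(c));
}

// Cast the handle back to a pointer before every use.
// Returns -1 for a null handle, otherwise the incremented value.
int increment(int64_t handle) {
    Counter *c = reinterpret_cast<Counter *>(static_cast<intptr_t>(handle));
    if (!c) {
        return -1;
    }
    return ++c->value;
}

// The handle owns the object, so it must be released explicitly.
void destroyHandle(int64_t handle) {
    assert(handle != 0);
    delete reinterpret_cast<Counter *>(static_cast<intptr_t>(handle));
}
```

The handle carries no type information, so the scheme only works if the value is never altered and is never used after destroyHandle has been called.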

Focus on the two numbered comments in encodeBool:

  1. Note 1: This is the long handle mentioned earlier at the Java layer. Here it is cast back to a pointer, which recovers the MMKV object created when the instance was obtained.
  2. Note 2: With the key and value in place, the next step is to follow the call deeper into the code.

bool MMKV::set(bool value, MMKVKey_t key) {
    // 1. Measure the data and create a buffer of the same size
    size_t size = pbBoolSize();
    MMBuffer data(size);
    // 2. Wrap the buffer in a CodedOutputData object for writing
    CodedOutputData output(data.getPtr(), size);
    output.writeBool(value); // 3 -->
    // This is the actual store operation.
    // Following the call chain, it ends in a write to the file.
    return setDataForKey(move(data), key);
}

// 3 -->
void CodedOutputData::writeBool(bool value) {
    // Use 0 and 1 to represent the final value
    this->writeRawByte(static_cast<uint8_t>(value ? 1 : 0));
}

However, according to the official documentation, this file format has a limitation: it does not support incremental updates, which would make every write more expensive. What is Tencent's solution?

Standard Protobuf does not provide incremental update capability; each write must be written in full. Considering that the main usage scenario is frequent write update, we need the ability of incremental update: append the incremental KV object directly to the end of memory after serialization; In this way, there will be several new and old copies of the same key, with the latest data at the end. Therefore, when the program starts and turns on MMKV for the first time, the data can be guaranteed to be up-to-date and effective by constantly replacing the previous value with the value read later.

In short, new or changed entries are appended at the end, and on load a later value replaces any earlier value for the same key.

Keeping old and new data side by side is bound to make the file grow. What does MMKV propose for that?
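The append-and-replay idea can be sketched with an in-memory record list standing in for the mapped file. AppendLog and its members are hypothetical names; the real implementation works on serialized bytes rather than on string pairs.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Record = std::pair<std::string, std::string>;

// Hypothetical in-memory stand-in for the file: an append-only record list.
struct AppendLog {
    std::vector<Record> records;

    // An update never rewrites earlier records; it only appends a new one.
    void set(const std::string &key, const std::string &value) {
        assert(!key.empty());
        records.emplace_back(key, value);
    }

    // Rebuild the dictionary by replaying records in order,
    // so a later record overwrites an earlier one for the same key.
    std::unordered_map<std::string, std::string> replay() const {
        std::unordered_map<std::string, std::string> dict;
        for (const Record &r : records) {
            dict[r.first] = r.second;
        }
        return dict;
    }
};
```

After setting name twice and age once, the log holds three records but the replayed dictionary holds two keys, with name bound to the value written last.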

Space is allocated in units of the memory page size, and data is appended until the space is used up. When the append reaches the end of the file, the file is reorganized: duplicate keys are removed and the result is serialized back. If there is still not enough space after the reorganization, the file size is doubled until there is enough.

Again in one sentence: a rewrite of the file that is triggered when the file reaches its current size limit.

The ensureMemorySize() method in MMKV_IO handles this. It works from the total size of the existing data, because a large file is usually the result of the same data being appended many times.
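The growth rule quoted above (compact first, double only if that is not enough) can be sketched as two small functions. This is a simplified illustration under stated assumptions, not the logic of ensureMemorySize(): needsCompaction and capacityAfterCompaction are hypothetical names, and any headroom the real code reserves for future writes is ignored.

```cpp
#include <cassert>
#include <cstddef>

// True when appending `incoming` bytes would run past the end of the file,
// which is the moment the duplicate-removing rewrite is triggered.
bool needsCompaction(std::size_t used, std::size_t capacity, std::size_t incoming) {
    return used + incoming > capacity;
}

// liveSize is the number of bytes left once duplicates are dropped.
// Keeps the capacity if compaction alone makes room, otherwise doubles it
// until the live data plus the new record fit.
std::size_t capacityAfterCompaction(std::size_t capacity, std::size_t liveSize,
                                    std::size_t incoming) {
    assert(capacity > 0);
    const std::size_t needed = liveSize + incoming;
    while (capacity < needed) {
        capacity *= 2;
    }
    return capacity;
}
```

For a 4096-byte file, a compaction that leaves 1000 live bytes keeps the size at 4096 for a 100-byte append, while one that leaves 4000 live bytes forces a single doubling to 8192 for a 200-byte append.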

Reading data: kv.decodeXXX("string")

MMKV_JNI jboolean decodeBool(JNIEnv *env, jobject, jlong handle, jstring oKey, jboolean defaultValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        return (jboolean) kv->getBool(key, defaultValue);
    }
    return defaultValue;
}

The basic logic is almost the same as for writing: first obtain the corresponding MMKV object, then read the data.

bool MMKV::getBool(MMKVKey_t key, bool defaultValue) {
    auto data = getDataForKey(key);
    if (data.length() > 0) {
        CodedInputData input(data.getPtr(), data.length());
        return input.readBool();
    }
    return defaultValue;
}

The stored bytes are wrapped in a CodedInputData object to read the value; if no data exists for the key, the default value is returned.

Deleting data: kv.removeValueForKey("string")

Before looking at the code, how would you implement this based on what we already know?

The following are the points we should pay attention to:

  1. protobuf is a format that does not support incremental updates. MMKV's solution is to append at the tail, so old and new data coexist in the file.
  2. Following from point 1, coexisting old and new data raises a problem for queries and deletes. Because old copies remain, a delete that had to find them would scan the file again and again: with 1 GB of data, you would have to scan up to 1 GB before you could even start deleting.

With these problems in mind, here is MMKV's solution.

auto itr = m_dic->find(key);
if (itr != m_dic->end()) {
    m_hasFullWriteback = false;
    static MMBuffer nan; // ******
    auto ret = appendDataWithKey(nan, itr->second); // ******
    if (ret.first) {
#ifdef MMKV_APPLE
        [itr->first release];
#endif
        m_dic->erase(itr);
    }
    return ret.first;
}

Focus on the lines marked with asterisks. An unassigned MMBuffer means the data is empty, and appendDataWithKey() is the same append-to-file path used for writes. In protobuf terms, the record that ends up in the file looks like this:

message empty{
	
}

In other words, deleting a key just appends an empty value for it as the newest record.

Conclusion

After analyzing the source code, MMKV's advantages over SharedPreferences can be summarized as follows:
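Appending an empty value as a delete marker is often called a tombstone. A minimal sketch of how such markers behave when the file is replayed on load; TombstoneLog and its members are hypothetical names, and strings stand in for the serialized buffers.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Record = std::pair<std::string, std::string>;

// Hypothetical append-only log in which an empty value marks a deletion.
struct TombstoneLog {
    std::vector<Record> records;

    void set(const std::string &key, const std::string &value) {
        assert(!value.empty()); // empty is reserved for the delete marker
        records.emplace_back(key, value);
    }

    // Deleting never searches for old copies; it appends one empty record.
    void remove(const std::string &key) {
        records.emplace_back(key, std::string());
    }

    // Replay in order: an empty value erases the key, anything else overwrites it.
    std::unordered_map<std::string, std::string> replay() const {
        std::unordered_map<std::string, std::string> dict;
        for (const Record &r : records) {
            if (r.second.empty()) {
                dict.erase(r.first);
            } else {
                dict[r.first] = r.second;
            }
        }
        return dict;
    }
};
```

The cost of a delete is therefore one small append, independent of how many old copies of the key are already in the file; those copies disappear only when the file is next rewritten.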

  1. Use of mmap: memory mapping avoids the copy and commit overhead that SharedPreferences pays on every save.
  2. Data update method: an update is appended at the tail instead of rewriting the whole file as SharedPreferences does. Be aware that this approach increases the amount of redundant data.
  3. Design for safe multi-process access: details can be found in "Design and implementation of MMKV for Android multi-process". The design builds on mmap so that a process can detect what other processes have done to the file, in three cases: write pointer growth, memory reorganization, and memory growth.

References

  1. MMKV official documentation
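To make the first point concrete, the sketch below stores bytes into a file through a shared memory mapping and then reads them back with ordinary file I/O. It uses only POSIX calls (open, ftruncate, mmap, msync, munmap, pread), so it assumes a POSIX system; roundTripThroughMmap and the fixed 4096-byte size are hypothetical choices for illustration, not MMKV code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Store `text` into a file through a shared mapping, then read it back with pread().
// Returns true if the bytes stored through the mapping are visible via file I/O.
bool roundTripThroughMmap(const char *path, const std::string &text) {
    assert(path != nullptr && !text.empty());
    const std::size_t size = 4096; // fixed mapping size chosen for this sketch
    if (text.size() > size) {
        return false;
    }
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        return false;
    }
    // The file must be at least as large as the region being mapped.
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return false;
    }
    void *mem = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) {
        close(fd);
        return false;
    }
    std::memcpy(mem, text.data(), text.size()); // a plain memory store, no write() call
    msync(mem, size, MS_SYNC);                  // flush the mapped pages to the file
    munmap(mem, size);

    std::string back(text.size(), '\0');
    const ssize_t n = pread(fd, &back[0], back.size(), 0);
    close(fd);
    unlink(path); // remove the temporary file
    return n == static_cast<ssize_t>(back.size()) && back == text;
}
```

The update itself is the memcpy: once the mapping exists, changing the file is a memory store, and the operating system is responsible for getting the dirty pages to disk.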