preface

In this article, I’ll walk through the Sizzle’s entire filtering process. Everything from how it finds a seed set, how it converts tokens into filtering rules, to how it filters through rules. I’m going to illustrate this with an example. The token conversion to filter rules is very convoluted, especially with the Sizzle and the caching logic, and the most complicated is actually the caching. My personal description may not let a person listen to very understand, so people are interested in, can combine my introduction to look at the source code, I now also is only read for more than eighty percent, the cache of the relevant code and I was not particularly thorough understanding, so here I can only give you to analyse my own understanding, the main process of select elements.

Example: Sizzle(‘.container input[type=text]’)

Sizzle selection principle

The Sizzle is not selected from left to right, instead of selecting ‘.container’ and then looking for the input below. This may seem reasonable, but it is time-consuming, because as the DOM tree branches down, the Sizzle will first find a seed set at the end of the selector, and then work its way up through the seed set to see if it meets the criteria.

So how do you choose a seed? So that’s what the select function does.

Sizzle.select

So this function basically does two things.

  1. Will select the stringtokenize
  2. To find outseed

A selection string may have multiple relational selectors, such as body P > INPUT :disabled. If we use these relational selectors as a split, we can get several sets of selectors. The seed is the element selector, the ID selector, or the class selector in the last set of selectors. If the last set of selectors does not have any of these three selectors, then there is no seed.

In the example above, the seed is all the inputs in the entire document.

If, when setDocument support. GetElementsByClass = false words, then ` seed ` does not include the class selector

Examples of tokenize

function

select = Sizzle.select = function(selector, context, results, seed) {
    var i, tokens, token, type, find,
        compiled = type selector === 'function'&& selector, match = ! seed && tokenize( (selector = compiled.selector || selector) ); results = results || [];// Select a string without a comma.
    if (match.length === 1) {
        tokens = match[0] = match[0].slice(0);
        if (tokens.length > 2 && documentIsHTML && Expr.relative[tokens[1].type] ) {
            context = (Expr.find["ID"](token.matches[0]
                .replace(runescape, funescape), context) || [])[0];
            if(! context) {return results;
            } else if (compiled) {
               context = context.parentNode 
            }
            selector = selector.silce(tokens.shift().value.length);
        }
        i = matchExpr['needsContxt'].test(selector) ? 0 : tokens.length;
        
        // Start looking for seed
        while (i--) {
            token = tokens[i];
            // From the back to the front, if the relation selector is encountered, it will not be found
            if (Expr.relative[(type = token.type)]) {
                break;
            }
            
            // expr. find has at most three properties, which are set when you setDocument
            // TAG CLASS ID
            if ((find = Expr.find[type])) {
                if ( (seed = find(
                    token.matches[0].replace(runescape, funescape),
                    rsibling.test(tokens[0].type) && testContext(context.parentNode) ||
                        context
                ) ) ) {
                    tokens.splice(i, 1);
                    // Reassemble the selector because the seed has been extracted
                    Container [type=text]. Container [type=text]
                    selector = seed.length && toSelector(tokens);
                    if(! selector) { push.apply(results, seed);return results
                    }
                    break; } } } } (compiled || compile(selecotr, match)) ( seed, context, ! documentIsHTML, results, ! context || rsibling.test(selector) && testContext(context.parentNode) || context );}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}
    return results;
}
Copy the code

compile

Compile is not actually a function to generate rules, it is a total entry, the main function is to generate the rule cache, from the cache to find whether there is a corresponding rule, return a superMatch function, superMatch function is to do filtering function.

function

compile = Sizzle.compile = function(selector, match) {
    var i,
        setMatcher = [],
        elementMatchers = [],
        cached = compilerCache[selector + ' '];
    if(! cache) {if(! match) { match = tokenize(selector); } i = match.length;// Note: match is the entire two-dimensional array, the entire selection group, so loop only once
        while(i--) {
            cached = matcherFromTokens(match[i]);
            // In the case of complex selectors, pseudo-class functions are marked
            if (cached[expando]) {
                setMatchers.push(cached);
            } else{ elementMatchers.push(cached); }}}/ / cache
    cache = compilerCache(
        selector,
        // This function returns the superMatch function
        matcherFromGroupMatchers(elementMatchers, setMatchers)
    )
}
Copy the code

matcherFromTokens

MatcherFromTokens generates rules through tokens. The process is as follows. It creates an array of Matchers and creates a baseMathcer function, which is usually true. After traversing the entire token, as long as no relational operator is encountered, the corresponding filter function is pushed into matchers; When relational operators are encountered, all filter functions already in matchers will be wrapped in elementMatcher, and then addCombinator will be used as a link to return a function, instead of the previous matchers. So loop until the entire token traversal is completed. The main function of addCombinator is to find siblings and parents based on relational operators.

I’m going to put matcherFromTokens, elementMatcher, addCombinator down here.

function

function matcherFromTokens(tokens) {
    var checkContext, matcher, j,
        len = tokens.length,
        // Check whether it starts with a relational operator
        leadingRelative = Expr.relative[ tokens[0].type ],
        // If it does not start with a relation, the default is the parent set
        implicitRelative = leadingRelative || Expr.relative[' '],
        i = leadingRelative ? 1 : 0.// This is baseMatcher
        // Fn as an argument in addCombinator is a filter
        matchContext = addCombinator(function(elem) {
            return elem === checkContext;
        }, implicitRelative, true),
        matchAnyContext = addCombinator(function(elem) {
            return indexOf(checkContext, elem) > - 1;
        }, implicitRelative, true),
        
        // This is the final set of rules, which first puts baseMatcher in the set
        // In general (! leadingRelative && (xml || context ! == outermostContext)) returns true and does not execute the following function
        matchers = [ function(elem, context, xml) {
            varret = (! leadingRelative && (xml || context ! == outermostContext)) || ( ( checkContext = context ).nodeType ? matchContext(elem, context, xml) : matchAnyContext(elem, context, xml) ); checkContext =null;
            returnret; }];// Pass tokens forward
    for (; i < len; i++) {
        // If it is a relation
        if ((matcher = Expr.relative[tokens[i].type])) {
            // Wrap the existing rules in elementMatcher and create the association with addCombinator;
            // The new matcher is generated to replace all the original matcher
            matchers = [addCombinator(elementMatcher(matchers), matcher)];
        TAG ATTR PESUDO ID CLASS CHILD
        } else {
            matcher = Expr.filter[tokens[i].type].apply(null, toekns[i].matches);
            
            // If it is a pseudo-class, I have tried many selectors here but none of them get into the if
            // It feels like a very complex selector
            // Because I haven't tried it out yet, I don't know what it is
            if (matcher[expando]) {
                j = ++i;
                for (; j < len; j++) {
                    if (Expr.relative[tokens[j].type]) {
                        break; }}return setMatcher(
                    i > 1 && elementMatcher(matchers),
                    i > 1 && toSelector(
                    tokens
                        .slice(0, i - 1)
                        .concat({value: tokens[i - 2].type === ' ' ? The '*' : ' '})
                    ).replace(rtrim, '$1'), matcher, i < j && matcherFromTokens(tokens.slice(i, j)), j < len && matcherFromTokens((tokens = tokens.slice(j))), j < len && toSelector(tokens) ); } matchers.push(matcher); }}// Finally wrap it in elementMatcher and return a function
    return elementMatcher(matchers);
}

function addCombinator(matcher, combinator, base) {
    var dir = combinator.dir,
        skip = combinator.next,
        key = skip || dir,
        checkNonElements = base && key === 'parentNode',
        doneName = done++;
    // If it is > +, it is > +
    return combinator.first ?
        
    // Check the nearest parent or sibling element
    function(elem, context, xml) {
        // The elem of the while loop is continuously assigned
        // That's why it's called a bond
        // After this loop, the new elements found are placed in the subsequent Matcher, which forms the logic of looking up through the seed level
        while(elem = elem[dir]) {
            // When an element node is encountered
            if (elem.nodeType === 1 || checkNonElements) {
                returnmatcher(elem, context, xml); }}return false;
    } :
    
    // Check all parent or sibling elements
    function(elem, context, xml) {
        var oldCahce, uniqueCache, outerCache,
            newCache = [dirruns, doneName];
        
        if (xml) {
            while ((elem = elem[dir])) {
                if (elem.nodeType === 1 || checkNonElments) {
                    if (matcher(elem, context, xml)) {
                        return true; }}}}else {
            while ((elem = elem[dir])) {
                // This block is cached
                // Cache is the most confusing part
                // I don't understand this part. If someone understands this part, please let me know
                / / crab crab
                if (elem.nodeType === 1 || checkNonElements) {
                    outerCache = elem[expando] || (elem[expando] = {});
                    uniqueCache = outerCache[elem.uniqueID] || 
                        (outerCache[elem.uniqueID] = {} );
                    if (skip && skip === elem.nodeName.toLowerCase()) {
                        elem = elem[dir] || elem;
                    } else if ( (oldCache = uniqueCache[key]) &&
                        oldCache[0] === dirruns && oldCache[1] === doneName) {
                        return (newCache[2] = oldCache[2]);
                    } else {
                        // The above two ifs should both take values from the cache
                        uniqueCache[key] = newCache;
                        if ( (newCache[2] = matcher(elem, context, xml)) ) {
                            return true
                        }
                    }
                }
            }
        }
        return false; }}// The method is to knead a bunch of matacher into one
function elementMatcher(matchers) {
    return matchers.length > 1 ?
        function(elem, context, xml) {
            var i = matchers.length;
            // Note that the word "I" means "I"
            // This is like peeling an onion, judging the rules one by one
            while (i--) {
                // Return false if one of them is not satisfied
                if(! matchers[i](elem, context, xml)) {return false;
                }
                return true;
            }
        } : 
        matchers[0];
}
Copy the code

The flow chart

matcherFromGroupMatchers

After running matcherFromTokens, we went back to see compile again. When all matcherFromTokens of Compile were run, there was only one thing left to return to do cache and return matcherFromGroupMatchers. The matcherFromGroupMatchers function returns the superMatcher function, which is used to traverse the seed, filtering the seed through the rules obtained by the previous matcherFromTokes run.

function

function matcherFromGroupsMatchers(elementMatchers, setMatchers) {
    var bySet = setMatchers.length > 0,
        byElement = elementMatchers.length > 0,
        superMatcher = function(seed, context, xml, results, outermost) {
            var elem, j, matcher,
                matchedCount = 0,
                i = '0',
                unmatched = seed && [],
                setMatched = [],
                contextBackup = outermostContext,
                // If there is no seed then use all the elements of the document as the seed
                elems = seed || byElement && Expr.find["TAG"] (The '*', outermost),
                / / cache
                dirrunsUnique = (dirruns += contextBackup == null ? 1 : Math.random() || 0.1),
                len = elems.length;
            
            if (outermost) {
                // This outermostContext will be used as a judgment during baseMatcher
                outermostContext = context == document || context || outermost;
            }
            
            for(; i ! == len && (elem = elems[i] ! =null); i++) {
                if(byElement && elem) {
                    j = 0;
                    if(! context && elem.ownerDoucment ! =document) { setDocumet(elem); xml = ! documentIsHtml; }// There will be more than one case of elementMatches
                    // The current element can be pushed to the result set as long as a set of rules are met
                    // Our example has only one set of rules
                    while ( (matcher = elementMatches[j++]) ) {
                        if (matcher(elem, context || document, xml)) {
                            results.push(elem);
                            break; }}/ / cache
                    if(outermost) { dirruns = dirrunsUnique; }}// The elements that are not matched
                if (bySet) {
                    if((elem = ! matcher && elem)) { matchedCount--; }if (seed) {
                        unmatched.push(elem);
                    }
                }
            }
            matchedCount += i;
            
            // I don't understand the logic here, because none of the examples I've tried have gone this far
            // This should also be the case for complex selectors. I tried nesting not(:not), and it didn't go anywhere
            // I hope someone who understands it can explain it
            
            // If there is no for loop, then I is the string '0' and matchedCount is the number 0
            // matchedCount will be -- it's possible that matchedCount will be unequal even if the for loop goes through
            if(bySet && i ! == matchedCount) { j =0;
                while((matcher = setMatchers[j++])) {
                    matcher(unmatched, setMatched, context, xml);
                }
                if (seed) {
                    if (matchedCount > 0) {
                        while (i--) {
                            if ( !(unmatched[i] | setMatched[i]) ) {
                                setMatched[i] = pop.call(results);
                            }
                        }
                    }
                    setMatched = condense(setMatched);
                }
                push.apply(results, setMatched);
                
                if(outermost && ! seed && setMatched.length >0 &&
                    (matchedCount + setMatchers.length) > - 1) {
                        / / sortingSizzle.uniqueSort(results); }}if (outermost) {
                dirruns = dirrunsUnique;
                outermoustContext = contextBackup;
            }
            // In the select function, return the result in unmatched, but results is the final result
            return unmatched;
        }
    return bySet ?
        // marFunction is an expando marker for an argument function
        markFunction(superMatcher) :
        superMatcher;
}
Copy the code

conclusion

I watched the Sizzle for about 2 months, and I have read all the general procedures before 2020, which is the New Year. In this last search, the Sizzle uses a lot of closures, a lot of currying, just to make sure that all the filter parameters are elem, context, XML. This was the first library I read, and I really gained a lot after reading it. At the beginning, I wanted to finish reading The Sizzle on a whim because I read The big book by Stuart. During the period, I thought it was too difficult and wanted to give up, but finally I stumbled and finally read it. This time, I finished reading the JavaScript framework design, and then I finished the jquery source code, and then I finished the New Year. Hahahaha.