usage
Lodash’s split method takes three arguments:
- The first argument is the string to split
- The second argument is the separator to split on
- The third argument is the limit, the maximum number of elements to keep in the result
split('a-b-c', '-', 2)
// => ['a', 'b']
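To make the role of the third argument concrete, here is a small sketch using the native String.prototype.split, which the lodash function delegates to for ordinary separators. It assumes nothing beyond standard JavaScript, and the variable names are purely illustrative.

```javascript
// The limit caps the number of array elements returned, not characters.
const input = 'a-b-c'

const firstTwo = input.split('-', 2) // ['a', 'b']
const all = input.split('-')         // ['a', 'b', 'c']
const none = input.split('-', 0)     // []

console.log(firstTwo, all, none)
```

The remaining pieces beyond the limit are simply dropped; they are not appended to the last element.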
analysis
Let’s look at the function’s entry point first:
function split(string, separator, limit) {
  limit = limit === undefined ? MAX_ARRAY_LENGTH : limit >>> 0
  if (!limit) {
    return []
  }
  if (string && (
    typeof separator === 'string' ||
    (separator != null && !isRegExp(separator))
  )) {
    if (!separator && hasUnicode(string)) {
      return castSlice(stringToArray(string), 0, limit)
    }
  }
  return string.split(separator, limit)
}
First, the limit argument is normalized. If it is not passed, it defaults to MAX_ARRAY_LENGTH, so all of the split pieces are kept. If it is passed, the unsigned right shift operator ‘>>>’ converts it to an unsigned 32-bit integer (non-negative, with no fractional part). If the result is 0, the function returns an empty array.
See this article for more on the unsigned right shift operator ‘>>>’
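The effect of ‘>>> 0’ is easier to see with a few inputs. The sketch below isolates the first line of split in a helper; the name normalizeLimit is hypothetical and used only for illustration, and the constant is written out as 2^32 - 1, the largest array length.

```javascript
// Largest possible array length, 2^32 - 1.
const MAX_ARRAY_LENGTH = 4294967295

// Hypothetical helper that mirrors the first line of split.
function normalizeLimit(limit) {
  return limit === undefined ? MAX_ARRAY_LENGTH : limit >>> 0
}

console.log(normalizeLimit(undefined)) // 4294967295
console.log(normalizeLimit(2.7))       // 2 (fraction dropped)
console.log(normalizeLimit('3'))       // 3 (numeric string coerced)
console.log(normalizeLimit(NaN))       // 0
console.log(normalizeLimit(null))      // 0
console.log(normalizeLimit(-1))        // 4294967295 (wraps around)
```

Note that a negative limit wraps around to a very large value, so in practice it behaves like no limit at all, while NaN and null collapse to 0 and produce an empty array.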
Then, the first if checks that the string passed in is non-empty and that the delimiter is either of type string, or is not null and not a RegExp. If this condition fails, the function falls through and calls the string’s native split method directly.
The second if checks whether the delimiter is falsy (for example, an empty string) and whether the string contains special symbols, which are detected through their Unicode ranges. If both are true, the string is split into an array of symbols and trimmed to limit. Here’s the hasUnicode method:
const rsAstralRange = '\\ud800-\\udfff'
const rsComboMarksRange = '\\u0300-\\u036f'
const reComboHalfMarksRange = '\\ufe20-\\ufe2f'
const rsComboSymbolsRange = '\\u20d0-\\u20ff'
const rsComboMarksExtendedRange = '\\u1ab0-\\u1aff'
const rsComboMarksSupplementRange = '\\u1dc0-\\u1dff'
const rsComboRange = rsComboMarksRange + reComboHalfMarksRange + rsComboSymbolsRange + rsComboMarksExtendedRange + rsComboMarksSupplementRange
const rsVarRange = '\\ufe0e\\ufe0f'
const rsZWJ = '\\u200d'
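// Matches a zero-width joiner, a surrogate (astral) code unit, a combining mark, or a variation selector
const reHasUnicode = RegExp(`[${rsZWJ + rsAstralRange + rsComboRange + rsVarRange}]`)
function hasUnicode(string) {
  return reHasUnicode.test(string)
}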
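Why this check matters can be shown with a short, self-contained sketch. The regular expression below is assembled from the same ranges listed above; the name reHasUnicodeSketch is a stand-in chosen for this example rather than a lodash identifier.

```javascript
// Character class built from the ranges above: zero-width joiner,
// surrogates, combining marks, and variation selectors.
const reHasUnicodeSketch = new RegExp(
  '[\\u200d\\ud800-\\udfff\\u0300-\\u036f\\ufe20-\\ufe2f' +
  '\\u20d0-\\u20ff\\u1ab0-\\u1aff\\u1dc0-\\u1dff\\ufe0e\\ufe0f]'
)

const plain = 'abc'
const emoji = 'a\u{1F600}'  // U+1F600 is stored as two UTF-16 code units
const accented = 'e\u0301'  // 'e' followed by a combining acute accent

console.log(reHasUnicodeSketch.test(plain))    // false
console.log(reHasUnicodeSketch.test(emoji))    // true
console.log(reHasUnicodeSketch.test(accented)) // true

// Native split with an empty separator works on code units,
// so the emoji is broken into two lone surrogates.
console.log(emoji.split('').length)            // 3
```

The last line shows the problem being avoided: a two-symbol string comes back as three elements when the native method is used with an empty separator.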
Next we look at the castSlice method, which slices an array only when it needs to. The array passed in is the string already split into individual symbols.
function castSlice(array, start, end) {
  const { length } = array
  end = end === undefined ? length : end
  return (!start && end >= length) ? array : slice(array, start, end)
}
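A quick way to see what castSlice does is to run it on a small array. In this sketch the native Array.prototype.slice stands in for the slice helper that lodash imports internally, which is an assumption made to keep the example self-contained.

```javascript
// Sketch of castSlice; array.slice replaces lodash's internal slice helper.
function castSlice(array, start, end) {
  const { length } = array
  end = end === undefined ? length : end
  // Nothing to trim: return the original array without copying.
  return (!start && end >= length) ? array : array.slice(start, end)
}

const chars = ['a', 'b', 'c']
console.log(castSlice(chars, 0, 2))            // ['a', 'b']
console.log(castSlice(chars, 0, 10) === chars) // true (same array, no copy)
console.log(castSlice(chars, 1))               // ['b', 'c']
```

When the requested range already covers the whole array, the original array is returned as-is, which avoids an unnecessary copy in the common case where limit is not passed.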
Next we look at the stringToArray method. It also checks for special symbols: if any are present, it uses the unicodeToArray method, whose reUnicode regular expression is very complex in order to cover all symbols. Otherwise it uses asciiToArray, which calls the native split method with an empty-string separator.
function stringToArray(string) {
  return hasUnicode(string)
    ? unicodeToArray(string)
    : asciiToArray(string)
}
function unicodeToArray(string) {
  return string.match(reUnicode) || []
}
function asciiToArray(string) {
  // Split on the empty string to get one element per character
  return string.split('')
}
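To tie the pieces together, here is a deliberately simplified sketch of the empty-separator path. The function splitChars is hypothetical, and Array.from (which iterates by code point) is swapped in for unicodeToArray; unlike the reUnicode expression, it does not keep combining marks or zero-width-joiner sequences attached to their base symbol.

```javascript
// Hypothetical simplified version of the empty-separator path.
// Array.from iterates by code point, standing in for unicodeToArray.
function splitChars(string, limit) {
  const max = limit === undefined ? 4294967295 : limit >>> 0
  if (!max) {
    return []
  }
  return Array.from(string).slice(0, max)
}

const text = 'a\u{1F600}b'

const viaSketch = splitChars(text, 2) // ['a', '\u{1F600}'] (emoji kept whole)
const viaNative = text.split('', 2)   // ['a', '\ud83d'] (a lone surrogate)

console.log(viaSketch, viaNative)
```

The comparison shows what the extra branch buys: the native call cuts the emoji in half, while splitting by symbol keeps it intact.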
conclusion
Lodash’s split method adds several edge-case checks on top of the native split method, and the helper functions that perform these checks are reused by many other lodash methods.