usage

Lodash’s split method takes three arguments:

  • The first argument is the string to split
  • The second argument is the separator pattern to split by
  • The third argument is the limit: the maximum number of elements kept in the result
 _.split('a-b-c', '-', 2)
 // => ['a', 'b']
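
For plain input like this, the function ends up calling the native split method with the same limit, so the effect of the third argument can be checked without lodash installed. A minimal sketch, using only String.prototype.split:

```javascript
// For plain ASCII input, _.split falls through to the native method,
// so the native method is used here to show what the limit does.
const parts = 'a-b-c'.split('-')
const limited = 'a-b-c'.split('-', 2)
const none = 'a-b-c'.split('-', 0)

console.log(parts)   // [ 'a', 'b', 'c' ]
console.log(limited) // [ 'a', 'b' ]
console.log(none)    // []
```

The limit counts elements of the result array, not characters of the input.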

parsing

Let’s look at its function entry first

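function split(string, separator, limit) {
  // Default to "no limit"; otherwise coerce to an unsigned 32-bit integer
  limit = limit === undefined ? MAX_ARRAY_LENGTH : limit >>> 0
  if (!limit) {
    return []
  }
  if (string && (
    typeof separator === 'string' ||
    (separator != null && !isRegExp(separator))
  )) {
    // Empty separator plus special characters: split per symbol, not per code unit
    if (!separator && hasUnicode(string)) {
      return castSlice(stringToArray(string), 0, limit)
    }
  }
  // Everything else is handled by the native method
  return string.split(separator, limit)
}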

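First, the passed limit is normalized. If it is not passed (undefined), it defaults to MAX_ARRAY_LENGTH, so every piece of the split string is kept. If it is passed, the unsigned right shift ‘>>> 0’ coerces it to an unsigned 32-bit integer: fractions are truncated, values that are not numbers become 0, and negative numbers wrap around to large positive ones. If the resulting limit is 0, an empty array is returned immediately.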

See this article about the unsigned shift operator ‘>>>’
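
The coercion can be tried in isolation. In the sketch below, normalizeLimit is a hypothetical name for the first line of split, not a lodash function, and MAX_ARRAY_LENGTH is set to 2^32 - 1, which is assumed to match the lodash constant:

```javascript
// Assumed value of lodash's MAX_ARRAY_LENGTH: 2 ** 32 - 1
const MAX_ARRAY_LENGTH = 4294967295

// Hypothetical helper that mirrors the first line of split()
function normalizeLimit(limit) {
  return limit === undefined ? MAX_ARRAY_LENGTH : limit >>> 0
}

console.log(normalizeLimit(undefined)) // 4294967295
console.log(normalizeLimit(2.9))       // 2
console.log(normalizeLimit('3'))       // 3
console.log(normalizeLimit(-1))        // 4294967295
console.log(normalizeLimit(NaN))       // 0
console.log(normalizeLimit(null))      // 0
```

Note that a negative limit does not produce an empty array: it wraps around to a very large number and so behaves like no limit at all.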

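Then the first if checks that the string is non-empty and that the delimiter is either a string or some other non-null value that is not a RegExp. If this condition fails (for example, the delimiter is a RegExp or undefined), or if the inner check described next does not match, the string’s native split method is called directly.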
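
To make the condition easier to read, it can be pulled out into a predicate. This is only a sketch: entersSpecialBranch is a hypothetical name, and isRegExp here is a simple instanceof stand-in for the more thorough lodash helper.

```javascript
// Stand-in for lodash's isRegExp (assumption: instanceof is enough here)
const isRegExp = (value) => value instanceof RegExp

// Hypothetical name; mirrors the outer `if` condition of split()
function entersSpecialBranch(string, separator) {
  return Boolean(string && (
    typeof separator === 'string' ||
    (separator != null && !isRegExp(separator))
  ))
}

console.log(entersSpecialBranch('abc', ''))        // true
console.log(entersSpecialBranch('abc', '-'))       // true
console.log(entersSpecialBranch('abc', 0))         // true: non-null and not a RegExp
console.log(entersSpecialBranch('abc', /-/))       // false
console.log(entersSpecialBranch('abc', undefined)) // false
console.log(entersSpecialBranch('', ''))           // false: empty string
```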

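The second if checks whether the delimiter is empty (a falsy value such as '') and whether the string contains characters that the native split would break apart: surrogate pairs, combining marks, variation selectors, or zero-width joiners. Here’s the hasUnicode method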

// Used to compose unicode character classes
const rsAstralRange = '\\ud800-\\udfff'
const rsComboMarksRange = '\\u0300-\\u036f'
const reComboHalfMarksRange = '\\ufe20-\\ufe2f'
const rsComboSymbolsRange = '\\u20d0-\\u20ff'
const rsComboMarksExtendedRange = '\\u1ab0-\\u1aff'
const rsComboMarksSupplementRange = '\\u1dc0-\\u1dff'
const rsComboRange = rsComboMarksRange + reComboHalfMarksRange + rsComboSymbolsRange + rsComboMarksExtendedRange + rsComboMarksSupplementRange
const rsVarRange = '\\ufe0e\\ufe0f'

// Zero-width joiner
const rsZWJ = '\\u200d'

// Detects zero-width joiners, surrogates, combining marks and variation selectors
const reHasUnicode = RegExp(`[${rsZWJ + rsAstralRange + rsComboRange + rsVarRange}]`)

function hasUnicode(string) {
  return reHasUnicode.test(string)
}
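
A reduced version of this check shows what kind of input it flags. The sketch below is an assumption-laden simplification: reHasUnicodeSketch keeps only the joiner, the surrogate range, one combining-mark range and the variation selectors, and both names are hypothetical.

```javascript
// Simplified stand-in for reHasUnicode (fewer ranges than the original)
const reHasUnicodeSketch = /[\u200d\ud800-\udfff\u0300-\u036f\ufe0e\ufe0f]/

const hasUnicodeSketch = (string) => reHasUnicodeSketch.test(string)

console.log(hasUnicodeSketch('abc'))                // false
console.log(hasUnicodeSketch('h\u00e9llo'))         // false: precomposed é is a single code unit
console.log(hasUnicodeSketch('he\u0301llo'))        // true: e followed by a combining accent
console.log(hasUnicodeSketch('a\ud83d\ude00b'))     // true: emoji stored as a surrogate pair
```

The check does not flag every non-ASCII character, only those that span more than one code unit or attach to a neighbouring character.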


Next we look at the castSlice method, which slices an array only when it needs to. The array passed in is the string already split into individual characters.

function castSlice(array, start, end) {
  const { length } = array
  end = end === undefined ? length : end
  return (!start && end >= length) ? array : slice(array, start, end)
}
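
The point of castSlice is to skip the copy when the requested range already covers the whole array. A minimal sketch, assuming Array.prototype.slice can stand in for lodash’s internal slice helper (castSliceSketch is a hypothetical name):

```javascript
// Same logic as castSlice, with the native slice swapped in
function castSliceSketch(array, start, end) {
  const { length } = array
  end = end === undefined ? length : end
  return (!start && end >= length) ? array : array.slice(start, end)
}

const chars = ['a', 'b', 'c']

console.log(castSliceSketch(chars, 0, 10) === chars) // true: original array returned, no copy
console.log(castSliceSketch(chars, 0, 2))            // [ 'a', 'b' ]
console.log(castSliceSketch(chars, 1))               // [ 'b', 'c' ]
```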

Next we look at the stringToArray method, which also checks for special symbols. If there are any, it uses the unicodeToArray method, where the reUnicode definition is very complex in order to cover all symbols. Otherwise it uses asciiToArray, which calls the native split method to break the string into individual characters.

function stringToArray(string) {
  return hasUnicode(string)
    ? unicodeToArray(string)
    : asciiToArray(string)
}

function unicodeToArray(string) {
  return string.match(reUnicode) || []
}

function asciiToArray(string) {
  return string.split('')
}
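
The difference between the two paths shows up as soon as the string contains an emoji. The sketch below does not use reUnicode; it swaps in Array.from, which splits by code point. That is an assumption made to keep the example short, and it is weaker than the lodash regex, since it does not keep combining marks or joiner sequences together.

```javascript
// U+1F600 written as its surrogate pair
const smile = '\ud83d\ude00'
const text = 'a' + smile + 'b'

// What asciiToArray would do: split per UTF-16 code unit
const byCodeUnit = text.split('')

// Stand-in for unicodeToArray: split per code point
const byCodePoint = Array.from(text)

console.log(byCodeUnit.length)        // 4: the emoji is torn into two halves
console.log(byCodePoint.length)       // 3
console.log(byCodePoint[1] === smile) // true

// Applying a limit of 2 afterwards, as castSlice does in split()
console.log(byCodePoint.slice(0, 2).length) // 2
```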

conclusion

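Lodash’s split method adds several edge-case checks on top of the native split method: it normalizes limit, and it splits strings containing special symbols per character instead of per code unit when the delimiter is empty. The helper functions behind these checks (hasUnicode, castSlice, stringToArray) are reused in many other lodash methods.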