The camelCase function converts a string to camel case (literally, "hump" style). This article covers the ASCII table, the Unicode table, regular expressions that match ASCII or Unicode characters, type conversion, a slice method that returns dense arrays, and so on.
ASCII
Unicode
Definitions
Unicode
Unicode is a catalog of the characters used to write text, each identified by a code point. ASCII is a subset of Unicode.
UTF-8
UTF-8 is an encoding: a way of converting a sequence of code points into bytes. All Unicode code points can be encoded in UTF-8. ASCII is also an encoding, but it supports only 128 characters.
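As a small illustration of the difference between a code point and its UTF-8 bytes, the sketch below counts the bytes needed by a few characters. It assumes the standard TextEncoder API is available (modern browsers and Node.js); the helper name utf8Bytes is made up for this example.

```javascript
// Hypothetical helper: returns the UTF-8 bytes of a string as a plain array.
const utf8Bytes = (text) => Array.from(new TextEncoder().encode(text))

console.log(utf8Bytes('A'))                // [65]: one byte, identical to ASCII
console.log(utf8Bytes('\u00e9'))           // [195, 169]: U+00E9 needs two bytes
console.log(utf8Bytes('\u20ac').length)    // 3: U+20AC (euro sign)
console.log(utf8Bytes('\u{1F600}').length) // 4: U+1F600 (an emoji)
```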
character
A character is a rather vague concept. Letters, digits, and punctuation marks are characters; basically, a character is an entry somewhere in the Unicode table.
glyph
A glyph is a visual representation of a symbol, provided by a font. It may represent a single character, multiple characters, or both.
The author of "Dark Corners of Unicode" raises an interesting point: JavaScript has no string type.
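One way to see what that remark means in practice: a JavaScript string is a sequence of UTF-16 code units, not of characters or glyphs. The snippet below is only an illustration; the variable names are arbitrary.

```javascript
const smiley = '\u{1F600}'   // one code point outside the Basic Multilingual Plane
const accented = 'e\u0301'   // "e" + COMBINING ACUTE ACCENT, drawn as a single glyph

console.log(smiley.length)                       // 2: two UTF-16 code units (a surrogate pair)
console.log([...smiley].length)                  // 1: string iteration walks code points
console.log(accented.length)                     // 2: two code points, one visible glyph
console.log(smiley.codePointAt(0).toString(16))  // '1f600'
console.log(smiley.charCodeAt(0).toString(16))   // 'd83d': only the high surrogate
```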
Related websites
Unicode website
Related online tools
Contains basically all types of Unicode codes
Type conversion
getTag
const toString = Object.prototype.toString
function getTag(value) {
  if (value == null) {
    // For compatibility with older JavaScript engines, handle null and undefined explicitly
    return value === undefined ? '[object Undefined]' : '[object Null]'
  }
  return toString.call(value)
}
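A short usage sketch may help here. It repeats the getTag definition so that it runs on its own (the alias objectToString is just a local name), and shows why a tag check is useful for wrapped symbols, whose typeof is 'object'.

```javascript
const objectToString = Object.prototype.toString

function getTag(value) {
  if (value == null) {
    return value === undefined ? '[object Undefined]' : '[object Null]'
  }
  return objectToString.call(value)
}

console.log(getTag(null))        // '[object Null]'
console.log(getTag(undefined))   // '[object Undefined]'
console.log(getTag([]))          // '[object Array]'
console.log(getTag(Symbol('s'))) // '[object Symbol]'

// A symbol wrapped in an object reports typeof 'object', but keeps its tag.
const wrapped = Object(Symbol('s'))
console.log(typeof wrapped)      // 'object'
console.log(getTag(wrapped))     // '[object Symbol]'
```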
isSymbol
function isSymbol(value) {
  const type = typeof value
  // Either a primitive symbol, or a symbol wrapped in an object
  return type == 'symbol' || (type === 'object' && value != null && getTag(value) == '[object Symbol]')
}
toString
Converts a value to a string. It special-cases -0 and converts arrays recursively (which may overflow the call stack for deeply nested arrays).
const INFINITY = 1 / 0
function toString(value) {
  if (value == null) {
    return ''
  }
  // Exit early for strings to avoid a performance hit in some environments.
  if (typeof value === 'string') {
    return value
  }
  if (Array.isArray(value)) {
    // Recursively convert values (susceptible to call stack limits).
    // null and undefined items are left as-is; every other item is converted by a recursive call
    return `${value.map((other) => other == null ? other : toString(other))}`
  }
  // Symbols are converted with their own toString method
  if (isSymbol(value)) {
    return value.toString()
  }
  // Preserve the sign of -0
  const result = `${value}`
  return (result == '0' && (1 / value) == -INFINITY) ? '-0' : result
}
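The special cases above come from how plain string coercion behaves. The following lines are only an illustration of those language rules, not part of lodash:

```javascript
console.log(`${-0}`)           // '0': ordinary coercion drops the sign
console.log(1 / -0)            // -Infinity: dividing by the value reveals the sign
console.log(`${[1, [2, 3]]}`)  // '1,2,3': arrays are joined with commas, nested ones too
console.log(`${[1, null, 3]}`) // '1,,3': null and undefined items become empty strings

// Implicit coercion of a symbol throws, so an explicit toString() call is needed.
let threw = false
try {
  String(`${Symbol('a')}`)
} catch (error) {
  threw = error instanceof TypeError
}
console.log(threw)                  // true
console.log(Symbol('a').toString()) // 'Symbol(a)'
```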
Matching ASCII and Unicode characters
unicodeWords
Splits a string containing Unicode characters into words
/** Used to compose unicode character classes. */
// Astral planes (surrogate code units)
const rsAstralRange = '\\ud800-\\udfff'
// https://www.unicode.org/charts/PDF/U0300.pdf
const rsComboMarksRange = '\\u0300-\\u036f'
// https://www.unicode.org/charts/PDF/UFE20.pdf
const reComboHalfMarksRange = '\\ufe20-\\ufe2f'
// https://www.unicode.org/charts/PDF/U20D0.pdf
const rsComboSymbolsRange = '\\u20d0-\\u20ff'
// https://www.unicode.org/charts/PDF/U1AB0.pdf
const rsComboMarksExtendedRange = '\\u1ab0-\\u1aff'
// https://www.unicode.org/charts/PDF/U1DC0.pdf
const rsComboMarksSupplementRange = '\\u1dc0-\\u1dff'
const rsComboRange = rsComboMarksRange + reComboHalfMarksRange + rsComboSymbolsRange + rsComboMarksExtendedRange + rsComboMarksSupplementRange
// https://www.unicode.org/charts/PDF/U2700.pdf
const rsDingbatRange = '\\u2700-\\u27bf'
// a-z 223-246 248-255
const rsLowerRange = 'a-z\\xdf-\\xf6\\xf8-\\xff'
// 172 (¬) 177 (±) 215 (×) 247 (÷)
const rsMathOpRange = '\\xac\\xb1\\xd7\\xf7'
// 0-47 58-64 91-96 123-191
const rsNonCharRange = '\\x00-\\x2f\\x3a-\\x40\\x5b-\\x60\\x7b-\\xbf'
// https://www.unicode.org/charts/PDF/U2000.pdf
const rsPunctuationRange = '\\u2000-\\u206f'
// \t \n \r \f 11 160
// feff https://www.unicode.org/charts/PDF/UFE70.pdf
// 1680 https://www.unicode.org/charts/PDF/U1680.pdf
// 180e https://www.unicode.org/charts/PDF/U1800.pdf
// 2000-200a 2028 2029 202f 205f https://www.unicode.org/charts/PDF/U2000.pdf
// 3000 https://www.unicode.org/charts/PDF/U3000.pdf
const rsSpaceRange = ' \\t\\x0b\\f\\xa0\\ufeff\\n\\r\\u2028\\u2029\\u1680\\u180e\\u2000\\u2001\\u2002\\u2003\\u2004\\u2005\\u2006\\u2007\\u2008\\u2009\\u200a\\u202f\\u205f\\u3000'
// A-Z 192-214 216-222
const rsUpperRange = 'A-Z\\xc0-\\xd6\\xd8-\\xde'
// fe0e fe0f https://www.unicode.org/charts/PDF/UFE00.pdf
const rsVarRange = '\\ufe0e\\ufe0f'
const rsBreakRange = rsMathOpRange + rsNonCharRange + rsPunctuationRange + rsSpaceRange
/** Used to compose unicode capture groups. */
// Matches the apostrophe ' and the right single quotation mark (U+2019), see https://www.unicode.org/charts/PDF/U2000.pdf
const rsApos = "['\u2019]"
// Matches math operators, non-word characters, punctuation, and the various spaces
const rsBreak = `[${rsBreakRange}]`
// See the URLs above
const rsCombo = `[${rsComboRange}]`
// Digits
const rsDigit = '\\d'
// See the URL above
const rsDingbat = `[${rsDingbatRange}]`
// Lowercase letters
const rsLower = `[${rsLowerRange}]`
// Matches any other character
const rsMisc = `[^${rsAstralRange}${rsBreakRange + rsDigit + rsDingbatRange + rsLowerRange + rsUpperRange}]`
// Fitzpatrick skin-tone modifiers for emoji: the high surrogate \ud83c
// followed by a low surrogate in \udffb-\udfff (both lie within \ud800-\udfff)
const rsFitz = '\\ud83c[\\udffb-\\udfff]'
// Modifiers
const rsModifier = `(?:${rsCombo}|${rsFitz})`
// Non-astral characters
const rsNonAstral = `[^${rsAstralRange}]`
// Regional indicator symbols (a pair forms a national flag)
const rsRegional = '(?:\\ud83c[\\udde6-\\uddff]){2}'
const rsSurrPair = '[\\ud800-\\udbff][\\udc00-\\udfff]'
// Uppercase letters
const rsUpper = `[${rsUpperRange}]`
// ZWJ https://www.unicode.org/charts/PDF/U2000.pdf ZERO WIDTH JOINER
const rsZWJ = '\\u200d'
/** Used to compose unicode regexes. */
const rsMiscLower = `(?:${rsLower}|${rsMisc})`
const rsMiscUpper = `(?:${rsUpper}|${rsMisc})`
const rsOptContrLower = `(?:${rsApos}(?:d|ll|m|re|s|t|ve))?`
const rsOptContrUpper = `(?:${rsApos}(?:D|LL|M|RE|S|T|VE))?`
const reOptMod = `${rsModifier}?`
const rsOptVar = `[${rsVarRange}]?`
const rsOptJoin = `(?:${rsZWJ}(?:${[rsNonAstral, rsRegional, rsSurrPair].join('|')})${rsOptVar + reOptMod})*`
const rsOrdLower = '\\d*(?:1st|2nd|3rd|(?![123])\\dth)(?=\\b|[A-Z_])'
const rsOrdUpper = '\\d*(?:1ST|2ND|3RD|(?![123])\\dTH)(?=\\b|[a-z_])'
const rsSeq = rsOptVar + reOptMod + rsOptJoin
const rsEmoji = `(?:${[rsDingbat, rsRegional, rsSurrPair].join('|')})${rsSeq}`
const reUnicodeWords = RegExp([
  `${rsUpper}?${rsLower}+${rsOptContrLower}(?=${[rsBreak, rsUpper, '$'].join('|')})`,
  `${rsMiscUpper}+${rsOptContrUpper}(?=${[rsBreak, rsUpper + rsMiscLower, '$'].join('|')})`,
  `${rsUpper}?${rsMiscLower}+${rsOptContrLower}`,
  `${rsUpper}+${rsOptContrUpper}`,
  rsOrdUpper,
  rsOrdLower,
  `${rsDigit}+`,
  rsEmoji
].join('|'), 'g')
/**
* Splits a Unicode `string` into an array of its words.
*
* @private
* @param {string} string The string to inspect.
* @returns {Array} Returns the words of `string`.
*/
function unicodeWords(string) {
return string.match(reUnicodeWords)
}
reUnicodeWords generates the following regular expression (graphical result):
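The full expression is hard to take in at a glance. As a rough, ASCII-only analogue (explicitly not lodash's regex), the hypothetical pattern below keeps three of its ideas: an optional capital followed by lowercase letters, a run of capitals that stops before the next capitalized word, and a run of digits. The names reSimpleWords and simpleWords are invented for this sketch.

```javascript
// Simplified sketch: ASCII letters and digits only, no contractions, ordinals, or emoji.
const reSimpleWords = /[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+/g

const simpleWords = (string) => string.match(reSimpleWords) || []

console.log(simpleWords('fooBarHTMLParser42'))
// => ['foo', 'Bar', 'HTML', 'Parser', '42']
console.log(simpleWords('XMLHttpRequest'))
// => ['XML', 'Http', 'Request']
```

The negative lookahead is what lets an acronym such as HTML give up its last capital to the word that follows it.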
hasUnicode
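Checks whether a string contains Unicode symbols that need special handling: zero-width joiners, surrogate code units (astral planes), combining marks, or variation selectors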
/** Used to compose unicode character classes. */
// Astral planes (surrogate code units)
const rsAstralRange = '\\ud800-\\udfff'
// https://www.unicode.org/charts/PDF/U0300.pdf
const rsComboMarksRange = '\\u0300-\\u036f'
// https://www.unicode.org/charts/PDF/UFE20.pdf
const reComboHalfMarksRange = '\\ufe20-\\ufe2f'
// https://www.unicode.org/charts/PDF/U20D0.pdf
const rsComboSymbolsRange = '\\u20d0-\\u20ff'
// https://www.unicode.org/charts/PDF/U1AB0.pdf
const rsComboMarksExtendedRange = '\\u1ab0-\\u1aff'
// https://www.unicode.org/charts/PDF/U1DC0.pdf
const rsComboMarksSupplementRange = '\\u1dc0-\\u1dff'
const rsComboRange = rsComboMarksRange + reComboHalfMarksRange + rsComboSymbolsRange + rsComboMarksExtendedRange + rsComboMarksSupplementRange
// fe0e fe0f https://www.unicode.org/charts/PDF/UFE00.pdf
const rsVarRange = '\\ufe0e\\ufe0f'
/** Used to compose unicode capture groups. */
// ZWJ https://www.unicode.org/charts/PDF/U2000.pdf ZERO WIDTH JOINER
const rsZWJ = '\\u200d'
/** Used to detect strings with [zero-width joiners or code points from the astral planes](http://eev.ee/blog/2015/09/12/dark-corners-of-unicode/). */
const reHasUnicode = RegExp(`[${rsZWJ + rsAstralRange + rsComboRange + rsVarRange}]`)
/**
* Checks if `string` contains Unicode symbols.
*
* @private
* @param {string} string The string to inspect.
* @returns {boolean} Returns `true` if a symbol is found, else `false`.
*/
function hasUnicode(string) {
return reHasUnicode.test(string)
}
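To make the check concrete, here is the same character class written out as a single literal, with a few inputs. The name reNeedsUnicodeSplit is used only in this sketch.

```javascript
// ZWJ, surrogates, the five combining-mark ranges, and the two variation selectors.
const reNeedsUnicodeSplit = /[\u200d\ud800-\udfff\u0300-\u036f\ufe20-\ufe2f\u20d0-\u20ff\u1ab0-\u1aff\u1dc0-\u1dff\ufe0e\ufe0f]/

console.log(reNeedsUnicodeSplit.test('abc'))       // false: plain ASCII
console.log(reNeedsUnicodeSplit.test('\u00e9'))    // false: precomposed "e with acute" is a single code unit
console.log(reNeedsUnicodeSplit.test('e\u0301'))   // true: contains a combining mark
console.log(reNeedsUnicodeSplit.test('\u{1F600}')) // true: stored as a surrogate pair
```

Note that a non-ASCII character alone does not trigger the check; only strings that cannot be split safely one code unit at a time do.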
asciiWords
Matches words made of characters outside the excluded ASCII ranges; compare the ranges below with the ASCII table
/** Used to match words composed of alphanumeric characters. */
// Matches alphanumeric words, excluding control characters, spaces, and punctuation
// Excluded: 0-47 58-64 91-96 123-127
const reAsciiWord = /[^\x00-\x2f\x3a-\x40\x5b-\x60\x7b-\x7f]+/g
function asciiWords(string) {
return string.match(reAsciiWord)
}
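A few sample inputs show what this pattern does and does not split. The block repeats the regular expression so it can run by itself.

```javascript
const reAsciiWord = /[^\x00-\x2f\x3a-\x40\x5b-\x60\x7b-\x7f]+/g

console.log('fred, barney, & pebbles'.match(reAsciiWord))
// => ['fred', 'barney', 'pebbles']
console.log('foo_bar-baz 42'.match(reAsciiWord))
// => ['foo', 'bar', 'baz', '42']  ("_" is 0x5f and "-" is 0x2d, both excluded)
console.log('fooBar'.match(reAsciiWord))
// => ['fooBar']  (case changes are not word boundaries here)
console.log('---'.match(reAsciiWord))
// => null
```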
asciiToArray
Convert a string of ASCII characters to an array
/**
* Converts an ASCII `string` to an array.
*
* @private
* @param {string} string The string to convert.
* @returns {Array} Returns the converted array.
*/
function asciiToArray(string) {
  // Split on the empty string, one item per UTF-16 code unit
  return string.split('')
}
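Splitting on the empty string works per UTF-16 code unit, which is fine for ASCII but breaks characters stored as surrogate pairs. A brief illustration (variable names are arbitrary):

```javascript
console.log('abc'.split(''))  // ['a', 'b', 'c']

// An emoji outside the BMP is torn into its two surrogate halves.
const halves = '\u{1F600}'.split('')
console.log(halves.length)            // 2
console.log(halves[0] === '\ud83d')   // true: lone high surrogate
console.log(halves[1] === '\ude00')   // true: lone low surrogate
```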
unicodeToArray
Converts a string of Unicode characters to an array
/** Used to compose unicode character classes. */
// Astral planes (surrogate code units)
const rsAstralRange = '\\ud800-\\udfff'
// https://www.unicode.org/charts/PDF/U0300.pdf
const rsComboMarksRange = '\\u0300-\\u036f'
// https://www.unicode.org/charts/PDF/UFE20.pdf
const reComboHalfMarksRange = '\\ufe20-\\ufe2f'
// https://www.unicode.org/charts/PDF/U20D0.pdf
const rsComboSymbolsRange = '\\u20d0-\\u20ff'
// https://www.unicode.org/charts/PDF/U1AB0.pdf
const rsComboMarksExtendedRange = '\\u1ab0-\\u1aff'
// https://www.unicode.org/charts/PDF/U1DC0.pdf
const rsComboMarksSupplementRange = '\\u1dc0-\\u1dff'
const rsComboRange = rsComboMarksRange + reComboHalfMarksRange + rsComboSymbolsRange + rsComboMarksExtendedRange + rsComboMarksSupplementRange
const rsVarRange = '\\ufe0e\\ufe0f'
/** Used to compose unicode capture groups. */
const rsAstral = `[${rsAstralRange}]`
const rsCombo = `[${rsComboRange}]`
// Fitzpatrick skin-tone modifiers for emoji
const rsFitz = '\\ud83c[\\udffb-\\udfff]'
// Modifiers
const rsModifier = `(?:${rsCombo}|${rsFitz})`
// Non-astral characters
const rsNonAstral = `[^${rsAstralRange}]`
// Regional indicator symbols (a pair forms a national flag)
const rsRegional = '(?:\\ud83c[\\udde6-\\uddff]){2}'
// High Surrogate Area https://www.unicode.org/charts/PDF/UD800.pdf
// Low Surrogate Area https://www.unicode.org/charts/PDF/UDC00.pdf
const rsSurrPair = '[\\ud800-\\udbff][\\udc00-\\udfff]'
// ZWJ https://www.unicode.org/charts/PDF/U2000.pdf ZERO WIDTH JOINER
const rsZWJ = '\\u200d'
/** Used to compose unicode regexes. */
// Build the matching regular expression
const reOptMod = `${rsModifier}?`
const rsOptVar = `[${rsVarRange}]?`
const rsOptJoin = `(?:${rsZWJ}(?:${[rsNonAstral, rsRegional, rsSurrPair].join('|')})${rsOptVar + reOptMod})*`
const rsSeq = rsOptVar + reOptMod + rsOptJoin
const rsNonAstralCombo = `${rsNonAstral}${rsCombo}?`
const rsSymbol = `(?:${[rsNonAstralCombo, rsCombo, rsRegional, rsSurrPair, rsAstral].join('|')})`
/** Used to match [string symbols](https://mathiasbynens.be/notes/javascript-unicode). */
const reUnicode = RegExp(`${rsFitz}(?=${rsFitz})|${rsSymbol + rsSeq}`, 'g')
/**
* Converts a Unicode `string` to an array.
*
* @private
* @param {string} string The string to convert.
* @returns {Array} Returns the converted array.
*/
function unicodeToArray(string) {
return string.match(reUnicode) || []
}
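To see why the pattern bothers with combining marks, regional indicators, and ZWJ sequences, it helps to look at what the simpler built-in Array.from does. It splits by code point, which keeps surrogate pairs together but still breaks apart sequences that readers perceive as one symbol. This is an illustration of the motivation, not a replacement for the regex above.

```javascript
console.log(Array.from('a\u{1F600}').length) // 2: the surrogate pair stays together

console.log(Array.from('e\u0301').length)    // 2: letter and combining accent are separated

// Flag of Japan: two regional indicator symbols (J + P).
const flag = '\u{1F1EF}\u{1F1F5}'
console.log(Array.from(flag).length)         // 2

// Family emoji: man + ZWJ + woman + ZWJ + girl.
const family = '\u{1F468}\u200d\u{1F469}\u200d\u{1F467}'
console.log(Array.from(family).length)       // 5
```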
Utility methods
stringToArray
Converts a string to an array of its characters. It depends on the hasUnicode, asciiToArray, and unicodeToArray functions above.
import asciiToArray from './asciiToArray.js'
import hasUnicode from './hasUnicode.js'
import unicodeToArray from './unicodeToArray.js'
/**
* Converts `string` to an array.
*
* @private
* @param {string} string The string to convert.
* @returns {Array} Returns the converted array.
*/
function stringToArray(string) {
return hasUnicode(string)
? unicodeToArray(string)
: asciiToArray(string)
}
slice
A reimplementation of Array.prototype.slice that always returns a dense array.
/**
* Creates a slice of `array` from `start` up to, but not including, `end`.
*
* **Note:** This method is used instead of
* [`Array#slice`](https://mdn.io/Array/slice) to ensure dense arrays are
* returned.
*
* @since 3.0.0
* @category Array
* @param {Array} array The array to slice.
* @param {number} [start=0] The start position. A negative index will be treated as an offset from the end.
* @param {number} [end=array.length] The end position. A negative index will be treated as an offset from the end.
* @returns {Array} Returns the slice of `array`.
* @example
*
* var array = [1, 2, 3, 4]
*
* _.slice(array, 2)
* // => [3, 4]
*/
function slice(array, start, end) {
  let length = array == null ? 0 : array.length
  // An empty or missing array returns an empty array
  if (!length) {
    return []
  }
  // start defaults to 0 when it is null or undefined
  start = start == null ? 0 : start
  // end defaults to the array length when it is undefined
  end = end === undefined ? length : end
  // A negative start counts back from the end of the array
  if (start < 0) {
    start = -start > length ? 0 : (length + start)
  }
  // end cannot exceed the length of the array
  end = end > length ? length : end
  // A negative end counts back from the end of the array
  if (end < 0) {
    end += length
  }
  // The result is empty when start is past end; otherwise
  // `>>> 0` coerces the length to an unsigned 32-bit integer
  length = start > end ? 0 : ((end - start) >>> 0)
  // Coerce start to an unsigned 32-bit integer as well
  start >>>= 0
  // Copy item by item so that the returned array is dense
  let index = -1
  const result = new Array(length)
  while (++index < length) {
    result[index] = array[index + start]
  }
  return result
}
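The difference between a sparse and a dense result is easy to miss, so here is a minimal sketch. denseCopy is a hypothetical stand-in for the copy loop above, and the last three lines show what the unsigned right shift does to non-integer input.

```javascript
const sparse = [1, , 3]            // index 1 is a hole, not a stored undefined

// The native method preserves the hole.
const viaNative = sparse.slice()
console.log(1 in viaNative)        // false

// Hypothetical helper: copying index by index stores an explicit undefined.
function denseCopy(array) {
  const result = new Array(array.length)
  for (let index = 0; index < array.length; index++) {
    result[index] = array[index]
  }
  return result
}

const dense = denseCopy(sparse)
console.log(1 in dense)            // true
console.log(dense[1])              // undefined

// `>>> 0` coerces any number to an unsigned 32-bit integer.
console.log(2.7 >>> 0)             // 2
console.log(NaN >>> 0)             // 0
console.log(-1 >>> 0)              // 4294967295
```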
castSlice
Depends on the slice function above. It adds a check for whether the passed array actually needs slicing.
import slice from '../slice.js'
/**
* Casts `array` to a slice if it's needed.
*
* @private
* @param {Array} array The array to inspect.
* @param {number} start The start position.
* @param {number} [end=array.length] The end position.
* @returns {Array} Returns the cast slice.
*/
function castSlice(array, start, end) {
  const { length } = array
  // end defaults to the array length when it is not passed
  end = end === undefined ? length : end
  // Return the array itself when the whole range is requested; otherwise call slice
  return (!start && end >= length) ? array : slice(array, start, end)
}
createCaseFirst
Creates a function based on the passed methodName; in essence, it calls that method on the first character of the given string. It relies on the castSlice function above (to take everything after the first character), the hasUnicode function (to detect whether the string contains Unicode that needs special handling), and the stringToArray function (to split such a string into an array correctly).
import castSlice from './castSlice.js'
import hasUnicode from './hasUnicode.js'
import stringToArray from './stringToArray.js'
/**
* Creates a function like `lowerFirst`.
*
* @private
* @param {string} methodName The name of the `String` case method to use.
* @returns {Function} Returns the new case function.
*/
function createCaseFirst(methodName) {
  return (string) => {
    // An empty string is returned as an empty string
    if (!string) {
      return ''
    }
    // Split into symbols only when the string contains Unicode that needs it
    const strSymbols = hasUnicode(string)
      ? stringToArray(string)
      : undefined
    // First character: the first array item when split, otherwise string[0]
    const chr = strSymbols
      ? strSymbols[0]
      : string[0]
    // Remainder: array items from index 1 joined back into a string, otherwise string.slice(1)
    const trailing = strSymbols
      ? castSlice(strSymbols, 1).join('')
      : string.slice(1)
    // Call the named String.prototype method on the first character only, then append the rest
    return chr[methodName]() + trailing
  }
}
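The factory pattern itself can be tried out without the helper modules. The sketch below is a simplified stand-in, not lodash's code: it assumes that splitting by code point with Array.from is good enough, which ignores combining marks and ZWJ sequences. All names in it are hypothetical.

```javascript
// Simplified sketch: splits by code point instead of using stringToArray.
function simpleCaseFirst(methodName) {
  return (string) => {
    if (!string) {
      return ''
    }
    const [first, ...rest] = Array.from(string)
    // Look up the String.prototype method by name and apply it to the first symbol only.
    return first[methodName]() + rest.join('')
  }
}

const simpleUpperFirst = simpleCaseFirst('toUpperCase')
const simpleLowerFirst = simpleCaseFirst('toLowerCase')

console.log(simpleUpperFirst('fred')) // 'Fred'
console.log(simpleUpperFirst('FRED')) // 'FRED'
console.log(simpleLowerFirst('FRED')) // 'fRED'
console.log(simpleUpperFirst(''))     // ''
```

Passing the method name as a string is what lets one factory produce both the upper-case and lower-case variants.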
upperFirst
Calls createCaseFirst with the method name toUpperCase, so String.prototype.toUpperCase is invoked on the first character.
import createCaseFirst from './.internal/createCaseFirst.js'
/**
* Converts the first character of `string` to upper case.
*
* @since 4.0.0
* @category String
* @param {string} [string=''] The string to convert.
* @returns {string} Returns the converted string.
* @see camelCase, kebabCase, lowerCase, snakeCase, startCase, upperCase
* @example
*
* upperFirst('fred')
* // => 'Fred'
*
* upperFirst('FRED')
* // => 'FRED'
*/
const upperFirst = createCaseFirst('toUpperCase')
hasUnicodeWord
The matching rules are as follows:
const hasUnicodeWord = RegExp.prototype.test.bind(
/[a-z][A-Z]|[A-Z]{2}[a-z]|[0-9][a-zA-Z]|[a-zA-Z][0-9]|[^a-zA-Z0-9 ]/
)
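A handful of inputs make the five alternatives easier to read. The block repeats the definition so it runs on its own.

```javascript
const hasUnicodeWord = RegExp.prototype.test.bind(
  /[a-z][A-Z]|[A-Z]{2}[a-z]|[0-9][a-zA-Z]|[a-zA-Z][0-9]|[^a-zA-Z0-9 ]/
)

console.log(hasUnicodeWord('foo bar')) // false: only letters and spaces, no case change
console.log(hasUnicodeWord('FOO BAR')) // false
console.log(hasUnicodeWord('fooBar'))  // true: lowercase followed by uppercase
console.log(hasUnicodeWord('XMLHttp')) // true: two capitals followed by lowercase ("LHt")
console.log(hasUnicodeWord('foo1'))    // true: letter next to digit
console.log(hasUnicodeWord('foo-bar')) // true: a character that is not alphanumeric or a space
```

Despite its name, the check fires for plain ASCII input such as fooBar, because such strings need the more detailed word-splitting rules.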
words
Calls asciiWords (returns an array of the matched strings), unicodeWords (returns an array of the matched strings), and hasUnicodeWord (tests for the patterns aA, AAa, 0a, 0A, a0, A0, or any character that is neither alphanumeric nor a space).
/**
* Splits `string` into an array of its words.
*
* @since 3.0.0
* @category String
* @param {string} [string=''] The string to inspect.
* @param {RegExp|string} [pattern] The pattern to match words.
* @returns {Array} Returns the words of `string`.
* @example
*
* words('fred, barney, & pebbles')
* // => ['fred', 'barney', 'pebbles']
*
* words('fred, barney, & pebbles', /[^, ]+/g)
* // => ['fred', 'barney', '&', 'pebbles']
*/
function words(string, pattern) {
if (pattern === undefined) {
const result = hasUnicodeWord(string) ? unicodeWords(string) : asciiWords(string)
return result || []
}
return string.match(pattern) || []
}
camelCase
Depends on the toString function (converts the incoming value to a string), the words function (splits the string into an array of words), and the upperFirst function (capitalizes the first letter of a word).
import upperFirst from './upperFirst.js'
import words from './words.js'
import toString from './toString.js'
/**
* Converts `string` to [camel case](https://en.wikipedia.org/wiki/CamelCase).
*
* @since 3.0.0
* @category String
* @param {string} [string=''] The string to convert.
* @returns {string} Returns the camel cased string.
* @see lowerCase, kebabCase, snakeCase, startCase, upperCase, upperFirst
* @example
*
* camelCase('Foo Bar')
* // => 'fooBar'
*
* camelCase('--foo-bar--')
* // => 'fooBar'
*
* camelCase('__FOO_BAR__')
* // => 'fooBar'
*/
const camelCase = (string) => (
  /**
   * 1. Convert the value to a string
   * 2. Replace apostrophes with an empty string
   * 3. Split the result into words
   * 4. Process the array returned by words
   */
  words(toString(string).replace(/['\u2019]/g, '')).reduce((result, word, index) => {
    // Convert the current word to lowercase
    word = word.toLowerCase()
    // Capitalize the first letter of the current word when index is not 0
    return result + (index ? upperFirst(word) : word)
  }, '')
)
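The final reduce step can be tried in isolation once the words are already split. In the sketch below, joinCamel and capitalize are hypothetical names, and capitalize is a plain ASCII stand-in for upperFirst.

```javascript
// Stand-in for upperFirst; assumes the first character is a single code unit.
const capitalize = (word) => word.charAt(0).toUpperCase() + word.slice(1)

// Lowercase every word, then capitalize all but the first while concatenating.
const joinCamel = (words) => words.reduce((result, word, index) => {
  const lower = word.toLowerCase()
  return result + (index ? capitalize(lower) : lower)
}, '')

console.log(joinCamel(['Foo', 'Bar']))        // 'fooBar'
console.log(joinCamel(['FOO', 'BAR']))        // 'fooBar'
console.log(joinCamel(['foo', 'bar', 'baz'])) // 'fooBarBaz'
console.log(joinCamel([]))                    // ''
```

Using the index as the condition works because 0 is falsy, so only the first word skips capitalization.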