Hello, everyone. I am Karsong.
In business code, we often run into scenarios where we need to handle potentially risky DOM, such as:
- The text-paste feature of various tools
- Scenarios where you need to render HTML returned by the server
To prevent potential XSS attacks, there are two options:
- Escape
- Sanitize
This article describes the differences between the two and introduces Sanitizer, an API for sanitizing the DOM.
Safe DOM Manipulation with the Sanitizer API
Escaping and sanitizing
Suppose we want to insert an HTML string into the DOM like this:
const str = "<img src='' onerror='alert(0)'>";
Because the img element's onerror handler can execute JS code, inserting this string directly as an element's innerHTML creates an XSS risk.
One common solution is to escape strings.
What is escaping
The browser parses some reserved characters as HTML code, such as:
- < is parsed as the beginning of a tag
- > is parsed as the end of a tag
- " is parsed as the beginning and end of an attribute value
To display these reserved characters as text (so they are not parsed as HTML code), replace them with the corresponding HTML entities:
- The entity for < is &lt;
- The entity for > is &gt;
- The entity for " is &quot;
Replacing HTML characters with their entities in this way is called escaping.
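The replacement described above can be sketched as a small helper. This is a minimal illustration, not a production-grade escaper: the name escapeHtml and the HTML_ENTITIES table are hypothetical, and covering & and ' in addition to the three characters listed above is an assumption based on common practice.

```javascript
// Hypothetical lookup table: reserved character -> HTML entity.
// "&" and "'" are additions beyond the three characters listed in the text.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replace every reserved character with its entity so the browser
// renders the string as text instead of parsing it as markup.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

const risky = "<img src='' onerror='alert(0)'>";
const escaped = escapeHtml(risky);
// escaped === "&lt;img src=&#39;&#39; onerror=&#39;alert(0)&#39;&gt;"
```

Assigning the escaped string to innerHTML would display the original markup as visible text, so the onerror handler never runs.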
What is sanitizing
For the HTML string above:
const str = "<img src='' onerror='alert(0)'>";
Besides escaping the reserved characters to avoid the XSS risk, there is a more intuitive approach: remove the onerror attribute directly.
This method of directly removing harmful code (such as the onerror attribute above) from an HTML string is called sanitizing.
To do this, you can use an API called Sanitizer.
First, we construct a Sanitizer instance:
const sanitizer = new Sanitizer();
Call the instance’s sanitizeFor method, passing in the container element type and the HTML string to sanitize:
sanitizer.sanitizeFor("div", str);
You get an HTMLDivElement (the container element type we passed in) that contains an img with no onerror attribute.
By default, Sanitizer removes all code that might cause JS to execute.
Rich configuration
Out of the box, Sanitizer offers rich whitelist and blacklist configurations:
const config = {
  allowElements: [],
  blockElements: [],
  dropElements: [],
  allowAttributes: {},
  dropAttributes: {},
  allowCustomElements: true,
  allowComments: true
};
new Sanitizer(config)
For example, allowElements defines a whitelist of elements: only the elements in the whitelist are retained. blockElements is the corresponding blacklist of elements:
const str = `hello <b><i>world</i></b>`
new Sanitizer().sanitizeFor("div", str)
// <div>hello <b><i>world</i></b></div>
new Sanitizer({allowElements: [ "b" ]}).sanitizeFor("div", str)
// <div>hello <b>world</b></div>
new Sanitizer({blockElements: [ "b" ]}).sanitizeFor("div", str)
// <div>hello <i>world</i></div>
new Sanitizer({allowElements: []}).sanitizeFor("div", str)
// <div>hello world</div>
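To make the four results above easier to reason about, here is a toy model of the element rules that runs without a browser. It is only a sketch under stated assumptions, not the browser's algorithm: the tree format ({ tag, children } objects and plain strings) and the functions sanitizeTree and render are hypothetical. It assumes that an element excluded by allowElements or listed in blockElements is removed while its children are kept, which matches the outputs shown above. It also assumes that dropElements removes an element together with its children, which is how the draft specification described it but is not demonstrated in this article.

```javascript
// Toy tree: a node is either a string (text) or { tag, children }.
function sanitizeTree(nodes, config = {}) {
  const { allowElements, blockElements = [], dropElements = [] } = config;
  const out = [];
  for (const node of nodes) {
    if (typeof node === "string") {
      out.push(node); // text is always kept
      continue;
    }
    // Assumption: dropped elements disappear together with their subtree.
    if (dropElements.includes(node.tag)) continue;
    const children = sanitizeTree(node.children, config);
    const blocked =
      blockElements.includes(node.tag) ||
      (allowElements !== undefined && !allowElements.includes(node.tag));
    if (blocked) {
      out.push(...children); // remove the tag itself, keep its children
    } else {
      out.push({ tag: node.tag, children });
    }
  }
  return out;
}

// Serialize the toy tree back to an HTML-like string.
function render(nodes) {
  return nodes
    .map((n) =>
      typeof n === "string" ? n : `<${n.tag}>${render(n.children)}</${n.tag}>`
    )
    .join("");
}

// Same content as: hello <b><i>world</i></b>
const tree = ["hello ", { tag: "b", children: [{ tag: "i", children: ["world"] }] }];

const onlyB = render(sanitizeTree(tree, { allowElements: ["b"] })); // "hello <b>world</b>"
const noB = render(sanitizeTree(tree, { blockElements: ["b"] }));   // "hello <i>world</i>"
const noTags = render(sanitizeTree(tree, { allowElements: [] }));   // "hello world"
```

The model reproduces the outputs listed above inside the div container, which helps show why an empty allowElements list leaves only the text.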
allowAttributes is a whitelist of attributes, and dropAttributes is a blacklist of attributes. For example, the following configuration:
{
  allowAttributes: {"style": ["span"]},
  dropAttributes: {"id": ["*"]}
}
means that in the sanitized HTML:
- Only span elements are allowed to have the style attribute
- The id attribute is removed from all elements (* is a wildcard for all elements)
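The two rules above, including the * wildcard, can be expressed as a small standalone function. This is a hypothetical sketch (the names filterAttributes, matchesElement and ownRule are invented here) that encodes only the two rules stated in the list. Attributes that appear in neither map simply pass through in this sketch, which is a simplifying assumption and may differ from what the browser's Sanitizer does.

```javascript
// True when `tag` is covered by a rule's element list ("*" matches every element).
function matchesElement(elements, tag) {
  return elements.includes("*") || elements.includes(tag);
}

// Look up a rule by attribute name without touching Object.prototype.
function ownRule(map, name) {
  return Object.prototype.hasOwnProperty.call(map, name) ? map[name] : undefined;
}

// Return a copy of `attrs` with the attributes that the config rejects removed.
function filterAttributes(tag, attrs, config) {
  const { allowAttributes = {}, dropAttributes = {} } = config;
  const kept = {};
  for (const [name, value] of Object.entries(attrs)) {
    const dropFor = ownRule(dropAttributes, name);
    if (dropFor && matchesElement(dropFor, tag)) continue; // blacklisted here
    const allowFor = ownRule(allowAttributes, name);
    if (allowFor && !matchesElement(allowFor, tag)) continue; // whitelisted elsewhere only
    kept[name] = value; // assumption: unmentioned attributes pass through
  }
  return kept;
}

const attrConfig = {
  allowAttributes: { style: ["span"] },
  dropAttributes: { id: ["*"] },
};

const spanAttrs = filterAttributes("span", { style: "color:red", id: "a" }, attrConfig);
// spanAttrs is { style: "color:red" }
const divAttrs = filterAttributes("div", { style: "color:red", id: "b" }, attrConfig);
// divAttrs is {}
```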
Compatibility
How is the API's browser compatibility?
It is currently available only in Chrome 93 and later, with the experimental flag enabled:
about://flags/#enable-experimental-web-platform-features
Although the native Sanitizer is still far from stable, you can use the DOMPurify library for similar functionality in the meantime.
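Since support is this limited, code that wants to try the native API would typically check for it first. The following is a minimal sketch of such a check: supportsSanitizer and sanitizeToDiv are hypothetical helper names, and the scope parameter exists only so the functions can be exercised outside a browser. It tests for the sanitizeFor method used in this article, and returning null is just one way to signal that the caller should fall back to escaping or to a library such as DOMPurify.

```javascript
// Detect the experimental API as used in this article:
// a global Sanitizer constructor whose instances expose sanitizeFor.
function supportsSanitizer(scope = globalThis) {
  return (
    typeof scope.Sanitizer === "function" &&
    typeof scope.Sanitizer.prototype.sanitizeFor === "function"
  );
}

// Sanitize into a div when the native API exists; otherwise return null
// so the caller can fall back (for example to escaping or DOMPurify).
function sanitizeToDiv(html, scope = globalThis) {
  if (!supportsSanitizer(scope)) return null;
  return new scope.Sanitizer().sanitizeFor("div", html);
}
```

In a browser without the flag enabled, sanitizeToDiv(str) returns null instead of throwing a ReferenceError.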
Afterword
In your day-to-day work, do you prefer escaping or sanitizing?