Cheerio: injecting content into HTML with Node
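const fs = require('fs');
const path = require('path');
const cheerio = require('cheerio');
const express = require('express');
const app = express();
// STATIC_PATH is assumed to be defined elsewhere (the static assets directory)

// Load and modify index.html once, then reuse the parsed document
const getHTML = (function () {
  const resData = fs.readFileSync(path.join(process.cwd(), STATIC_PATH, 'index.html'));
  const $ = cheerio.load(resData);
  $('body').append('<script>123</script>');
  return function () { return $; };
})();

app.use(function (req, res, next) {
  let matcher = req.url.match(/\/[0-9a-zA-Z%]+\.html/g);
  if (matcher && matcher.length && (matcher = matcher[0])) {
    const $ = getHTML();
    // Only inject content into index.html ('/index.html' has 'index' at position 1)
    // Return here so next() is not called after the response has ended
    if (matcher.indexOf('index') === 1) return res.end($.html());
  }
  next();
});
app.use(express.static(path.join(__dirname, STATIC_PATH)));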
Why does everything that should be in the head tag end up in the body in the output?
SCRIPT and LINK tags defined in HEAD incorrectly to BODY #1072
// Convert buffer to string, trim, and pass to Cheerio
cheerio.load(resData.toString().trim());
// Delete the BOM in front of the buffer
cheerio.load(removeBufferBom(resData));
The first method is adopted, because cheerio.load converts a Buffer to a string before processing it anyway, while the second method needs to create a new Buffer object, which wastes memory.
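The article names removeBufferBom without showing its body, so the following is only a minimal sketch of what such a helper might look like, assuming the file is UTF-8 encoded. Both removeBufferBom as written here and stripBomFromString are hypothetical illustrations, not the author's code.

```javascript
// Hypothetical helpers: the article does not show removeBufferBom's real body.
const UTF8_BOM = Buffer.from([0xef, 0xbb, 0xbf]);

function removeBufferBom(buf) {
  // Compare the first three bytes against the UTF-8 BOM signature.
  const hasBom = buf.length >= 3 && buf.subarray(0, 3).equals(UTF8_BOM);
  // subarray returns a new Buffer object that views the same underlying memory.
  return hasBom ? buf.subarray(3) : buf;
}

// String-based alternative: drop only a leading U+FEFF, keep other whitespace.
function stripBomFromString(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

const sample = Buffer.concat([UTF8_BOM, Buffer.from('<html></html>')]);
console.log(removeBufferBom(sample).toString());
console.log(stripBomFromString(sample.toString()));
```

Unlike trim(), stripBomFromString leaves ordinary leading and trailing whitespace alone, which may matter if the exact bytes of the document need to be preserved.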
Upon closer inspection this only occurs when you have a custom element
preceding the tags in the head tag
In the example above the link tag will show up but the script tag gets
moved into the body (but not in the body HTML.) This differs from jQuery,
which will always place it in the head tree.
Additional note: it does actually copy the non-standard tags into the
start of the body html (for instance the header tag above,) but doesn't
copy the other elements into the body.
According to the issue, when the head tag contains a custom tag, the content following that custom tag is moved into the body tag. However, our page has no custom tag in its head, and the first node of the transformed HTML is always a text node: "&#65279;"
<body>"&#65279;"<title>Document</title>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
</body>
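A Baidu search for "why is the information in the head of this page interpreted by the browser as being in the body?" turned up the cause: the nesting error comes from an invisible U+FEFF character (the byte order mark, rendered as "&#65279;") at the very beginning of the HTML document. Because it sits before any tag, the parser treats it as text content, opens the body early, and places everything that follows inside it.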
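To check whether a file is affected, it can help to look at the first character after decoding. The sketch below uses an in-memory buffer that mimics an index.html saved as "UTF-8 with BOM" (the sample markup is made up for illustration) and shows that Buffer's toString keeps the BOM as U+FEFF, while trim() removes it because U+FEFF counts as whitespace in JavaScript.

```javascript
// Build a buffer that mimics an index.html saved as "UTF-8 with BOM".
const withBom = Buffer.concat([
  Buffer.from([0xef, 0xbb, 0xbf]),
  Buffer.from('<!DOCTYPE html><html><head><title>Document</title></head></html>'),
]);

const text = withBom.toString('utf8');

// Buffer#toString keeps the BOM as the first character, U+FEFF.
const startsWithBom = text.charCodeAt(0) === 0xfeff;

// String#trim treats U+FEFF as whitespace, so the trimmed string no longer starts with it.
const trimmed = text.trim();
const stillHasBom = trimmed.charCodeAt(0) === 0xfeff;

console.log(startsWithBom, stillHasBom);
```

In a real project the same check could be run on the result of fs.readFileSync for the page in question before it is handed to cheerio.load.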
Front-end codeless (no-trace) event tracking
Mp.weixin.qq.com/s/nJZk-0WtE…
JS fingerprint generation
FingerprintJS