Advanced JS regular expressions for filtering HTML tags

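One, extraction with a regular expression (the conventional approach)

Take an HTML string html_str whose text content, once the tags are removed, is "1234567891011121314151617181920".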

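```
// Option 1: build the pattern with the RegExp constructor
var re = new RegExp('<[^<>]+>', 'g');
var text = html_str.replace(re, "");
// Option 2: the equivalent regex literal
var text = html_str.replace(/<[^<>]+>/g, "");
console.log(text); // output 1234567891011121314151617181920
```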

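Notes:
1. Do not forget the global flag g; without it, only the first match is replaced.
2. The character class [^<>] means a matched <...> cannot itself contain < or >, so each match is exactly one tag. This is what filters out the HTML tags.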
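To see why the g flag matters, here is a minimal sketch. The sample string is an assumption made up for illustration (three div-wrapped numbers); it is not the html_str used above.

```javascript
// Hypothetical sample input, assumed for illustration only.
var sample = '<div>1</div><div>2</div><div>3</div>';

// Without the g flag, replace() stops after the first match.
var firstOnly = sample.replace(/<[^<>]+>/, '');

// With the g flag, every tag is removed.
var allRemoved = sample.replace(/<[^<>]+>/g, '');

console.log(firstOnly);  // 1</div><div>2</div><div>3</div>
console.log(allRemoved); // 123
```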

Keeping some tags

To keep img tags and remove everything else: description.replace(/<(?!img).*?>/g, "");

To keep both img and p tags: description.replace(/<(?!img|p|\/p).*?>/g, "");

Note that / must be escaped as \/ inside a JS regex literal.
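One caveat with a lookahead such as (?!img|p|\/p): it only checks how the tag starts, so tags like <pre> and </pre> are kept as well. The sketch below shows a stricter variant that adds a word boundary \b after the tag name. The helper name stripTagsExcept and the sample input are hypothetical, and the helper assumes the names passed in are plain tag names with no regex metacharacters.

```javascript
// Hypothetical helper: remove every tag except those named in `keep`.
function stripTagsExcept(html, keep) {
  // Builds e.g. /<(?!\/?(?:img|p)\b)[^<>]+>/gi
  // \/? lets the same names cover closing tags; \b stops "p" from matching "pre".
  var pattern = new RegExp('<(?!\\/?(?:' + keep.join('|') + ')\\b)[^<>]+>', 'gi');
  return html.replace(pattern, '');
}

// Sample input, assumed for illustration only.
var description = '<div><p>Hi <b>there</b></p><pre>x</pre><img src="a.png"></div>';

var loose = description.replace(/<(?!img|p|\/p).*?>/g, '');
var strict = stripTagsExcept(description, ['img', 'p']);

console.log(loose);  // <p>Hi there</p><pre>x</pre><img src="a.png">
console.log(strict); // <p>Hi there</p>x<img src="a.png">
```

Building the pattern from a list keeps the set of retained tags in one place, at the cost of having to double the backslashes inside the string passed to RegExp.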

Removing HTML tags and end-of-line whitespace in JS

function removeHTMLTag(str) {
    str = str.replace(/<\/?[^>]*>/g, ''); // remove HTML tags
    str = str.replace(/[ \t]+\n/g, '\n'); // remove whitespace at the end of each line
    // str = str.replace(/\n\s*\r/g, '\n'); // remove extra blank lines
    str = str.replace(/&nbsp;/ig, ''); // remove &nbsp;
    return str;
}

// Convert escaped entities back to normal characters
function escape2Html(str) {
    var arrEntities = { 'lt': '<', 'gt': '>', 'nbsp': ' ', 'amp': '&', 'quot': '"' };
    return str.replace(/&(lt|gt|nbsp|amp|quot);/ig, function (all, t) {
        // Lower-case the name: the i flag also matches &LT;, &Amp; and so on
        return arrEntities[t.toLowerCase()];
    });
}
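When the two helpers are combined, the order of the calls matters: decoding entities first turns escaped text such as &lt;b&gt; into a real-looking tag, which the tag filter then removes. The sketch below illustrates this with compact copies of the helpers so that it runs on its own; the input string is a hypothetical example.

```javascript
// Compact copies of the two helpers so this sketch is self-contained.
function removeHTMLTag(str) {
  return str.replace(/<\/?[^>]*>/g, '')   // remove HTML tags
            .replace(/[ \t]+\n/g, '\n')   // remove end-of-line whitespace
            .replace(/&nbsp;/ig, '');     // remove &nbsp;
}
function escape2Html(str) {
  var arrEntities = { lt: '<', gt: '>', nbsp: ' ', amp: '&', quot: '"' };
  return str.replace(/&(lt|gt|nbsp|amp|quot);/ig, function (all, t) {
    return arrEntities[t.toLowerCase()];
  });
}

// Hypothetical input: real markup plus an escaped tag that should survive as text.
var raw = '<p>Use &lt;b&gt; for bold &amp; more   \n</p>';

var stripThenDecode = escape2Html(removeHTMLTag(raw));
var decodeThenStrip = removeHTMLTag(escape2Html(raw));

console.log(JSON.stringify(stripThenDecode)); // "Use <b> for bold & more\n"
console.log(JSON.stringify(decodeThenStrip)); // "Use  for bold & more\n"
```

Stripping first and decoding second preserves the escaped <b> as visible text, which is usually what is wanted when the result is shown as plain text.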

Two, extraction with the DOM (an approach purely for fun)

var oDiv = document.createElement("div");
oDiv.innerHTML = html_str;
var text = oDiv.innerText;
console.log(text);
// output 1234567891011121314151617181920
// This is more flexible than a regex: you can directly manipulate or extract anything from the tags
// Get the number of child nodes
console.log(oDiv.childNodes.length); // output 20
// Get the innerHTML of the 11th child node
console.log(oDiv.childNodes[10].innerHTML); // output 11