The phenomenon

Carriers, hackers, browser vendors, and phone manufacturers sometimes tamper with the web pages people normally visit, inserting ads and other unwanted content. In some remote areas, small off-brand carriers are particularly common.

How network hijacking works

  • DNS hijacking

Generally speaking, the DNS server a user relies on to access the Internet is assigned by the carrier, so the carrier can do whatever it wants at this node. For example, when accessing jiankang.qq.com/index.html,…

  • HTTP hijacking

Protocol detection is configured on the carrier's router nodes. If an HTTP request for an HTML page is detected, it is intercepted and processed. There are two common types:

    • Similar to DNS hijacking: return a 302 that redirects the user's browser to another address. (This is what phishing sites do.)
    • Insert JS or DOM nodes (ads) into the HTML returned by the server. (More common.)

What to do about being hijacked?

  • For users, the most direct remedy is to complain to the carrier.
  • Add <meta http-equiv="Cache-Control" content="no-siteapp"> and <meta http-equiv="Cache-Control" content="no-transform" /> to the HTML. This is Baidu's official declaration for prohibiting transcoding.
  • The most effective approach is to use HTTPS, so that the data no longer travels in plain text. HTTPS encrypts the data with SSL/TLS.
  • Add filtering code when developing the page. The general idea is to use JavaScript to check whether all external links belong to a whitelist.
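
The whitelist idea in the last item can be sketched as a small pure function. This is only a minimal sketch, not the author's implementation: the names ALLOWED_SUFFIXES and isWhitelisted are hypothetical, and the domain list is borrowed from the regular expressions used later in this article.

```javascript
// Hypothetical whitelist, taken from the domains used in this article's examples.
const ALLOWED_SUFFIXES = ['qq.com', 'gtimg.cn', 'qlogo.cn', 'qpic.cn'];

// Returns true when the URL's host is a whitelisted domain or one of its subdomains.
function isWhitelisted(rawUrl, allowed = ALLOWED_SUFFIXES) {
    let host;
    try {
        host = new URL(rawUrl).hostname.toLowerCase();
    } catch (err) {
        return false; // unparsable URLs are treated as untrusted
    }
    return allowed.some(suffix => host === suffix || host.endsWith('.' + suffix));
}
```

Comparing the parsed hostname, rather than running a regular expression over the whole URL, avoids being fooled by a whitelisted domain that merely appears in the path or query string.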

There are all kinds of hijacking methods:

1. Directly return an HTML page containing an advertisement.
2. Insert JS into the original HTML, and then insert ads through that script.
3. Display the original, normal web page inside an iframe.

Fighting back with JS in practice

  • Listen on window for the DOMNodeInserted event, report the inserted DOM node, and analyze it. (Typically: match all URLs in the node and compare them against the whitelisted domains. If a URL is not on the whitelist, treat it as hijacking, report it, and remove the node that was just inserted with dom.parentNode.removeChild(dom).) Be careful of friendly fire. A more stable approach is to collect monitoring statistics first, and only then decide how to prevent it.
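    // `ul` is assumed to reference a list element that already exists on the page.
    // Log every node inserted under it, together with the current child count.
    ul.addEventListener('DOMNodeInserted', function (e) {
        console.log(e.target)
        console.log(ul.childElementCount)
    })
    // Log every node removed from it as well.
    ul.addEventListener('DOMNodeRemoved', function (e) {
        console.log(e.target)
        console.log(ul.childElementCount)
    })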
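
Mutation events such as DOMNodeInserted are deprecated, and recent browsers have been removing them, so the same kind of monitoring is usually written with MutationObserver today. The following is a minimal sketch under that assumption. The names extractHttpUrls, watchInsertedNodes, and onNode are hypothetical and do not come from the original article.

```javascript
// Pull every http(s) URL out of a chunk of markup (pure helper, easy to test).
function extractHttpUrls(html) {
    return String(html || '').match(/https?:\/\/[^\s"'<>)]+/g) || [];
}

// Observe `root` and call `onNode(node, urls)` for each element inserted under it.
// Assumes a browser environment where MutationObserver is available.
function watchInsertedNodes(root, onNode) {
    const observer = new MutationObserver(mutations => {
        for (const mutation of mutations) {
            for (const node of mutation.addedNodes) {
                if (node.nodeType === 1) { // element nodes only
                    onNode(node, extractHttpUrls(node.outerHTML));
                }
            }
        }
    });
    observer.observe(root, { childList: true, subtree: true });
    return observer; // call observer.disconnect() to stop watching
}

// Usage in a page (not run here):
// watchInsertedNodes(document.documentElement, (node, urls) => console.log(node.tagName, urls));
```

Keeping the URL extraction separate from the observer makes it possible to collect statistics first and add removal logic later, in line with the "monitor, then decide" advice above.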

As follows:

function checkDivHijack(e) {
    // Inspect the inserted node, or the whole document when called directly.
    var dom = e ? e.target : document.documentElement;
    if (!dom.outerHTML) {
        return; // text nodes and the like have no outerHTML
    }
    var imgList = (dom.nodeName.toUpperCase() == 'IMG') ? [dom] : dom.getElementsByTagName('img');
    if (!imgList || imgList.length == 0) {
        return;
    }
    // Whitelisted image hosts, plus inline base64 images.
    var httpReg = /^http:\/\/(.*\.qq\.com|.*\.gtimg\.cn|.*\.qlogo\.cn|.*\.qpic\.cn)\//;
    var base64Reg = /^data:image/;
    var src;
    var hijack = false;
    for (var i = 0; i < imgList.length; i++) {
        src = imgList[i].src;
        if (!httpReg.test(src) && !base64Reg.test(src)) {
            hijack = true;
            break;
        }
    }
    // When hijack is true, report the node and/or remove it here.
}

However, there is a loophole. If the carrier displays the ad image as a background, set through a div and its style, the code above will not detect it.
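
One way to narrow that loophole is to inspect background-image values as well. The sketch below is an assumption about how this could be done, not something the original article provides. The names extractCssUrls and findBackgroundUrls are hypothetical.

```javascript
// Extract the URLs from a CSS value such as: url("http://x/a.png"), url(b.png)
function extractCssUrls(cssValue) {
    const text = String(cssValue || '');
    const re = /url\(\s*(['"]?)(.*?)\1\s*\)/g;
    const urls = [];
    let match;
    while ((match = re.exec(text)) !== null) {
        urls.push(match[2]);
    }
    return urls;
}

// Collect background-image URLs for an element and all of its descendants.
// Assumes a browser environment (uses getComputedStyle).
function findBackgroundUrls(root) {
    const elements = [root, ...root.querySelectorAll('*')];
    const found = [];
    for (const el of elements) {
        const value = getComputedStyle(el).backgroundImage;
        for (const url of extractCssUrls(value)) {
            found.push({ element: el, url: url });
        }
    }
    return found;
}
```

Each URL returned by findBackgroundUrls can then be checked against the same whitelist as the img sources. Calling getComputedStyle on every element can be slow on large pages, so it is probably best limited to newly inserted nodes.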

  • For iframe insertion, compare self and top to see whether they are the same.
function checkIframeHijack() {
    var flag = 'iframe_hijack_redirected';
    // getURLParam and sendHijackReport are helpers assumed to be defined elsewhere on the page.
    if (getURLParam(flag)) {
        // The page has already been redirected once: report the hijack.
        sendHijackReport('jiankang.hijack.iframe_ad', 'iframe hijack: ' + location.href);
    } else {
        if (self != top) {
            // The page is running inside a frame: break out by reloading it in the top window.
            var url = location.href;
            var parts = url.split('#');
            if (location.search) {
                parts[0] += '&' + flag + '=1';
            } else {
                parts[0] += '?' + flag + '=1';
            }
            try {
                top.location = parts.join('#');
            } catch (e) { }
        }
    }
}

Another example, which checks every URL found in the inserted node against the whitelist:

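window.addEventListener('DOMNodeInserted', checkDivHijack);
function checkDivHijack(e) {
    // Markup of the inserted node, or of the whole page when called directly ($ is jQuery).
    var html = e ? (e.target.outerHTML || e.target.wholeText) : $('html').html();
    if (!html) {
        return;
    }
    // Collect every http://host/ prefix that appears in the markup.
    var reg = /http:\/\/([^\/]+)\//g;
    var urlList = html.match(reg);
    if (!urlList || urlList.length == 0) {
        return;
    }
    // Whitelisted domains.
    reg = /^http:\/\/(.*\.qq\.com|.*\.gtimg\.cn|.*\.qlogo\.cn|.*\.qpic\.cn|.*\.wanggou\.com)\/$/;
    var hijack = false;
    for (var i = 0; i < urlList.length; i++) {
        if (!reg.test(urlList[i])) {
            hijack = true;
            break;
        }
    }
    // When hijack is true, report the node and/or remove it here.
}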

In the end, the ultimate solution is HTTPS. In a client app, go through the client API instead, because it connects directly by IP address.

How to traverse a DOM tree

function searchDom(node) {
    if (!node) {
        return;
    }
    if (node.nodeType === 1) {
        console.log(node.tagName);
        if (node.tagName === 'IMG') {
            // ...
        }
    }
    var i = 0, childNodes = node.childNodes, item;
    for (; i < childNodes.length; i++) {
        item = childNodes[i];
        if (item.nodeType === 1) {
            // Recursively traverse the child element
            searchDom(item);
        }
    }
}

let rootElement = document.documentElement;
(function preTraverse(ele) {
    // Process the element here, e.g. read ele.tagName
    // inViewPort: check whether the element is in the viewport
    // Traverse the DOM tree
    if (ele.children && ele.children.length > 0) {
        Array.from(ele.children).forEach(child => preTraverse(child));
    }
})(rootElement)

Inside an app, you can encrypt the JS.

Carriers generally cache the page and inject code into it to push ads. This can be handled with CSP (Content Security Policy).
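
As an illustration, a policy can be assembled and sent in the Content-Security-Policy response header. This is a sketch under stated assumptions: buildCsp is a hypothetical helper, and the directive values below are examples built from the domains mentioned in this article, not a recommended policy.

```javascript
// Build a Content-Security-Policy header value from a map of directives.
function buildCsp(directives) {
    return Object.entries(directives)
        .map(([name, sources]) => [name, ...sources].join(' '))
        .join('; ');
}

// Example policy (an assumption: adjust it to the resources your page really loads).
const policy = buildCsp({
    'default-src': ["'self'"],
    'script-src': ["'self'", 'https://*.gtimg.cn'],
    'img-src': ["'self'", 'data:', 'https://*.qpic.cn'],
    'frame-ancestors': ["'none'"],
});

// In a Node.js HTTP handler this could be sent as:
// res.setHeader('Content-Security-Policy', policy);
```

Scripts and images from other origins are then blocked by the browser, and frame-ancestors 'none' stops the page from being rendered inside someone else's iframe. Over plain HTTP an intermediary can still strip or rewrite the header, so this complements HTTPS rather than replacing it.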

