EtherDream · 2014/10/17

0x00 Preface


In my previous article on traffic hijacking, I talked about “HTTPS downgrading”: replacing every HTTPS hyperlink in a page with its HTTP version, so that the user always communicates in plaintext.

This might remind you of a classic man-in-the-middle attack tool called SSLStrip, which does exactly this.

Today, however, we look at a completely different idea, a more efficient and more advanced approach: HTTPS front-end hijacking.

0x01 Backend Defect


In the past, traffic hijacking was done mostly through the back end, and SSLStrip is a classic example.

Like other man-in-the-middle tools, a pure back-end implementation can only manipulate raw traffic data. That severely limits how far it can be taken and leaves many hard problems to solve:

  • What about dynamic elements?

  • How to handle packet fragmentation?

  • Can performance cost be reduced?

  • ...

Dynamic elements

In the early days of the Web, tools like SSLStrip worked well. Pages back then were mostly static, with a simple structure and a clear hierarchy, so replacing links at the traffic level was entirely adequate.

However, today’s pages are increasingly complex, and scripts play an ever larger role. Working from the traffic alone is clearly no longer enough.

#!js
var protocol = 'https';
document.write('<a href="' + protocol + '://www.alipay.com/">Login</a>');

Even with very simple dynamic elements, the back end is helpless.

Fragment handling

The principle of chunked transfer is well understood: large data cannot be sent all at once, so the client receives the pieces one after another and then merges them into a complete page.

This causes a lot of trouble for link replacement, because each piece received is an incomplete fragment, and a link may be split across two of them. Matters are made worse by the fact that many pages are not encoded in standard UTF-8.

To make replacement work, the middleman usually buffers the data and waits until the whole page has been received before replacing anything.

If data is flowing water, the proxy acts like a dam: it holds the water back until the reservoir is full and only then releases it. As a result, the people downstream must endure a long drought before they get any water.

Performance overhead

Because HTML has to stay compatible with many legacy conventions, replacement is not an easy task.

All kinds of complex regular expressions consume a lot of CPU. Although the user ends up clicking only one or two links, the middleman cannot know which ones, so he still has to analyze the entire page. It is a sad state of affairs.


0x02 Front-end advantage


Wouldn’t it be better if our middleman could get into the front end of the page?

Fragment handling

First, send a spy into the page. This is very easy to do.

Unlike hyperlinks, which are scattered all over the page, a script can be inserted in the header and run from there. So we do not need the whole page at all: we only transform the first chunk, and the subsequent data is handed back to the system to forward as usual.

As a result, the time the proxy spends on each page is almost constant!
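The first-chunk idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the demo's actual code: `createInjector` and the script path `/__hijack.js` are hypothetical names, and chunks are assumed to arrive as already-decoded strings.

```javascript
// Hypothetical helper: builds a per-response function that rewrites only the
// first chunk of an HTML page and passes all later chunks through untouched.
function createInjector(scriptUrl) {
  var tag = '<script src="' + scriptUrl + '"></script>';
  var done = false;

  return function (chunk) {
    if (done) {
      return chunk;                 // later chunks: forwarded unchanged
    }
    done = true;

    // Put the script right after <head> when it is visible in this chunk,
    // otherwise fall back to prepending it.
    var m = /<head(\s[^>]*)?>/i.exec(chunk);
    if (!m) {
      return tag + chunk;
    }
    var end = m.index + m[0].length;
    return chunk.slice(0, end) + tag + chunk.slice(end);
  };
}

var inject = createInjector('/__hijack.js');   // assumed script path
var first = inject('<html><head><title>Home</title>');
var second = inject('<a href="https://example.com/">Login</a>');
```

Only `first` is modified; `second` comes back exactly as it went in, which is why the per-page cost stays roughly constant no matter how long the page is.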

Dynamic elements

Good. We can infiltrate the page easily. But how do we attack from there?

Now that we are in the front end, there are plenty of options. The simplest is to iterate over the hyperlink elements and replace every HTTPS address with its HTTP version.

It is a workable idea, but it is still stuck in the SSLStrip mindset. It is “replacement” again, merely moved from the back end to the front end.

While this works in most situations, it is still not perfect. We do not know when dynamic elements will be added, so we would need a timer that keeps rescanning the page. That is obviously a poor design.

Performance optimization

In fact, no matter who created a hyperlink or when it was added, it does nothing until it is clicked. So all we care about is the moment of the click: if our program can take control the instant a click occurs, everything that follows is up to us.

It may sound far-fetched, but on the front end it is a piece of cake. A click is just an event, and since it is an event, we can intercept it easily with the most basic event-capture mechanism:

#!js
document.addEventListener('click', function(e) {
    // ...
}, true);

DOM Level 3 Events is a very useful event model. It was previously used for “inline XSS interception”, and it can be used to hijack links as well.

We capture click events globally, and if one lands on an HTTPS hyperlink, do we... intercept it?

If we block the click, the new page never appears. Of course, you could say that we can open it ourselves with window.open, since pop-ups are allowed inside a user’s click event anyway.

Do not forget, though, that not all hyperlinks open pop-ups; many navigate directly. You could say that we can handle those by changing location.

But telling a pop-up from a navigation is not easy. Besides the target attribute of the hyperlink, a <base> element in the page also has an effect. Of course, I am sure you can handle all of this.

However, reality is not always that simple. Some hyperlinks have an onclick handler bound to them that may even return false or call preventDefault, suppressing the default behavior. If we ignore this and simulate a navigation or pop-up anyway, we go against what the page intended.

In fact, there is a very simple solution: when our capture handler runs, the new page is still far from appearing, so there is still a chance to modify the href of the hyperlink. The browser reads the href property as the final result only once the event has finished bubbling and the default action is carried out.

Therefore, we only need to capture the click event and change the hyperlink’s address. Whether the click leads to a navigation, a pop-up, or gets blocked is none of our concern.

It is that simple. And because we change the address only after the user clicks, the browser status bar still shows the original HTTPS URL on hover!

Of course, once a link has been clicked, hovering over it again would show the modified address in the status bar.

So, to keep up the deception, we change the href back on the next turn of the event loop. Because the restore is delayed, the new page is not affected.

#!js
var url = link.href;    // save the original address
link.href = url.replace('https://', 'http://');

setTimeout(function() {
    link.href = url;    // restore it on the next turn
}, 0);

This way, the hyperlinks on the page are always normal — only temporarily disguised when the user clicks on them.
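Putting the capture handler and the temporary swap together might look like the sketch below. The names `findLink` and `onClickCapture` are made up for illustration, and the `defer` parameter is an assumption added so the restore step can be replaced; by default it is plain `setTimeout`.

```javascript
// Walk up from the clicked node to the nearest enclosing <a> element, if any.
function findLink(node) {
  while (node && node.tagName !== 'A') {
    node = node.parentNode;
  }
  return node || null;
}

// Capture-phase click handler. `defer` is a hypothetical parameter that
// defaults to setTimeout.
function onClickCapture(e, defer) {
  var link = findLink(e.target);
  if (!link || !/^https:\/\//i.test(link.href)) {
    return;                                              // not an HTTPS link
  }
  var url = link.href;                                   // original address
  link.href = url.replace(/^https:\/\//i, 'http://');    // temporary downgrade
  (defer || setTimeout)(function () {
    link.href = url;                                     // put it back
  }, 0);
}

// Browser wiring; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  document.addEventListener('click', function (e) {
    onClickCapture(e);
  }, true);
}
```

Walking up from `e.target` matters because the click often lands on an element nested inside the link, such as an image or a span, rather than on the link itself.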


0x03 More intercepts


There are other ways to reach a page besides hyperlinks, and we should monitor as many of them as possible. For example:

  • Form submission
  • window.open pop-ups
  • Frame pages
  • ...

Form submission

Form submission is very similar to hyperlinks: both have events, only with click replaced by submit and href replaced by action.
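By analogy with the hyperlink case, a form hook could be sketched like this. It is an assumption-laden illustration, not code from the demo: `onSubmitCapture` and its `defer` parameter are hypothetical, and forms submitted programmatically with `form.submit()` do not fire a submit event, so they would not be covered.

```javascript
// Capture-phase submit handler: temporarily point the form at the HTTP URL.
// `defer` is a hypothetical parameter that defaults to setTimeout.
function onSubmitCapture(e, defer) {
  var form = e.target;
  var url = form.action;
  // form.action can be shadowed by an input named "action", hence the type check.
  if (typeof url !== 'string' || !/^https:\/\//i.test(url)) {
    return;
  }
  form.action = url.replace(/^https:\/\//i, 'http://');
  (defer || setTimeout)(function () {
    form.action = url;              // restore the original address
  }, 0);
}

// Browser wiring; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  document.addEventListener('submit', function (e) {
    onSubmitCapture(e);
  }, true);
}
```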

Script pop-ups

Function calls are the simplest case; a small hook is all it takes:

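#!js
var raw_open = window.open;

window.open = function(url) {
    // Skip a missing URL and match the scheme case-insensitively
    if (url) {
        arguments[0] = String(url).replace(/^https:\/\//i, 'http://');
    }
    // Return the new window so callers still get their handle
    return raw_open.apply(this, arguments);
};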

Frame pages

We have downgraded the main page to HTTP, but the frame addresses are still the originals. Because the protocols differ, this causes cross-origin problems that break the page.

So we also need to convert the frames in the page to their HTTP versions, so that they stay consistent with the main page.

But frames differ from the previous cases: they load automatically, and there is no “about to load” event. If we wait until a frame has finished loading, a cross-origin error may already have been raised, and a load’s worth of traffic has been wasted for nothing.

Therefore, we must replace a frame’s address the moment the frame appears.

This used to be a tricky problem, but the HTML5 era has brought new hope: mutation events. They can be used to monitor page elements in real time, and some experiments with them have been done before.
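As a rough sketch of the monitoring idea, the code below uses `MutationObserver`, the successor to the older mutation events API; that substitution is mine, not the article's. The helper names `downgradeFrame` and `onMutations` are hypothetical.

```javascript
// Rewrite the src of one frame element if it points at HTTPS.
function downgradeFrame(el) {
  if ((el.tagName === 'IFRAME' || el.tagName === 'FRAME') &&
      /^https:\/\//i.test(el.src)) {
    el.src = el.src.replace(/^https:\/\//i, 'http://');
    return true;
  }
  return false;
}

// Handle a batch of mutation records as delivered to a MutationObserver.
// Returns how many frames were rewritten.
function onMutations(records) {
  var changed = 0;
  for (var i = 0; i < records.length; i++) {
    var nodes = records[i].addedNodes;
    for (var j = 0; j < nodes.length; j++) {
      if (downgradeFrame(nodes[j])) {
        changed++;
      }
    }
  }
  return changed;
}

// Browser wiring; skipped where MutationObserver or the DOM is unavailable.
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  new MutationObserver(onMutations).observe(document, {
    childList: true,
    subtree: true
  });
}
```

This only inspects the nodes listed in `addedNodes`, so a frame nested inside a larger subtree that is inserted in one operation would be missed, and the callback runs after insertion, by which time the original load may already have started.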

Of course, even mutation events occasionally fire late or miss elements. To keep HTTPS frame pages out completely, we turn to another new HTML5-era technology, Content Security Policy, which is enforced thoroughly because browsers support it natively.

Adding the following HTTP header to the responses returned by our proxy blocks HTTPS frame pages completely:

Content-Security-Policy: default-src * data: 'unsafe-inline' 'unsafe-eval'; frame-src http://*

With the frame problem solved, we can successfully hijack the account iframe on the Alipay login page!


0x04 Backend Coordination


With an XSS script on the front end, we can easily solve the thorny problems of the past. But the challenges do not end there.

How to tell the proxy

On the front end, we have now bypassed the various ways of entering HTTPS, so requests reach the proxy in plaintext. But how does the proxy decide whether a request should be forwarded over HTTPS or HTTP?

Traditional back-end hijacking can forward correctly because each URL is recorded at the moment its hyperlink is replaced. When a request matches a recorded URL, it is forwarded over HTTPS.

Our hijacking happens on the front end, and only at the moment of the click. By then it is too late to send the middleman a message saying that this URL is HTTPS.

Telling the middleman is a must. But instead of sending a separate message, we can do it in a cleverer way: simply put a small token in the transformed URL.

When the proxy finds this token in a requested URL, it knows what to do and forwards the request over HTTPS!

Since the page was downgraded from HTTPS to HTTP, the Referer of related requests also becomes the HTTP version. The middleman should therefore try to restore the Referer to avoid being detected by the server.
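The token round trip could be sketched as two small functions, one for each side. Everything named here is an assumption for illustration: the token value `zh_cn`, and the helpers `markUrl` (front end) and `unmarkUrl` (proxy). The token is always appended at the end of the query string, so the proxy only has to check the tail of the URL.

```javascript
var TOKEN = 'zh_cn';   // hypothetical marker shared by front end and proxy

// Front end: downgrade an HTTPS URL and tag it so the proxy can recognize it.
function markUrl(url) {
  var m = /^https:\/\/([^#]*)(#.*)?$/i.exec(url);
  if (!m) {
    return url;                                  // not HTTPS: leave it alone
  }
  var rest = m[1];
  var sep = rest.indexOf('?') === -1 ? '?' : '&';
  return 'http://' + rest + sep + TOKEN + (m[2] || '');
}

// Proxy: if the URL ends with the token, strip it and report that the
// upstream request should go out over HTTPS.
function unmarkUrl(url) {
  var suffixes = ['?' + TOKEN, '&' + TOKEN];
  for (var i = 0; i < suffixes.length; i++) {
    var s = suffixes[i];
    if (url.slice(-s.length) === s) {
      return {
        https: true,
        url: url.slice(0, -s.length).replace(/^http:\/\//i, 'https://')
      };
    }
  }
  return { https: false, url: url };
}

var marked = markUrl('https://www.example.com/login?from=home');
var restored = unmarkUrl(marked);
```

The same `unmarkUrl` call could be applied to an incoming Referer header to turn a tagged HTTP address back into the HTTPS one before forwarding. A genuine URL that happens to end with the token would be misclassified, which is one reason a real tool would need a less collision-prone marker.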

Hiding the disguise

However, adding a token to the URL also has a major drawback.

Because the page URL is displayed in the address bar, the user will see our token. Of course, we can use innocuous-looking strings such as ?zh_cn, ?utf_8, or ?from_baidu to fool users more effectively.

If that is still not good enough, there is a way to make the eyesore disappear almost immediately:

if url has symbol
    history.replaceState(... , clear_symbol(url) )

HTML5 gives us the ability to modify the address bar without reloading the page. Powerful features like this can now be put to use on the front end.
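A runnable version of that pseudocode might look like the following sketch. The token value `zh_cn` and the helper name `clearToken` are assumptions; `history.replaceState` is the standard History API call, and the null state and empty title passed to it are illustrative choices.

```javascript
var TOKEN = 'zh_cn';   // hypothetical marker; must match what the proxy expects

// Remove a trailing "?TOKEN" or "&TOKEN" from the query part of a URL,
// keeping any #fragment in place.
function clearToken(url) {
  var hashAt = url.indexOf('#');
  var hash = hashAt === -1 ? '' : url.slice(hashAt);
  var base = hashAt === -1 ? url : url.slice(0, hashAt);
  var suffixes = ['?' + TOKEN, '&' + TOKEN];
  for (var i = 0; i < suffixes.length; i++) {
    var s = suffixes[i];
    if (base.slice(-s.length) === s) {
      return base.slice(0, -s.length) + hash;
    }
  }
  return url;                                    // no token: unchanged
}

// Browser wiring; skipped when the History API is unavailable.
if (typeof history !== 'undefined' && typeof location !== 'undefined') {
  var cleaned = clearToken(location.href);
  if (cleaned !== location.href) {
    history.replaceState(null, '', cleaned);     // no reload, no new history entry
  }
}
```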

Redirect hijacking

Of course, front-end hijacking alone is far from enough. In reality, there is another very common way of entering HTTPS: being redirected to a secure page.

Think about how we usually get to a site we want to visit. Take Alipay as an example: unless you have it bookmarked, you have to type www.alipay.com or www.zhifubao.com. When you press Enter, how would the browser know that it is an HTTPS site?

Obviously, the first request is still plain HTTP. The HTTP version of Alipay does exist, but its only function is to redirect users to the HTTPS version.

When our middleman sees a redirect to an HTTPS site, he certainly does not want the user to take that road out of his control. So he intercepts the redirect, fetches the redirect target over HTTPS himself, and returns the content to the user in plaintext HTTP.

From the user’s point of view, he is therefore on an HTTP site the whole time.

However, the Web now has a newer security standard: HTTP Strict Transport Security (HSTS). Once a client has received this header, it accesses the site only over HTTPS for a set period of time.

So when our middleman finds this header, he has to delete it.
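On the proxy side, the two checks described above (spotting an HTTPS redirect and dropping the HSTS header) could be sketched as one header-inspection step. `inspectResponse` is a hypothetical helper, and it assumes `headers` is a plain object with lower-cased names, which is how Node.js exposes the headers of an incoming response.

```javascript
// Hypothetical proxy-side helper. Returns a filtered copy of the headers
// plus the HTTPS redirect target, if there is one.
function inspectResponse(statusCode, headers) {
  var out = {};
  for (var name in headers) {
    if (Object.prototype.hasOwnProperty.call(headers, name) &&
        name !== 'strict-transport-security') {
      out[name] = headers[name];    // keep everything except the HSTS header
    }
  }

  var loc = headers.location;
  var followHttps = null;
  if (statusCode >= 300 && statusCode < 400 &&
      typeof loc === 'string' && /^https:\/\//i.test(loc)) {
    followHttps = loc;              // the proxy should fetch this URL itself
  }
  return { headers: out, followHttps: followHttps };
}

var result = inspectResponse(302, {
  'location': 'https://www.example.com/',
  'strict-transport-security': 'max-age=31536000'
});
```

When `followHttps` is set, the proxy would request that address over HTTPS on its own and relay the body to the client over HTTP instead of passing the redirect through; that fetch-and-relay step is left out of the sketch.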

Of course, typing an address directly is not that common. Most users go through a search engine and click straight through from the first result.

The tragedy is that almost all domestic search engines are HTTP. When the user visits the search page, our XSS is already lurking in it, so no result clicked from there will ever reach the official HTTPS site 🙂

Besides search pages, most site directories such as hao123 have not enabled HTTPS either. Every site that gets its traffic from them therefore faces the risk of being hijacked by a middleman.


0x05 Preventive Measures


Having introduced the attack, let us now look at defensive measures.

Script redirects

In fact, whether by front-end hijacking or back-end filtering, there are still many websites where the attack does not succeed. Take the Jingdong (JD.com) login as an example.

It jumps to an HTTPS address by script. The browser’s location is a very special property: it can be shadowed, but it cannot be overridden. So it is hard to control where the page navigates.

To hijack Jingdong pages, we would have to use a whitelist and treat the site as a special case. But that dramatically increases the cost of the attack.

Obfuscating plaintext

Of course, the Jingdong login script is not hard to deal with, since the URL sits there in the most straightforward plaintext. SSLStrip-style replacement of the https:// text inside scripts can therefore also work, since most scripts take no precautions against it.

But with a slightly more complex script, such as one that concatenates the URL from several strings, that becomes difficult.

Therefore, where security requirements are high, it is worth lightly obfuscating a few important addresses so that the middleman cannot attack them in a generic way. He would have to give the site special treatment, which raises the cost of the attack.
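A minimal sketch of such light obfuscation is shown below. The function name `secureLoginUrl` and the host `passport.example.com` are placeholders; the point is only that the scheme never appears as a literal in the function body, so a blind text replacement finds nothing to rewrite.

```javascript
// Defensive sketch: assemble the scheme at run time instead of writing it
// out as a literal.
function secureLoginUrl() {
  var scheme = ['ht', 'tps'].join('') + ':' + '/' + '/';
  return scheme + 'passport.example.com/login';   // placeholder host
}

var target = secureLoginUrl();
```

A page would then navigate with something like `location.href = secureLoginUrl()`. This does not stop an attacker who studies the site specifically; it only defeats generic tools, which is the cost increase the text describes.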

Use HSTS as much as possible

The HSTS header was mentioned earlier. Once this header has been seen even once, the browser will use only HTTPS to access the site for a long time. So we should enable HSTS wherever possible.

Hijacking in the real world is not always 100% successful; as mentioned above, script redirects are easily missed. Once a user slips through to HTTPS even once, HSTS makes any later downgrade of the site’s pages completely ineffective.
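On the server side, enabling HSTS amounts to adding one response header. The sketch below assumes a Node.js-style response object with a `setHeader` method; `hstsValue` and `addHsts` are hypothetical helper names, and the one-year `max-age` is just an example value.

```javascript
// Build a Strict-Transport-Security header value.
function hstsValue(maxAgeSeconds, includeSubDomains) {
  var value = 'max-age=' + maxAgeSeconds;
  if (includeSubDomains) {
    value += '; includeSubDomains';
  }
  return value;
}

// Hypothetical helper: attach the header to a response object.
function addHsts(res) {
  res.setHeader('Strict-Transport-Security', hstsValue(31536000, true));
}

var example = hstsValue(31536000, true);
```

Browsers honor this header only when it arrives over HTTPS, so it protects users who have reached the real secure site at least once, which matches the "slips through once" scenario above.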


0x06 Attack demo


Because this is front-end hijacking, the demo consists of two files: the front-end code and a back-end script (Node.js).

Related source: github.com/EtherDream/…

Compared with the traffic hijacking demo written earlier, this one is more focused and does not provide additional hijacking vectors (such as DNS).

Testing is easy: simply configure the browser’s proxy settings to simulate HTTP hijacking.

If you prefer, you can also test on a Linux system by forwarding port 80 to the local machine. The principle is the same.

Let’s test it on any site that redirects from HTTP to HTTPS.

Thanks to the front-end script, when we hover over the login hyperlink, the status bar still displays the original URL.

As soon as we click, the XSS hook hidden in the page fires and takes us to the middleman’s fake HTTP login page.

Of course, because of the number of URL parameters, the token in the address bar is hard to spot among them.

Fortunately, Taobao’s login page does not check its own address, so login still succeeds on the downgraded page!

Of course, as mentioned earlier, not all pages can be hijacked successfully.

More and more websites are now paying attention to this, and front-end security checks have emerged as a result. Large-scale, general-purpose hijacking with a single tool will become harder in the future.

But compared with a traditional pure back-end implementation, an approach that combines back end and front end has far more room to maneuver.