Thursday, July 14, 2011

Pointers for User Shared Data

Just a quick note on how to find the kernel-mode and user-mode addresses of the shared user data structure:
x nt!*shared*
8055c6e0 nt!MmSharedUserDataPte = 
805360d8 nt!ExAcquireSharedStarveExclusive = 
8056094c nt!MmTransitionSharedPagesPeak = 
...

dd 8055c6e0  l1
8055c6e0  e100b498

dt -r nt!_MMPTE e100b498
...
+0x000 Flush            : _HARDWARE_PTE
+0x000 Valid            : 0y1
+0x000 Write            : 0y0
+0x000 Owner            : 0y0
+0x000 WriteThrough     : 0y0
+0x000 CacheDisable     : 0y0
+0x000 Accessed         : 0y1
+0x000 Dirty            : 0y0
+0x000 LargePage        : 0y0
+0x000 Global           : 0y1
+0x000 CopyOnWrite      : 0y0
+0x000 Prototype        : 0y0
+0x000 reserved0        : 0y0
+0x000 PageFrameNumber  : 0y00...1000001 (0x41)
+0x000 reserved1        : 0y10...0000000 (0x2000000)
+0x000 LowPart          : 0x41121
+0x004 HighPart         : 0x80000000
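
The bit fields printed by dt can be cross-checked by decoding the LowPart value by hand. The following is only a minimal sketch, assuming the usual x86 hardware PTE layout (flags in the low 12 bits, page frame number above them); the type and function names are made up for illustration and are not part of any Windows header.

```c
#include <stdint.h>

/* Hypothetical helper type: a decoded view of the low 32 bits of an x86 PTE. */
typedef struct {
    unsigned valid;     /* bit 0 */
    unsigned write;     /* bit 1 */
    unsigned owner;     /* bit 2 */
    unsigned accessed;  /* bit 5 */
    unsigned dirty;     /* bit 6 */
    unsigned global;    /* bit 8 */
    uint32_t pfn;       /* bits 12 and up (low part of the frame number) */
} pte_fields;

/* Split the low dword of a PTE into the flags shown by dt nt!_MMPTE. */
static pte_fields decode_pte_low(uint32_t low)
{
    pte_fields f;
    f.valid    = (low >> 0) & 1u;
    f.write    = (low >> 1) & 1u;
    f.owner    = (low >> 2) & 1u;
    f.accessed = (low >> 5) & 1u;
    f.dirty    = (low >> 6) & 1u;
    f.global   = (low >> 8) & 1u;
    f.pfn      = low >> 12;
    return f;
}
```

Calling decode_pte_low(0x41121) yields Valid, Accessed and Global set and a page frame number of 0x41, which agrees with the dump above.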

!pfn 41
PFN 00000041 at address 81AE671C
flink       00000023  blink / share
count 0000003C  pteaddress C07FEF80
reference count 0001   Cached     color 0
restore pte 00000080
containing page        00074F  Active      P
Shared

!pte C07FEF80
  VA ffdf0000
PDE at 00000000C0603FF0    PTE at 00000000C07FEF80
contains 000000000074F163  contains 0000000000041163
pfn 74f        -G-DA--KWEV    pfn 41         -G-DA--KWEV
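
The addresses printed by !pte are consistent with the PAE self-map layout of 32-bit Windows, where PTEs are 8 bytes wide and the PTE array starts at 0xC0000000 (PDEs at 0xC0600000). Under that assumption, the relation between a virtual address and its PTE address can be sketched as follows; the macro and function names are illustrative only.

```c
#include <stdint.h>

#define PTE_BASE_PAE  0xC0000000u  /* assumed start of the PTE self-map (PAE) */
#define PDE_BASE_PAE  0xC0600000u  /* assumed start of the PDE self-map (PAE) */
#define ENTRY_SIZE    8u           /* PAE entries are 64 bits wide */

/* Address of the PTE that maps the given virtual address. */
static uint32_t pte_address_of(uint32_t va)
{
    return PTE_BASE_PAE + (va >> 12) * ENTRY_SIZE;
}

/* Address of the PDE that covers the given virtual address (2 MB per PDE). */
static uint32_t pde_address_of(uint32_t va)
{
    return PDE_BASE_PAE + (va >> 21) * ENTRY_SIZE;
}

/* Inverse of pte_address_of: the page-aligned VA mapped by a PTE address. */
static uint32_t va_of_pte(uint32_t pte_addr)
{
    return ((pte_addr - PTE_BASE_PAE) / ENTRY_SIZE) << 12;
}
```

With these definitions, va_of_pte(0xC07FEF80) gives 0xFFDF0000, and pde_address_of(0xFFDF0000) gives 0xC0603FF0, the same values reported by !pte.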

Now that we have the kernel address, we can dump the shared area with:
dt nt!_kuser_shared_data ffdf0000  (KERNEL MODE ADDR)
+0x000 TickCountLow     : 0x3f7443d
+0x004 TickCountMultiplier : 0xfa00000
+0x008 InterruptTime    : _KSYSTEM_TIME
+0x014 SystemTime       : _KSYSTEM_TIME
+0x020 TimeZoneBias     : _KSYSTEM_TIME
+0x02c ImageNumberLow   : 0x14c
+0x02e ImageNumberHigh  : 0x14c
+0x030 NtSystemRoot     : [260] 0x43
+0x238 MaxStackTraceDepth : 0
+0x23c CryptoExponent   : 0
+0x240 TimeZoneId       : 2
+0x244 Reserved2        : [8] 0
+0x264 NtProductType    : 1 ( NtProductWinNt )
+0x268 ProductTypeIsValid : 0x1 ''
+0x26c NtMajorVersion   : 5
+0x270 NtMinorVersion   : 1
+0x274 ProcessorFeatures : [64]  ""
+0x2b4 Reserved1        : 0x7ffeffff  (USER MODE ADDRESS: 0x7ffe0000)
....

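As you can see, the structure contains a field (Reserved1) holding a user-mode address, 0x7ffeffff, that falls inside the region where the same page is mapped for user mode. To get the base of that mapping we need to align it down to 64 KB, its allocation size (see the vmmap output figure), so: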
?0x7ffeffff& (@@(~(0x10000-1)))
Evaluate expression: 2147414016 = 7ffe0000
Dumping it at this address we get the exact same content as before:
dt nt!_kuser_shared_data 0x7ffe0000 
+0x000 TickCountLow     : 0x3f7443d
+0x004 TickCountMultiplier : 0xfa00000
+0x008 InterruptTime    : _KSYSTEM_TIME
+0x014 SystemTime       : _KSYSTEM_TIME
+0x020 TimeZoneBias     : _KSYSTEM_TIME
+0x02c ImageNumberLow   : 0x14c
+0x02e ImageNumberHigh  : 0x14c
+0x030 NtSystemRoot     : [260] 0x43
+0x238 MaxStackTraceDepth : 0
+0x23c CryptoExponent   : 0
+0x240 TimeZoneId       : 2
+0x244 Reserved2        : [8] 0
+0x264 NtProductType    : 1 ( NtProductWinNt )
+0x268 ProductTypeIsValid : 0x1 ''
+0x26c NtMajorVersion   : 5
+0x270 NtMinorVersion   : 1
+0x274 ProcessorFeatures : [64]  ""
+0x2b4 Reserved1        : 0x7ffeffff
+0x2b8 Reserved3        : 0x80000000
+0x2bc TimeSlip         : 0
+0x2c0 AlternativeArchitecture : 0 ( StandardDesign )
+0x2c8 SystemExpirationDate : _LARGE_INTEGER 0x0
+0x2d0 SuiteMask        : 0x110
+0x2d4 KdDebuggerEnabled : 0 ''
+0x2d5 NXSupportPolicy  : 0x2 ''
+0x2d8 ActiveConsoleId  : 0
+0x2dc DismountCount    : 0
+0x2e0 ComPlusPackage   : 0xffffffff
+0x2e4 LastSystemRITEventTickCount : 0x3df72600
+0x2e8 NumberOfPhysicalPages : 0x3f73c
+0x2ec SafeBootMode     : 0 ''
+0x2f0 TraceLogging     : 0
+0x2f8 TestRetInstruction : 0xc3
+0x300 SystemCall       : 0x7c90e510
+0x304 SystemCallReturn : 0x7c90e514
+0x308 SystemCallPad    : [3] 0
+0x320 TickCount        : _KSYSTEM_TIME
+0x320 TickCountQuad    : 0
+0x330 Cookie           : 0xe066d175
The output from vmmap shows us the allocated size and the structure size of this special area:

Saturday, July 2, 2011

Cracking a Pimp

I just came back from vacation and found the Pimp My Crackme contest. Although I'm too late to take part in the competition, I follow the work of the first-prize authors, so I decided to take a look at their challenge.
What I found was a really interesting and challenging piece of protection software, and I think it is worth mentioning in a post about cracking and RE.
What I'll cover here, though, is just one specific piece of code that blocks the reverser from tracing from within some debuggers, Olly being one of those affected, and not the cracking of the entire puzzle. If you're using WinDbg you won't have to deal with this.

This specific protection relies on a known issue where some debuggers get lost or fail to synchronize when an LDT entry is used to detour the execution path.

The reversed source code that sets up the LDT entry is this:
codeSegment = getCodeSegment(); // CS=0x1B
GetThreadSelectorEntry(GetCurrentThread(),
                    codeSegment,
                    &SelectorEntry);
SelectorEntry.LimitLow = 0xFFEF;
SelectorEntry.HighWord.Bytes.Flags2.LimitHi = 0x7;
pHandle = GetCurrentProcess();
error = setLDT(pHandle,
            (int *)0x7FF,
             &SelectorEntry); // LDT = 0x7F8
...
memcpy(farCallClone, farCall, 0x1Fu);
In the process, we've got this selector before setup:
And this afterwards:
Flags2 in LDT_ENTRY comprises the LimitHi, Sys, Reserved_0, Default_Big, and Granularity bits of the selector entry. Its definition is:
DWORD LimitHi :4;
DWORD Sys :1;
DWORD Reserved_0 :1;
DWORD Default_Big :1;
DWORD Granularity  :1;
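
To see what the new LimitLow and LimitHi values amount to, the effective segment limit can be computed from the descriptor fields. The sketch below assumes the Granularity bit is still set, since the entry was copied from the flat code segment descriptor; that is an assumption, not something shown in the reversed code, and segment_limit is a name invented here.

```c
#include <stdint.h>

/* Effective byte limit of a segment from its descriptor fields.
   limit_hi is the 4-bit LimitHi field, granularity is the G bit. */
static uint32_t segment_limit(uint16_t limit_low, uint8_t limit_hi, int granularity)
{
    uint32_t raw = ((uint32_t)(limit_hi & 0xFu) << 16) | limit_low;
    /* With G set, the 20-bit limit counts 4 KB pages and the low 12 bits are ones. */
    return granularity ? ((raw << 12) | 0xFFFu) : raw;
}
```

Under that assumption, segment_limit(0xFFEF, 0x7, 1) is 0x7FFEFFFF, whereas a flat segment with both fields at their maximum, segment_limit(0xFFFF, 0xF, 1), is 0xFFFFFFFF.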
So they're setting LimitHi = 0x7 for selector entry 0x7FF. What this means for Olly is that it fails to trace across the far jump, and consequently we can't use it to find the correct key. To bypass this "protection" we need to make the target segments of the jumps the same as our startup segment. There are a couple of ways to do this; the simplest one I found for this case is just to patch the instruction at:
009200DD call far 07FF:00000000
to our original execution segment:
009200DD call far 001B:00000000
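
For reference, the two selector values in this patch break down as follows under the standard x86 selector layout (index in bits 3 to 15, table indicator in bit 2, RPL in bits 0 to 1). The struct and function names below are illustrative only.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields packed into a segment selector. */
typedef struct {
    unsigned index;   /* descriptor index within the table */
    unsigned is_ldt;  /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;     /* requested privilege level */
} selector_fields;

/* Split a 16-bit selector into index, table indicator and RPL. */
static selector_fields decode_selector(uint16_t sel)
{
    selector_fields f;
    f.index  = sel >> 3;
    f.is_ldt = (sel >> 2) & 1u;
    f.rpl    = sel & 3u;
    return f;
}

/* Byte offset of the descriptor inside its table (8 bytes per entry). */
static unsigned descriptor_offset(uint16_t sel)
{
    return sel & ~7u;
}
```

decode_selector(0x7FF) gives index 0xFF in the LDT with RPL 3, and descriptor_offset(0x7FF) is 0x7F8, the value noted in the setup code; decode_selector(0x1B) gives index 3 in the GDT with RPL 3.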
Now, the funny part is that they have escalated the problem by using multiple segments. I'm not going to delve much into the cracking process or explain how the protection works, but to explain how the various segments are used, I need to mention that the core of the protection is a virtual machine architecture. This virtual machine has a set of functions that virtualize its operations. Aside from other interesting approaches, these functions were laid out each in its own LDT segment, and shuffled after key validation. So you can imagine how Olly must feel, always jumping around segments.
How can we bypass this? As I pointed out before, we need to keep execution in our home segment, 0x1B. We do this by patching the key function that dispatches all the work of the VM, which I called CallVMInstruction.
Replace this function with the following code:
0040CD95 mov word ptr ss:[ebp-2],1B
0040CD9B jmp short pimp_cra.0040CDC1
0040CD9D nop
0040CD9E nop
0040CD9F nop
0040CDA0 push ebp
0040CDA1 mov ebp,esp
0040CDA3 sub esp,14
0040CDA6 mov eax,[arg.1]
0040CDA9 and al,0F8 
0040CDAB shr eax,1
0040CDAD mov edi,eax
0040CDAF shl eax,2
0040CDB2 add eax,edi 
0040CDB4 lea eax,dword ptr ds:[eax+91FE14] 
0040CDBA mov eax,dword ptr ds:[eax] 
0040CDBC mov dword ptr ss:[ebp-6],eax 
0040CDBF jmp short pimp_cra.0040CD95
0040CDC1 mov edi,dword ptr ds:[AF8EF8]
0040CDC7 push [arg.2]
0040CDCA call far fword ptr ss:[ebp-6]
What this code does is dynamically resolve the function address within our home segment, based on the segment index passed as an argument to the dispatcher and kept in the VM call dispatcher table.
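
The address arithmetic in the patch can be followed more easily in C. The sketch below mirrors the and/shr/shl/add sequence step by step and assumes that the first argument is the LDT selector of the target function; the 20-byte entry size is only what the arithmetic implies, and all names are made up for illustration.

```c
#include <stdint.h>

#define VM_TABLE_BASE 0x0091FE14u  /* constant taken from the lea instruction */

/* Mirror of the patched code: address of the table slot read for a selector. */
static uint32_t vm_table_slot(uint32_t selector)
{
    uint32_t eax = selector & ~7u;  /* and al,0F8 : drop the TI and RPL bits */
    uint32_t edi;
    eax >>= 1;                      /* shr eax,1  : index * 4  */
    edi = eax;                      /* mov edi,eax             */
    eax <<= 2;                      /* shl eax,2  : index * 16 */
    eax += edi;                     /* add eax,edi: index * 20 */
    return eax + VM_TABLE_BASE;     /* lea eax,[eax+91FE14]    */
}

/* Equivalent closed form: one 20-byte entry per descriptor index. */
static uint32_t vm_table_slot_simple(uint32_t selector)
{
    return VM_TABLE_BASE + (selector >> 3) * 20u;
}
```

For selector 0x7FF both functions return 0x921200. The dword read from that slot is stored as the offset of the far pointer at [ebp-6], and the patch forces the selector word at [ebp-2] to 0x1B.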

With this in place you can set a breakpoint at address 0x0040CDCA. Now just relax and wait for the key validation, and you can trace all the virtual machine operations.

Finally, two simple last patches are needed to prepare for the flush and hashing operations. Replace the code at these addresses:
00403046 mov eax,9200C0
0040304B nop
0040304C nop
0040304D nop
0040304E nop
0040304F nop
00403050 nop

00403243 mov eax,9200C0
00403248 nop
If you find any other functions that need to be patched, now you know what to do. Consider it an exercise.

Just a side note about a second problem that I found while debugging the crackme. The authors used a far call stub function for the purpose discussed above, and they rebased this function to address zero. OllyDbg gets totally lost when trying to trace code at a zero base address. The way I found to trace the stub calls successfully was, instead of setting a breakpoint at address zero, to set it on the second instruction, at address 4: lea eax,dword ptr ds:[eax*4+8].
The reversed code for it is presented here again:
...

BaseAddress = 4;
RegionSize = 4092;
pHandle = GetCurrentProcess();
error = NtAllocateVirtualMemory(pHandle,
              &BaseAddress,
              0,
              &RegionSize,
              MEM_TOP_DOWN | MEM_RESERVE | MEM_COMMIT,
              PAGE_EXECUTE_READWRITE);
if ( error >= 0 )
{
 farCallClone = 0;
 memcpy(farCallClone,
        farCall,
        0x1F);
....
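
The reason a base address of 4 ends up giving memory at address zero is that the requested range is rounded out to page boundaries. The following sketch reproduces that rounding under the assumption of 4 KB pages; it is not the kernel's implementation, and the names are illustrative.

```c
#include <stdint.h>

#define PAGE_SIZE_X86 0x1000u  /* assumed page size */

/* Hypothetical helper type: a page-aligned region. */
typedef struct {
    uint32_t base;
    uint32_t size;
} region;

/* Expand [base, base + size) to whole pages: base rounds down, end rounds up. */
static region round_to_pages(uint32_t base, uint32_t size)
{
    region r;
    uint32_t start = base & ~(PAGE_SIZE_X86 - 1u);
    uint32_t end   = (base + size + PAGE_SIZE_X86 - 1u) & ~(PAGE_SIZE_X86 - 1u);
    r.base = start;
    r.size = end - start;
    return r;
}
```

round_to_pages(4, 4092) yields base 0 and size 0x1000, that is, the whole first page, which is why farCallClone can then be set to 0.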

Good luck finding the correct key.