Shifting yourself to space

October 14, 2011

Inside KiSystemService

Preface

I wanted to write this post for over a week and couldn’t find a decent way to start it. KiSystemService is one of the interesting functions I ran into while reversing ntoskrnl.exe, and I have always wanted to understand it from top to bottom. I got myself a copy of Windows Internals, but it did not seem to draw a complete flow of how things go from userland to kernel land and how the actual function gets executed. I hope this post will fill in some blanks for those who are interested in understanding the actual mechanism.

For this post I will use winxp sp2 on vbox, along with IDA, calc.exe, ollydbg and windbg to ease up the reversing. I’m lazy, yes: I can’t be arsed to go looking for symbols and structs by hand when I’ve got both windbg’s dt/dds and IDA’s flowgraph.
For userland debugging I’ll use olly, and for the kernel I’ll use windbg; IDA is only used for the flowgraph.

For this post I chose a random function to follow, which is CloseHandle.
I will not dive into the actual code of what CloseHandle does; I will only explain how things work on the way to its actual execution.
I will try to describe things as best I can, however I’m not a Windows guru, so be aware that I might make some mistakes. Most of the material here is information scattered across several (many) places, and the purpose of this post is to gather most of it in one place.

Introduction

KiSystemService is a kernel function which provides system services. What does that mean, you might ask? And why would I want to know about it?

KiSystemService is a function in kernel land which is triggered after a system service request is made. It is actually the last gateway between the usermode process which wants to make the call (in our case, calc.exe calling CloseHandle) and the actual function in kernel land.

“Wait a second”, someone might say, “but I got CloseHandle inside a dll in usermode!”

That is correct 🙂 but if we look carefully enough, we will see that the function which does the “real thing” is not in userland (i.e. not in kernel32.dll or in ntdll.dll).
ntdll.dll and kernel32.dll only deal with error handling and parameter verification so that the kernel can deal with the real thing, which in our case is closing the specified handle.

Let’s take a quick look at how things go inside calc.exe.
I found where CloseHandle is imported (from kernel32.dll), looked for references to it, and printed out the function which calls it.
It does not really matter what this function does; let’s see what’s going on there.

/*100436C*/  PUSH ESI
/*100436D*/  PUSH DWORD PTR DS:[1014EFC]
/*1004373*/  MOV ESI,DWORD PTR DS:[<&KERNEL32.SetEvent>]
/*1004379*/  MOV DWORD PTR DS:[1014EF8],1
/*1004383*/  CALL ESI
/*1004385*/  PUSH DWORD PTR DS:[1014F00]
/*100438B*/  CALL ESI
/*100438D*/  PUSH 9C40
/*1004392*/  PUSH DWORD PTR DS:[1014F04]
/*1004398*/  CALL DWORD PTR DS:[<&KERNEL32.WaitForSingleObject>]
/*100439E*/  PUSH DWORD PTR DS:[1014EFC]
/*10043A4*/  MOV ESI,DWORD PTR DS:[<&KERNEL32.CloseHandle>]
/*10043AA*/  CALL ESI
/*10043AC*/  PUSH DWORD PTR DS:[1014F00]
/*10043B2*/  CALL ESI
/*10043B4*/  PUSH DWORD PTR DS:[1014F04]
/*10043BA*/  CALL ESI
/*10043BC*/  POP ESI
/*10043BD*/  RETN

Looks quite simple, right? CloseHandle only takes one argument, according to its prototype on msdn:

BOOL WINAPI CloseHandle(
  __in  HANDLE hObject
);

I hope this clears things up a bit.
Now let’s dive into Kernel32.CloseHandle

.text:7C809B77                 mov     edi, edi
.text:7C809B79                 push    ebp
.text:7C809B7A                 mov     ebp, esp
.text:7C809B7C                 mov     eax, large fs:18h ; TEB
.text:7C809B82                 mov     ecx, [eax+30h]  ; TEB->PEB
.text:7C809B85                 mov     eax, [ebp+hObject] ; userparam
.text:7C809B88                 cmp     eax, STD_ERROR_HANDLE
.text:7C809B8B                 jz      std_error_handle_res ; PEB->ProcessParameters->StandardError
.text:7C809B91                 cmp     eax, STD_OUTPUT_HANDLE
.text:7C809B94                 jz      std_output_handle ; PEB->ProcessParameters->StandardOutput
.text:7C809B9A                 cmp     eax, STD_INPUT_HANDLE
.text:7C809B9D                 jz      std_input_handle_res ; PEB->ProcessParameters->StandardInput
.text:7C809BA3
.text:7C809BA3 do_NtClose:                             ; CODE XREF: CloseHandle+1456Ej
.text:7C809BA3                                         ; CloseHandle+14579j ...
.text:7C809BA3                 mov     ecx, eax        ; hObject
.text:7C809BA5                 and     ecx, 10000003h
.text:7C809BAB                 cmp     ecx, 3
.text:7C809BAE                 push    eax
.text:7C809BAF                 jz      loc_7C81D937
.text:7C809BB5                 call    ds:NtClose
.text:7C809BBB                 test    eax, eax
.text:7C809BBD                 jl      loc_7C81E0D2
.text:7C809BC3                 xor     eax, eax
.text:7C809BC5                 inc     eax

Let’s try to understand what is going on there

.text:7C809B7C                 mov     eax, large fs:18h ; TEB

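FS:[18h] holds the address of the Thread Environment Block (TEB); offset 30h of the TEB holds the pointer to the PEB, as the dt output shows: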

.text:7C809B82                 mov     ecx, [eax+30h]  ; TEB->PEB

lkd> dt ntdll!_TEB
   +0x000 NtTib            : _NT_TIB
   +0x01c EnvironmentPointer : Ptr32 Void
   +0x020 ClientId         : _CLIENT_ID
   +0x028 ActiveRpcHandle  : Ptr32 Void
   +0x02c ThreadLocalStoragePointer : Ptr32 Void
   +0x030 ProcessEnvironmentBlock : Ptr32 _PEB
   +0x034 LastErrorValue   : Uint4B
   +0x038 CountOfOwnedCriticalSections : Uint4B
   +0x03c CsrClientThread  : Ptr32 Void
   +0x040 Win32ThreadInfo  : Ptr32 Void
   +0x044 User32Reserved   : [26] Uint4B

CloseHandle first checks whether the handle passed in is one of the three standard pseudo-handles, and prepares for 3 conditions:
std_error_handle_res
std_output_handle
std_input_handle_res

Each of them reads its own value from TEB->PEB->ProcessParameters (StandardError, StandardOutput or StandardInput).
Eventually, whether or not one of the above conditions was hit,

.text:7C809BB5                 call    ds:NtClose

NtClose is called.
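
The checks above can be sketched roughly like this. It is a Python model and not the real kernel32 code: `ProcessParameters` is a hypothetical stand-in class and the handle values in the usage are made up; only the three STD_* constants are the documented Win32 values. What the `10000003h` test is for is my assumption (as far as I know it singles out console handles on XP).

```python
# Documented Win32 pseudo-handle constants, written as 32-bit values.
STD_INPUT_HANDLE = 0xFFFFFFF6   # (DWORD)-10
STD_OUTPUT_HANDLE = 0xFFFFFFF5  # (DWORD)-11
STD_ERROR_HANDLE = 0xFFFFFFF4   # (DWORD)-12


class ProcessParameters:
    """Hypothetical stand-in for PEB->ProcessParameters."""

    def __init__(self, standard_input, standard_output, standard_error):
        self.standard_input = standard_input
        self.standard_output = standard_output
        self.standard_error = standard_error


def resolve_handle(h_object, params):
    """Swap a STD_* pseudo-handle for the real handle kept in the PEB."""
    h_object &= 0xFFFFFFFF  # treat the value as a 32-bit register
    if h_object == STD_ERROR_HANDLE:
        return params.standard_error
    if h_object == STD_OUTPUT_HANDLE:
        return params.standard_output
    if h_object == STD_INPUT_HANDLE:
        return params.standard_input
    return h_object  # ordinary handle: passed on unchanged


def matches_tag_check(h_object):
    """Mirror of `and ecx, 10000003h / cmp ecx, 3` from the listing.

    Assumption: a match means "console handle", handled at loc_7C81D937.
    """
    return (h_object & 0x10000003) == 3


# Made-up handle values, for illustration only.
params = ProcessParameters(standard_input=0x3, standard_output=0x7, standard_error=0xB)
real_handle = resolve_handle(-12, params)
```

Only a handle that fails the last check goes down the plain `call ds:NtClose` path shown in the listing.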

Just a quick note for anyone who is following this post and is not
using winxp: the PEB structure has changed across different versions of Windows.
I tried looking at it from kd on another version and it didn’t look the same, so you have two choices:
either fetch the right .h file, or use a vm and debug it there to get the same results I got.

NtClose is not part of kernel32.dll; it is part of ntdll.dll, which is the core dll that all system service calls eventually go through.

I had a slight problem finding NtClose in ntdll.dll, but eventually realized that I should look for ZwClose. In order not to go beyond the scope of this post: there is a really good post on osronline which explains why Zw and not Nt (or vice versa).

.text:7C95D586 ZwClose         proc near               ; CODE XREF: RtlFormatCurrentUserKeyPath+6Cp
.text:7C95D586                                         ; RtlDosSearchPath_U+23Ap ...
.text:7C95D586                 mov     eax, 19h        ; NtClose
.text:7C95D58B                 mov     edx, 7FFE0300h
.text:7C95D590                 call    dword ptr [edx]
.text:7C95D592                 retn    4
.text:7C95D592 ZwClose         endp

0x7FFE0300 holds a pointer to the system call stub, which is basically a simple 3 instruction code block. For more information about it (there is a short description of why we can’t disassemble 0x7FFE0300 directly) see here, or look at Nynaeve’s post (ironically, I found his post while writing this one, and he also wrote about NtClose, hehe, although he’s talking about int 2eh and we’re dealing with the newer mechanism. The only difference that I know of is that the trap frames differ, nothing more than that. If I lost you here it’s cool, more on this later).

.text:7C95EB8B KiFastSystemCall proc near              ; DATA XREF: .text:off_7C95395Co
.text:7C95EB8B                 mov     edx, esp
.text:7C95EB8D                 sysenter

There are two important things I almost forgot to mention here.
The first is the move of 19h into eax in ZwClose:
this is the service index, which will be quite important for us once we are in kernel land.
The second is the mov edx,esp: it hands the usermode stack pointer to the kernel, so the kernel can find the arguments on the stack (the hObject of CloseHandle) and know what to close.
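
To make the edx part concrete, here is a small Python model of the user stack at the moment `mov edx, esp` runs. The two return addresses are simply the instructions that follow the two calls in the listings above; the handle value is made up. That the kernel finds the first argument 8 bytes above edx (skipping the two return addresses) is my reading of the layout, so treat it as an assumption.

```python
def user_stack_at_sysenter(h_object):
    """Return the user stack as a list of dwords; index 0 models [esp]."""
    stack = []
    stack.insert(0, h_object)    # kernel32: push eax (hObject)
    stack.insert(0, 0x7C809BBB)  # pushed by `call ds:NtClose`
    stack.insert(0, 0x7C95D592)  # pushed by `call dword ptr [edx]` in ZwClose
    return stack


def dword_at(stack, byte_offset):
    """Read the dword at [edx + byte_offset], where edx == esp."""
    return stack[byte_offset // 4]


stack = user_stack_at_sysenter(0x7C)  # 0x7C is a made-up handle value
first_argument = dword_at(stack, 8)   # assumed location of hObject
```

So edx is not the argument itself, only a pointer from which the arguments can be reached.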

Inside sysenter

Phew, wicked, we’ve now got to our first barrier: we passed from calc.exe to kernel32.dll, into ntdll.dll and into SystemCallStub, and we’re about to get into kernel land. Let’s try to summarize it with a short chart (sorry for not owning visio :/ )

calc.exe -> kernel32!CloseHandle -> ntdll!ZwClose -> KiFastSystemCall (through 0x7FFE0300) -> sysenter -> kernel land

Phew 🙂 Finally, we got to the actual sysenter instruction, which transfers control to the kernel.
Before we continue, I would like to quote Intel’s Developer’s Manual on the sysenter instruction, to see what it actually does and how it knows which function to reach:

Executes a fast call to a level 0 system procedure or routine.
SYSENTER is a companion instruction to SYSEXIT.

The instruction is optimized to provide the maximum performance for system calls from user code running at privilege level 3 to operating system or executive procedures running at privilege level 0.

Prior to executing the SYSENTER instruction, software must specify the privilege level 0 code segment and code entry point, and the privilege level 0 stack segment and stack pointer by writing values to the following MSRs:

• IA32_SYSENTER_CS - Contains a 32-bit value, of which the lower 16 bits are the segment selector for the privilege level 0 code segment. This value is also used to compute the segment selector of the privilege level 0 stack segment.
• IA32_SYSENTER_EIP - Contains the 32-bit offset into the privilege level 0 code segment to the first instruction of the selected operating procedure or routine.
• IA32_SYSENTER_ESP - Contains the 32-bit stack pointer for the privilege level 0 stack.

These MSRs can be read from and written to using RDMSR/WRMSR. Register addresses are listed in Table 4-17. The addresses are defined to remain fixed for future Intel 64 and IA-32 processors.
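
What the quote boils down to can be modelled in a few lines of Python. This is only a sketch of the register changes: the MSR numbers are the architectural ones (174h to 176h), while the `Cpu` class and every value stored in the MSRs below are made up for illustration.

```python
from dataclasses import dataclass

# Architectural MSR addresses of the three SYSENTER registers.
IA32_SYSENTER_CS = 0x174
IA32_SYSENTER_ESP = 0x175
IA32_SYSENTER_EIP = 0x176


@dataclass
class Cpu:
    """Hypothetical, heavily reduced register file."""
    cs: int
    ss: int
    eip: int
    esp: int
    cpl: int


def sysenter(cpu, msr):
    """Apply the register changes SYSENTER makes.

    Nothing is pushed anywhere: the user EIP and ESP are simply lost,
    which is why the usermode stub copies esp to edx first.
    """
    cs = msr[IA32_SYSENTER_CS] & 0xFFFC  # selector with RPL forced to 0
    cpu.cs = cs
    cpu.ss = cs + 8                      # stack selector is derived from CS
    cpu.eip = msr[IA32_SYSENTER_EIP]
    cpu.esp = msr[IA32_SYSENTER_ESP]
    cpu.cpl = 0
    return cpu


# Made-up MSR contents and usermode state.
msr = {IA32_SYSENTER_CS: 0x08, IA32_SYSENTER_ESP: 0xF7A00000, IA32_SYSENTER_EIP: 0x80500000}
cpu = sysenter(Cpu(cs=0x1B, ss=0x23, eip=0x7C95EB8D, esp=0x0007FE40, cpl=3), msr)
```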

Traps taps traps taps

So, sysenter reads its information from the MSRs to know where the kernel entry point is (on XP this is KiFastCallEntry, which ends up in the same code as KiSystemService); it also gets the kernel stack from there, and the target is treated as a flat, readable and executable ring 0 code segment.
The MSRs are filled at boot with the appropriate values. A small note about different types of processors is due here: some processors do not support the sysenter opcode, so at boot time Windows detects the processor type and then chooses which mechanism to use,
whether it is int 2eh, syscall, epc (IA-64) or sysenter. nix users will remember the good old int 0x80, or \xcd\x80, from nifty shellcodes (;

Before we dive directly into KiSystemService, some theory is required to understand its internals.
When a service function is called (in our case CloseHandle), the SYSENTER instruction makes the transition to kernel land, and once the kernel has finished executing the function it uses SYSEXIT to get back to the user.
The user code continues its work as if nothing happened, and it does not have to restore any values or anything like that. This might be quite trivial for most of you, as those who code in usermode know that they never needed any special adjustment to the stack, or to save any values around a system call.
The coder (or the program/process/etc) just needs to check the returned value and continue execution according to the result (if there was an error, handle it; if everything worked, continue to the next task, etc etc).
In other words, the kernel takes care of everything so that the usermode process can continue its execution as if the actual function was in ntdll.dll.

How can this be? some might ask. Well, we’re here to find out 😛
There is no magic here: the kernel does so by building a “trap frame” to save
all the important things it needs in order to restore execution back to usermode.

Short note though, before we dive into the trap frame: Windows changes its trap frame from version to version, and the frame also depends on the call type. What does that mean?
It means that if your processor does not support the SYSENTER instruction and int 2eh is used instead, the trap frame generated for int 2eh will be different from the one generated for SYSENTER.
Eventually, however, both paths end up in KiSystemService (not with a call, but with a simple jmp).
That is the only difference that I know of between the different call mechanisms (sysenter, syscall, int 2eh, epc, etc), but I’d be glad to know if there is anything else.

So what exactly is a trap frame? I’ll quote from Windows Internals and hope it gives a satisfying answer.

When a hardware exception or interrupt is generated, the processor records enough machine state on the kernel stack of the thread that’s interrupted so that it can return to that point in the control flow and continue execution as if nothing had happened. If the thread was executing in user mode, Windows switches to the thread’s kernel-mode stack. Windows then creates a trap frame on the kernel stack of the interrupted thread into which it stores the execution state of the thread. The trap frame is a subset of a thread’s complete context, and you can view its definition by typing dt nt!_ktrap_frame in the kernel debugger.

So, the trap frame keeps information about the current thread context so that it can be restored when returning to usermode (for example with the SYSEXIT instruction), that’s it (;
This is winxp’s trap frame from windbg:

lkd> dt nt!_KTRAP_FRAME
   +0x000 DbgEbp           : Uint4B
   +0x004 DbgEip           : Uint4B
   +0x008 DbgArgMark       : Uint4B
   +0x00c DbgArgPointer    : Uint4B
   +0x010 TempSegCs        : Uint4B
   +0x014 TempEsp          : Uint4B
   +0x018 Dr0              : Uint4B
   +0x01c Dr1              : Uint4B
   +0x020 Dr2              : Uint4B
   +0x024 Dr3              : Uint4B
   +0x028 Dr6              : Uint4B
   +0x02c Dr7              : Uint4B
   +0x030 SegGs            : Uint4B
   +0x034 SegEs            : Uint4B
   +0x038 SegDs            : Uint4B
   +0x03c Edx              : Uint4B
   +0x040 Ecx              : Uint4B
   +0x044 Eax              : Uint4B
   +0x048 PreviousPreviousMode : Uint4B
   +0x04c ExceptionList    : Ptr32 _EXCEPTION_REGISTRATION_RECORD
   +0x050 SegFs            : Uint4B
   +0x054 Edi              : Uint4B
   +0x058 Esi              : Uint4B
   +0x05c Ebx              : Uint4B
   +0x060 Ebp              : Uint4B
   +0x064 ErrCode          : Uint4B
   +0x068 Eip              : Uint4B
   +0x06c SegCs            : Uint4B
   +0x070 EFlags           : Uint4B
   +0x074 HardwareEsp      : Uint4B
   +0x078 HardwareSegSs    : Uint4B
   +0x07c V86Es            : Uint4B
   +0x080 V86Ds            : Uint4B
   +0x084 V86Fs            : Uint4B
   +0x088 V86Gs            : Uint4B

hum, woot, now let’s dive into the actual KiSystemService function

KiSystemService is quite a simple function. It is divided into several parts,
and once you understand all of them it’s quite easy to understand the whole concept.
Eventually, the function does two main things:
1. Set up a trap frame to save all the information required to get back to usermode
without any special treatment from the usermode process which takes control.
2. Locate the system service function in the System Service Descriptor Table and call it.

Setting up the trap frame

.text:00407EA6                 push    0               ; ErrCode slot of the frame, no error code here
.text:00407EA8                 push    ebp
.text:00407EA9                 push    ebx
.text:00407EAA                 push    esi
.text:00407EAB                 push    edi
.text:00407EAC                 push    fs              ; save the caller's FS
.text:00407EAE                 mov     ebx, 30h        ; KGDT_R0_PCR
.text:00407EB3                 db      66h             ; operand-size prefix, shown separately by IDA
.text:00407EB3                 mov     fs, bx          ; FS now selects the PCR
.text:00407EB6                 push    dword ptr ds:0FFDFF000h ; save old exception list (PCR[PcExceptionList])
.text:00407EBC                 mov     dword ptr ds:0FFDFF000h, 0FFFFFFFFh ; start a new one: EXCEPTION_CHAIN_END
.text:00407EC6                 mov     esi, ds:0FFDFF124h ; get current thread address from PCR[PcPrcbData + PbCurrentThread ]
.text:00407EC6                                         ;
.text:00407EC6                                         ; PcPrcbData and PbCurrentThread are constant values
.text:00407ECC                 push    dword ptr [esi+140h] ; save the thread's old PreviousMode
.text:00407ED2                 sub     esp, 48h        ; reserve the rest of the trap frame
.text:00407ED5                 mov     ebx, [esp+68h+arg_0] ; saved SegCs of the caller
.text:00407ED9                 and     ebx, 1          ; MODE_MASK: 0 = kernel, 1 = user
.text:00407EDC                 mov     [esi+140h], bl  ; new PreviousMode of the thread
.text:00407EE2                 mov     ebp, esp        ; ebp = trap frame
.text:00407EE4                 mov     ebx, [esi+134h] ; Save the current trap frame addr
.text:00407EEA                 mov     [ebp+3Ch], ebx  ; kept in the Edx slot of the new frame
.text:00407EED                 mov     [esi+134h], ebp ; thread now points to the new trap frame
.text:00407EF3                 cld                     ; Clear Direction Flag
.text:00407EF4                 mov     ebx, [ebp+60h]  ; saved Ebp
.text:00407EF7                 mov     edi, [ebp+68h]  ; saved Eip
.text:00407EFA                 mov     [ebp+0Ch], edx  ; DbgArgPointer = pointer to the caller's arguments
.text:00407EFD                 mov     dword ptr [ebp+8], 0BADB0D00h ; DbgArgMark
.text:00407F04                 mov     [ebp+0], ebx    ; DbgEbp
.text:00407F07                 mov     [ebp+4], edi    ; DbgEip
.text:00407F0A                 test    byte ptr [esi+2Ch], 0FFh ; thread's DebugActive flag
.text:00407F0E                 jnz     nt_Dr_kss_a     ; taken when the flag is non-zero (debug registers in use)

The first few lines only push registers in order to save them. The 0 is pushed first to fill the ErrCode slot of the frame (it lands at offset 0x64); please do not hate me, I do not know the whole frame by heart.
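
One way to convince yourself which push fills which field is to replay the pushes against the offsets from the dt nt!_KTRAP_FRAME output. The Python sketch below does just that. The offsets are copied from the dump above; mapping the seventh and eighth pushes to ExceptionList and PreviousPreviousMode is my own reading (it is what the offsets suggest), so treat it as an assumption.

```python
# Offsets copied from the dt nt!_KTRAP_FRAME output.
KTRAP_FRAME_OFFSETS = {
    "PreviousPreviousMode": 0x48,
    "ExceptionList": 0x4C,
    "SegFs": 0x50,
    "Edi": 0x54,
    "Esi": 0x58,
    "Ebx": 0x5C,
    "Ebp": 0x60,
    "ErrCode": 0x64,
}

# The eight pushes of the listing, first push first.
PUSH_ORDER = [
    "ErrCode",               # push 0
    "Ebp",                   # push ebp
    "Ebx",                   # push ebx
    "Esi",                   # push esi
    "Edi",                   # push edi
    "SegFs",                 # push fs
    "ExceptionList",         # push dword ptr ds:0FFDFF000h
    "PreviousPreviousMode",  # push dword ptr [esi+140h]
]


def offsets_from_pushes(push_order, reserved_below=0x48):
    """Compute each pushed value's offset from the final esp.

    Every push lands 4 bytes lower than the previous one, and
    `sub esp, 48h` then reserves the bottom of the frame.
    """
    offset = reserved_below + 4 * len(push_order)
    offsets = {}
    for name in push_order:
        offset -= 4
        offsets[name] = offset
    return offsets


computed = offsets_from_pushes(PUSH_ORDER)
```

With these inputs the computed offsets line up with the structure, which also shows that everything from Eip (0x68) upwards must already have been on the stack before the `push 0`.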

esi plays a key role here: it is loaded through the PCR (Processor Control Region) and points to the current thread.
By looking at the correct offsets (and with some great help from WRK), it is possible to
understand most of the code.
One important thing to note here is that the PreviousMode is saved, in order to tell whether we came from usermode or kernelmode;
this is quite important to know, so we’ll know how to get the parameters.

kd> dt ntkrnlpa!_KPCR
   +0x000 NtTib            : _NT_TIB
   +0x01c SelfPcr          : Ptr32 _KPCR
   +0x020 Prcb             : Ptr32 _KPRCB
   +0x024 Irql             : UChar
   +0x028 IRR              : Uint4B
   +0x02c IrrActive        : Uint4B
   +0x030 IDR              : Uint4B
   +0x034 KdVersionBlock   : Ptr32 Void
   +0x038 IDT              : Ptr32 _KIDTENTRY
   +0x03c GDT              : Ptr32 _KGDTENTRY
   +0x040 TSS              : Ptr32 _KTSS
   +0x044 MajorVersion     : Uint2B
   +0x046 MinorVersion     : Uint2B
   +0x048 SetMember        : Uint4B
   +0x04c StallScaleFactor : Uint4B
   +0x050 DebugActive      : UChar
   +0x051 Number           : UChar
   +0x052 Spare0           : UChar
   +0x053 SecondLevelCacheAssociativity : UChar
   +0x054 VdmAlert         : Uint4B
   +0x058 KernelReserved   : [14] Uint4B
   +0x090 SecondLevelCacheSize : Uint4B
   +0x094 HalReserved      : [16] Uint4B
   +0x0d4 InterruptMode    : Uint4B
   +0x0d8 Spare1           : UChar
   +0x0dc KernelReserved2  : [17] Uint4B
   +0x120 PrcbData         : _KPRCB

 dt ntkrnlpa!_KPRCB
   +0x000 MinorVersion     : Uint2B
   +0x002 MajorVersion     : Uint2B
   +0x004 CurrentThread    : Ptr32 _KTHREAD
   +0x008 NextThread       : Ptr32 _KTHREAD
   +0x00c IdleThread       : Ptr32 _KTHREAD
   +0x010 Number           : Char
   +0x011 Reserved         : Char
   +0x012 BuildType        : Uint2B
   +0x014 SetMember        : Uint4B
   +0x018 CpuType          : Char
   +0x019 CpuID            : Char
   +0x01a CpuStep          : Uint2B
   +0x01c ProcessorState   : _KPROCESSOR_STATE
   +0x33c KernelReserved   : [16] Uint4B
   +0x37c HalReserved      : [16] Uint4B
   +0x3bc PrcbPad0         : [92] UChar
   +0x418 LockQueue        : [16] _KSPIN_LOCK_QUEUE
   +0x498 PrcbPad1         : [8] UChar
   +0x4a0 NpxThread        : Ptr32 _KTHREAD
   +0x4a4 InterruptCount   : Uint4B
   +0x4a8 KernelTime       : Uint4B
   +0x4ac UserTime         : Uint4B
   +0x4b0 DpcTime          : Uint4B
   +0x4b4 DebugDpcTime     : Uint4B
   +0x4b8 InterruptTime    : Uint4B
   +0x4bc AdjustDpcThreshold : Uint4B
   +0x4c0 PageColor        : Uint4B
   +0x4c4 SkipTick         : Uint4B
   +0x4c8 MultiThreadSetBusy : UChar
   +0x4c9 Spare2           : [3] UChar
   +0x4cc ParentNode       : Ptr32 _KNODE
   +0x4d0 MultiThreadProcessorSet : Uint4B
   +0x4d4 MultiThreadSetMaster : Ptr32 _KPRCB
   +0x4d8 ThreadStartCount : [2] Uint4B
   +0x4e0 CcFastReadNoWait : Uint4B
   +0x4e4 CcFastReadWait   : Uint4B
   +0x4e8 CcFastReadNotPossible : Uint4B
   +0x4ec CcCopyReadNoWait : Uint4B
   +0x4f0 CcCopyReadWait   : Uint4B
   +0x4f4 CcCopyReadNoWaitMiss : Uint4B
   +0x4f8 KeAlignmentFixupCount : Uint4B
   +0x4fc KeContextSwitches : Uint4B
   +0x500 KeDcacheFlushCount : Uint4B
   +0x504 KeExceptionDispatchCount : Uint4B
   +0x508 KeFirstLevelTbFills : Uint4B
   +0x50c KeFloatingEmulationCount : Uint4B
   +0x510 KeIcacheFlushCount : Uint4B
   +0x514 KeSecondLevelTbFills : Uint4B
   +0x518 KeSystemCalls    : Uint4B
   +0x51c SpareCounter0    : [1] Uint4B
   +0x520 PPLookasideList  : [16] _PP_LOOKASIDE_LIST
   +0x5a0 PPNPagedLookasideList : [32] _PP_LOOKASIDE_LIST
   +0x6a0 PPPagedLookasideList : [32] _PP_LOOKASIDE_LIST
   +0x7a0 PacketBarrier    : Uint4B
   +0x7a4 ReverseStall     : Uint4B
   +0x7a8 IpiFrame         : Ptr32 Void
   +0x7ac PrcbPad2         : [52] UChar
   +0x7e0 CurrentPacket    : [3] Ptr32 Void
   +0x7ec TargetSet        : Uint4B
   +0x7f0 WorkerRoutine    : Ptr32     void 
   +0x7f4 IpiFrozen        : Uint4B
   +0x7f8 PrcbPad3         : [40] UChar
   +0x820 RequestSummary   : Uint4B
   +0x824 SignalDone       : Ptr32 _KPRCB
   +0x828 PrcbPad4         : [56] UChar
   +0x860 DpcListHead      : _LIST_ENTRY
   +0x868 DpcStack         : Ptr32 Void
   +0x86c DpcCount         : Uint4B
   +0x870 DpcQueueDepth    : Uint4B
   +0x874 DpcRoutineActive : Uint4B
   +0x878 DpcInterruptRequested : Uint4B
   +0x87c DpcLastCount     : Uint4B
   +0x880 DpcRequestRate   : Uint4B
   +0x884 MaximumDpcQueueDepth : Uint4B
   +0x888 MinimumDpcRate   : Uint4B
   +0x88c QuantumEnd       : Uint4B
   +0x890 PrcbPad5         : [16] UChar
   +0x8a0 DpcLock          : Uint4B
   +0x8a4 PrcbPad6         : [28] UChar
   +0x8c0 CallDpc          : _KDPC
   +0x8e0 ChainedInterruptList : Ptr32 Void
   +0x8e4 LookasideIrpFloat : Int4B
   +0x8e8 SpareFields0     : [6] Uint4B
   +0x900 VendorString     : [13] UChar
   +0x90d InitialApicId    : UChar
   +0x90e LogicalProcessorsPerPhysicalProcessor : UChar
   +0x910 MHz              : Uint4B
   +0x914 FeatureBits      : Uint4B
   +0x918 UpdateSignature  : _LARGE_INTEGER
   +0x920 NpxSaveArea      : _FX_SAVE_AREA
   +0xb30 PowerState       : _PROCESSOR_POWER_STATE

Now that we’ve finished setting up the frame, we can get to the actual calling mechanism

.text:00408000 nt_KiFastCallEntry_0x8d:                ; CODE XREF: KiSystemService?-13Cj
.text:00408000                                         ; KiSystemService?+6Fj
.text:00408000                 mov     edi, eax        ; jmp from set_ints
.text:00408000                                         ; eax = service number
.text:00408000                                         ; edx = stack caller
.text:00408000                                         ; esi = current thread
.text:00408002                 shr     edi, 8          ; bits 12-13 of the service number select
.text:00408002                                         ; the table: shadow or regular
.text:00408005                 and     edi, 30h        ; edi = descriptor offset: 0 (regular), 10h (shadow)
.text:00408008                 mov     ecx, edi        ; ecx keeps that offset for the shadow check below
.text:0040800A                 add     edi, [esi+0E0h] ; edi = descriptor inside the thread's service table
.text:00408010                 mov     ebx, eax
.text:00408012                 and     eax, 0FFFh      ; eax = index of the service inside that table

This is the prologue for all the juicy part. eax holds the service number (in our case, IIRC, 0x19); it gets copied into edi, which is free at this point. As you probably remember,
esi points to our current thread, which we got through the _KPCR struct.

.text:0040800A                 add     edi, [esi+0E0h] ; Add

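At 0040800A we actually compute the address of the service descriptor to use: the offset in edi (0 for the regular table, 10h for the shadow one) is added to the thread’s table pointer at [esi+0E0h].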
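
The bit fiddling is easier to see outside of assembly. Here is a minimal Python sketch of the same decoding; the function name and the 0x1019 sample value are mine, and 0x19 is the NtClose index from ZwClose.

```python
def decode_service_number(eax):
    """Split a service number the way the listing does.

    Returns (descriptor_offset, index):
      descriptor_offset  0x00 for the regular table, 0x10 for the shadow one
      index              position of the service inside the chosen table
    """
    descriptor_offset = (eax >> 8) & 0x30  # shr edi, 8 / and edi, 30h
    index = eax & 0xFFF                    # and eax, 0FFFh
    return descriptor_offset, index


regular = decode_service_number(0x19)    # NtClose
shadow = decode_service_number(0x1019)   # made-up number with bit 12 set
```

Each descriptor is 10h bytes long, which is why a plain "table number" of 0 or 1 shows up here as 0 or 10h.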

.text:00408020                 cmp     ecx, 10h        ; Are we going to access the Shadow table
.text:00408020                                         ; or the "regular "table ?
.text:00408023                 jnz     short nt_KiFastCallEntry_0xcc ; Jump if Not Zero (ZF=0)
.text:00408025                 mov     ecx, ds:0FFDFF018h
.text:0040802B                 xor     ebx, ebx        ; Logical Exclusive OR
.text:0040802D
.text:0040802D loc_40802D:                             ; DATA XREF: .text:0040B038o
.text:0040802D                 or      ebx, [ecx+0F70h] ; Logical Inclusive OR
.text:00408033                 jz      short nt_KiFastCallEntry_0xcc ; Jump if Zero (ZF=1)

We then continue to check whether we should access the Shadow Table or the “Regular” SSDT table (I do not know what to call it actually, sorry).

CloseHandle isn’t a GDI service function, and therefore it is not part of the Shadow SSDT, so we’ll jmp here and continue:

.text:0040803F nt_KiFastCallEntry_0xcc:                ; CODE XREF: KiSystemService?+17Dj
.text:0040803F                                         ; KiSystemService?+18Dj
.text:0040803F                 inc     dword ptr ds:0FFDFF638h ; KeSystemCalls counter in the PRCB
.text:00408045                 mov     esi, edx        ; esi now points to user arguments
.text:00408047                 mov     ebx, [edi+0Ch]  ; args table address
.text:0040804A                 xor     ecx, ecx        ; Logical Exclusive OR
.text:0040804C                 mov     cl, [eax+ebx]   ; argument size in bytes
.text:0040804F                 mov     edi, [edi]      ; base of the service routine table
.text:00408051                 mov     ebx, [edi+eax*4] ; ebx points to the actual service routine,
.text:00408051                                         ; in our case NtClose
.text:00408051                                         ; finally :P:P
.text:00408054                 sub     esp, ecx        ; make room for the arguments on the kernel stack
.text:00408056                 shr     ecx, 2          ; bytes -> number of dwords to copy
.text:00408059                 mov     edi, esp        ; destination of the copy
.text:0040805B                 cmp     esi, ds:MmUserProbeAddress ; Are our args in kernel address
.text:0040805B                                         ; space or user address space ?
.text:00408061                 jnb     loc_408210      ; >= : kernel
.text:00408061                                         ; <  : user

At this point the address of the actual service function has been calculated, and we are also
checking where the arguments come from: esi (the caller’s argument pointer) is compared with MmUserProbeAddress.
In the usual case, coming from usermode, the arguments sit on the user stack below that address,
and therefore we will not take this jmp.
I just gotta give a small note here: upon analysis I got quite confused by this jmp,
I had a few cases when I did take it and some cases when I was left in the dark,
so I might be mistaken here.
Either way, as far as I can tell from WRK, the jmp leads to a check of the saved mode, user or kernel:
a usermode caller gets ACCESS_VIOLATION in eax, since it passed a pointer into kernel space,
while a kernel-mode caller is allowed to go on to the copy below.
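
Here is how I picture that check, as a short Python sketch. It rests on two assumptions that are not in the listing: the MmUserProbeAddress value (0x7FFF0000 is what I would expect on 32-bit XP without /3GB), and the behaviour of the code at loc_408210, which is my reading of WRK (kernel-mode callers continue, usermode callers get the error).

```python
MM_USER_PROBE_ADDRESS = 0x7FFF0000    # assumed value, not taken from the listing
STATUS_ACCESS_VIOLATION = 0xC0000005  # documented NTSTATUS value


def check_argument_pointer(arg_pointer, came_from_user_mode):
    """Return None when the argument copy may proceed, else an NTSTATUS.

    Models `cmp esi, ds:MmUserProbeAddress / jnb loc_408210` plus the
    assumed behaviour of the code at loc_408210.
    """
    if arg_pointer < MM_USER_PROBE_ADDRESS:
        return None                     # jnb not taken: plain user stack
    if came_from_user_mode:
        return STATUS_ACCESS_VIOLATION  # user code handed us a kernel pointer
    return None                         # kernel caller: its stack is in kernel space


# Made-up pointers, for illustration only.
normal_call = check_argument_pointer(0x0007FE48, came_from_user_mode=True)
bad_call = check_argument_pointer(0x80501234, came_from_user_mode=True)
kernel_call = check_argument_pointer(0xF7A3B000, came_from_user_mode=False)
```

This would also explain why the jmp is sometimes taken and sometimes not: calls made from inside the kernel always pass a kernel stack pointer.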

iddqdkssdoit:                           ; CODE XREF: KiSystemService?+36Ej
.text:00408067                                         ; DATA XREF: .text:0040B02Eo
.text:00408067                 rep movsd               ; copy ecx dwords of arguments from esi to edi
.text:00408069                 call    ebx             ; call the service routine

That’s it.
All the args are copied to the top of the kernel stack (in our case only one argument), and ebx
contains the address of the actual service function. If all goes well the
function will return and the frame will be restored,
followed by a SYSEXIT or an IRET instruction, depending on how the kernel was entered and,
among other things, on whether you’re debugging it or not.
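
Putting the last two listings together, the lookup and the call can be modelled like this. It is a toy Python version: `ServiceDescriptor`, `fake_nt_close` and `unused_service` are hypothetical names, the tables are tiny made-up ones, and the only thing taken from the post is the logic (argument size from the args table, routine from the service table, copy, call).

```python
class ServiceDescriptor:
    """Hypothetical stand-in for one service descriptor."""

    def __init__(self, routines, argument_bytes):
        self.routines = routines              # [edi]: table of service routines
        self.argument_bytes = argument_bytes  # [edi+0Ch]: argument bytes per service


def dispatch(descriptor, index, user_stack_dwords):
    """Copy the arguments and call the routine, as the listing does."""
    byte_count = descriptor.argument_bytes[index]        # mov cl, [eax+ebx]
    dword_count = byte_count >> 2                        # shr ecx, 2
    kernel_copy = list(user_stack_dwords[:dword_count])  # rep movsd
    routine = descriptor.routines[index]                 # mov ebx, [edi+eax*4]
    return routine(*kernel_copy)                         # call ebx


closed_handles = []


def fake_nt_close(handle):
    """Pretend service routine: remembers the handle, returns STATUS_SUCCESS."""
    closed_handles.append(handle)
    return 0


def unused_service(*args):
    """Filler for the other table slots (STATUS_INVALID_SYSTEM_SERVICE)."""
    return 0xC000001C


# Service 19h takes 4 bytes of arguments; everything else is filler.
descriptor = ServiceDescriptor(
    routines=[unused_service] * 0x19 + [fake_nt_close],
    argument_bytes=[0] * 0x19 + [4],
)
status = dispatch(descriptor, 0x19, [0x7C, 0x12345678])  # made-up stack contents
```

Note that only one dword is copied even though the fake user stack holds two: the argument size comes from the table, not from the caller.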

I hope you have enjoyed this or got to learn a few things. I know I have a few inaccuracies in here, but it was a hella fun journey.
