Challenge 3

Uncompress the archive

Download the file:

Let's uncompress the archive:

$ 7z x 

7-Zip [64] 9.20
p7zip Version 9.20

Processing archive:

Extracting  such_evil
Enter password (will not be echoed) : malware

Everything is Ok

Size:       7168
Compressed: 2713

We have an executable named "such_evil":

$ file such_evil
such_evil: PE32 executable (console) Intel 80386 (stripped to external PDB), for MS Windows
$ md5sum such_evil
f015b845c2f85cd23271bc0babf2e963  such_evil

Running the executable

When we execute it (you can rename it so that it has the *.exe extension), we can see the following message:

If we also inspect the process memory with ProcessHacker, we notice several interesting strings, one of which looks like the email address we're looking for:

If you choose to perform a dynamic analysis and see the "BrokenByte" message, it means that you have gone too far and will need to restart the program.

And so it begins

Shellcode identification

Let's disassemble the code in IDA Pro:

.text:004024EA                 call    _controlfp
.text:004024EF                 add     esp, 8
.text:004024F2                 mov     eax, 1
.text:004024F7                 push    eax
.text:004024F8                 call    __set_app_type
.text:004024FD                 add     esp, 4
.text:00402500                 lea     eax, [ebp+var_2C]
.text:00402503                 push    eax
.text:00402504                 mov     eax, 0
.text:00402509                 push    eax
.text:0040250A                 lea     eax, [ebp+var_24]
.text:0040250D                 push    eax
.text:0040250E                 lea     eax, [ebp+var_20]
.text:00402511                 push    eax
.text:00402512                 lea     eax, [ebp+var_1C]
.text:00402515                 push    eax
.text:00402516                 call    __getmainargs
.text:0040251B                 add     esp, 14h
.text:0040251E                 mov     eax, [ebp+var_24]
.text:00402521                 push    eax
.text:00402522                 mov     eax, [ebp+var_20]
.text:00402525                 push    eax
.text:00402526                 mov     eax, [ebp+var_1C]
.text:00402529                 push    eax
.text:0040252A                 call    sub_401000
.text:0040252F                 add     esp, 0Ch
.text:00402532                 mov     [ebp+Code], eax
.text:00402535                 mov     eax, [ebp+Code]
.text:00402538                 push    eax             ; Code
.text:00402539                 call    exit
.text:00402539 start           endp

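We see an interesting call to sub_401000, whose code is presented below (the listing is abridged: the function repeats the same mov pattern to write the shellcode onto the stack one byte at a time, then calls it):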

.text:00401000 ; Attributes: bp-based frame
.text:00401000 sub_401000      proc near
.text:00401000 var_201         = byte ptr -201h
.text:00401000 var_200         = byte ptr -200h
.text:00401000 var_1FF         = byte ptr -1FFh
.text:00401000 var_1FE         = byte ptr -1FEh
.text:00401000 var_3           = byte ptr -3
.text:00401000 var_2           = byte ptr -2
.text:00401000 var_1           = byte ptr -1
.text:00401000                 push    ebp
.text:00401001                 mov     ebp, esp
.text:00401003                 sub     esp, 204h
.text:00401009                 nop
.text:0040100A                 mov     eax, 0E8h
.text:0040100F                 mov     [ebp+var_201], al
.text:00401015                 mov     eax, 0
.text:0040101A                 mov     [ebp+var_200], al
.text:00401020                 mov     eax, 0
.text:00401025                 mov     [ebp+var_1FF], al
.text:0040102B                 mov     eax, 0
.text:00401030                 mov     [ebp+var_1FE], al
.text:00401036                 mov     eax, 0
.text:0040247D                 mov     eax, 95h
.text:00402482                 mov     [ebp+var_3], al
.text:00402485                 mov     eax, 0C9h
.text:0040248A                 mov     [ebp+var_2], al
.text:0040248D                 mov     eax, 0
.text:00402492                 mov     [ebp+var_1], al
.text:00402495                 lea     eax, [ebp+var_201]
.text:0040249B                 call    eax
.text:0040249D                 mov     eax, 0
.text:004024A2                 jmp     $+5
.text:004024A7                 leave
.text:004024A8                 retn
.text:004024A8 sub_401000      endp
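
Because every byte is written with the same "mov eax, imm / mov [ebp+var_N], al" pair, the stack buffer could in principle also be rebuilt statically from an exported text listing. The following is a minimal sketch in standalone Python 3, assuming the listing uses exactly the format shown above; the names extract_stack_bytes and to_shellcode and both regular expressions are illustrative assumptions, not part of IDA or any other tool.

```python
import re

# "mov eax, 0E8h" or "mov eax, 0" (IDA appends 'h' to hexadecimal immediates).
MOV_IMM = re.compile(r"mov\s+eax,\s*([0-9A-Fa-f]+)(h?)\s*$")
# "mov [ebp+var_201], al" (the variable name encodes the offset below EBP).
MOV_STORE = re.compile(r"mov\s+\[ebp\+var_([0-9A-Fa-f]+)\],\s*al\s*$")

def extract_stack_bytes(lines):
    """Return {offset_below_ebp: byte} for each immediate/store pair."""
    stored = {}
    pending = None
    for line in lines:
        m = MOV_IMM.search(line)
        if m:
            digits, suffix = m.groups()
            pending = int(digits, 16 if suffix else 10) & 0xFF
            continue
        m = MOV_STORE.search(line)
        if m and pending is not None:
            stored[int(m.group(1), 16)] = pending
            pending = None
    return stored

def to_shellcode(stored):
    """Order bytes from the lowest address (largest offset) upwards.

    Assumes the offsets are contiguous, i.e. no store was missed.
    """
    return bytes(stored[off] for off in sorted(stored, reverse=True))

# First two pairs of the listing above, used as a small demonstration.
listing = [
    ".text:0040100A                 mov     eax, 0E8h",
    ".text:0040100F                 mov     [ebp+var_201], al",
    ".text:00401015                 mov     eax, 0",
    ".text:0040101A                 mov     [ebp+var_200], al",
]
shellcode = to_shellcode(extract_stack_bytes(listing))
```

Dumping the stack from a debugger, as done in this walkthrough, gives the same bytes with less effort; the sketch is only meant to make the byte-by-byte construction explicit.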

Let's open the executable in OllyDbg and put a breakpoint on the "call eax" instruction at address 0x0040249B. We'll run the program until it hits this breakpoint and dump the stack memory to a file: right click on the EAX register and select "Follow in dump", then right click in the memory dump window and select "Backup > Save data to file":

Open the resulting file ("_0012E000.mem") in IDA Pro, set the loading offset to "0012E000":

Go to offset "0x0012FD7F", right click and select "Code". The figure below depicts the 2 blocks: the code and the data to decrypt:

Here is what the code looks like (Note: I've used shellcode2exe to process the hex dump exported from OllyDbg):

At this stage, we can either continue with the static analysis and write several Python scripts for IDA to decrypt the code, or perform a dynamic analysis in OllyDbg. Throughout this tutorial, we'll describe both approaches.

UnXOR with static analysis

We have already identified where the XOR'ed stub begins (address 0x0012FDA0), the size of the stub (0x1DF) as well as the XOR key (0x66). Let's write a Python script that we will load into IDA Pro:

loc = 0x12FDA0
for i in range(0x1DF):
    b = Byte(loc+i)
    decoded_byte = b ^ 0x66
    PatchByte(loc+i, decoded_byte)
    SetColor(loc+i, CIC_ITEM, 0xffff00)

To run the script, go to "File > Script file...". Here is the decoded text:
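
The same single-byte XOR can also be applied outside IDA, directly to a byte buffer. The sketch below is standalone Python 3 and makes a few assumptions: the helper name xor_single_byte is made up for illustration, and the demonstration runs on synthetic data rather than on the real dump.

```python
def xor_single_byte(data, start, size, key):
    """Return a copy of data with data[start:start+size] XORed with key."""
    out = bytearray(data)
    for i in range(start, start + size):
        out[i] ^= key
    return bytes(out)

# Values taken from the walkthrough; BASE is the load address set in IDA.
BASE = 0x12E000
STUB = 0x12FDA0
SIZE = 0x1DF
KEY = 0x66

# Synthetic demonstration: encode a known string, then decode it again.
plain = b"and so it begins"
encoded = bytes(b ^ KEY for b in plain)
decoded = xor_single_byte(encoded, 0, len(encoded), KEY)
```

Assuming the dump file really starts at 0x12E000, the stub would sit at file offset STUB - BASE, so the call on the real data would be xor_single_byte(dump, STUB - BASE, SIZE, KEY).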

UnXOR with dynamic analysis

Let's put a breakpoint at the end of the XOR loop to decode the string. Here is what we see: "and so it begins".

get ready to get nop'ed so damn hard in the paint

Static Analysis

Once the previous decryption routine has run, it has decrypted the "and so it begins" string as well as new code that we can now analyze (it jumps to 0x12FDB0). Let's analyze the new decryption routine:

seg000:0012FDB0                         loc_12FDB0:                             ; CODE XREF: seg000:loc_12FD9B↑j
seg000:0012FDB0 68 75 73 00 00                          push    7375h           ; 'us'
seg000:0012FDB5 68 73 61 75 72                          push    72756173h       ; 'saur'
seg000:0012FDBA 68 6E 6F 70 61                          push    61706F6Eh       ; 'nopa'
seg000:0012FDBF 89 E3                                   mov     ebx, esp        ; EBX = 'nopasaurus'
seg000:0012FDC1 E8 00 00 00 00                          call    $+5
seg000:0012FDC6 8B 34 24                                mov     esi, [esp]      ; ESI = 0x12FDC6
seg000:0012FDC9 83 C6 2D                                add     esi, 2Dh        ; ESI = 0x12FDF3
seg000:0012FDCC 89 F1                                   mov     ecx, esi        ; ECX = 0x12FDF3
seg000:0012FDCE 81 C1 8C 01 00 00                       add     ecx, 18Ch       ; ECX = 0x12FF7F
seg000:0012FDD4 89 D8                                   mov     eax, ebx        ; EAX = 'nopasaurus'
seg000:0012FDD6 83 C0 0A                                add     eax, 0Ah        ; Length of 'nopasaurus' string
seg000:0012FDD9                         loc_12FDD9:                             ; CODE XREF: seg000:0012FDEC↓j
seg000:0012FDD9 39 D8                                   cmp     eax, ebx
seg000:0012FDDB 75 05                                   jnz     short loc_12FDE2
seg000:0012FDDD 89 E3                                   mov     ebx, esp
seg000:0012FDDF 83 C3 04                                add     ebx, 4
seg000:0012FDE2                         loc_12FDE2:                             ; CODE XREF: seg000:0012FDDB↑j
seg000:0012FDE2 39 CE                                   cmp     esi, ecx
seg000:0012FDE4 74 08                                   jz      short loc_12FDEE
seg000:0012FDE6 8A 13                                   mov     dl, [ebx]
seg000:0012FDE8 30 16                                   xor     [esi], dl       ; XOR each byte with characters of 'nopasaurus' key
seg000:0012FDEA 43                                      inc     ebx
seg000:0012FDEB 46                                      inc     esi
seg000:0012FDEC EB EB                                   jmp     short loc_12FDD9

We can see that the string "nopasaurus" is used as a repeating key: each byte starting from 0x12FDF3 is XOR'ed with the next character of the key, wrapping back to the first character once the end of the key is reached. Based on the analysis of this new decryption routine, we can write the following Python code:

loc_start = 0x12FDF3
loc_end = 0x12FF7F
k = "nopasaurus"

c = 0
for loc in range(loc_start, loc_end):
    b = Byte(loc)
    x = k[c % (len(k))] # cycle thru the characters in the key
    PatchByte(loc, b ^ ord(x))
    SetColor(loc, CIC_ITEM, 0xE4FFD4)
    c += 1 # move on to the next character of the key
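
For readers who prefer to work on a raw dump instead of the IDA database, the repeating-key XOR can be sketched in a few lines of standalone Python 3. The function name xor_repeating_key is an assumption made for this example, and the data used below is synthetic.

```python
from itertools import cycle

def xor_repeating_key(data, key):
    """XOR every byte of data with the key, restarting the key when it ends."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

KEY = b"nopasaurus"

# XORing zero bytes exposes the key stream, which shows the wrap-around.
key_stream = xor_repeating_key(b"\x00" * 12, KEY)

# XOR is its own inverse, so applying the function twice restores the input.
message = b"get ready to get nop'ed so damn hard in the paint"
round_trip = xor_repeating_key(xor_repeating_key(message, KEY), KEY)
```

Here zip stops at the end of the data while cycle keeps repeating the key, which mirrors the pointer reset performed at 0x12FDDD in the listing.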

Once we run this script, here is what we get:

Dynamic Analysis

Then we have to deal with a second decoding loop:

The key "nopasaurus" is pushed to the stack:

At the end of the routine, we can see "get ready to get nop'ed so damn hard in the paint" at offset 0x12FDD9:

omg is it almost over?!?

Static Analysis

The code now jumps to 0x12FE24 where we see a new decryption routine:

seg000:0012FE24                         loc_12FE24:                             ; CODE XREF: seg000:loc_12FDEE↑j
seg000:0012FE24 E8 00 00 00 00                          call    $+5
seg000:0012FE29 8B 34 24                                mov     esi, [esp]      ; ESI = 0x12FE29
seg000:0012FE2C 83 C6 1E                                add     esi, 1Eh        ; ESI = 0x12FE47
seg000:0012FE2F B9 38 01 00 00                          mov     ecx, 138h       ; Size of encrypted stub. Will be divided by 4 as we're dealing with DWORD
seg000:0012FE34                         loc_12FE34:                             ; CODE XREF: seg000:0012FE45↓j
seg000:0012FE34 83 F9 00                                cmp     ecx, 0
seg000:0012FE37 7E 0E                                   jle     short near ptr unk_12FE47 ; Once XOR loop finished, jump to 0x12FE47
seg000:0012FE39 81 36 62 4F 6C 47                       xor     dword ptr [esi], 476C4F62h ; DWORDs will be XOR'ed with 0x476C4F62
seg000:0012FE3F 83 C6 04                                add     esi, 4
seg000:0012FE42 83 E9 04                                sub     ecx, 4
seg000:0012FE45 EB ED                                   jmp     short loc_12FE34

Once again, we can write a Python script to decrypt the content. Notice that this time, we're not patching bytes but DWORDs.

loc = 0x12FE47

for i in range(0x138 // 4): # integer division: 0x138 bytes = 78 DWORDs
    d = Dword(loc+i*4)
    decoded_dword = d ^ 0x476C4F62
    PatchDword(loc+i*4, decoded_dword)
    for j in range(4):
        SetColor(loc+i*4+j, CIC_ITEM, 0xFFE9C9)
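
Outside IDA, the DWORD-wise XOR can be reproduced with the standard struct module. This is a sketch in standalone Python 3; the function name xor_dwords is hypothetical, and little-endian byte order is assumed because the target is 32-bit x86.

```python
import struct

def xor_dwords(data, key):
    """XOR each little-endian 32-bit word of data with the 32-bit key."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4")
    count = len(data) // 4
    fmt = "<%dI" % count          # count unsigned 32-bit little-endian words
    words = struct.unpack(fmt, data)
    return struct.pack(fmt, *(w ^ key for w in words))

KEY = 0x476C4F62

# XORing four zero bytes shows the key as it is laid out in memory.
key_bytes = xor_dwords(b"\x00" * 4, KEY)

# Synthetic round trip on a buffer as long as the real stub (0x138 bytes).
sample = bytes(range(256)) + bytes(0x138 - 256)
restored = xor_dwords(xor_dwords(sample, KEY), KEY)
```

Since the key bytes in memory are 62 4F 6C 47, the same result could be obtained with a repeating 4-byte key; processing whole DWORDs simply follows the loop in the listing more closely.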

Once the script is run, it shows a new decryption routine whose key is the string "omg is it almost over?!?":

Dynamic Analysis

Then we have another loop:

which decrypts as: "omg is it almost over?!?":

Final loop and solution

Static Analysis

This new decryption routine is as follows:

seg000:0012FE57 68 72 3F 21 3F                          push    3F213F72h       ; 'r?!?'
seg000:0012FE5C 68 20 6F 76 65                          push    65766F20h       ; ' ove'
seg000:0012FE61 68 6D 6F 73 74                          push    74736F6Dh       ; 'most'
seg000:0012FE66 68 74 20 61 6C                          push    6C612074h       ; 't al'
seg000:0012FE6B 68 69 73 20 69                          push    69207369h       ; 'is i'
seg000:0012FE70 68 6F 6D 67 20                          push    20676D6Fh       ; 'omg '
seg000:0012FE75 89 E3                                   mov     ebx, esp        ; EBX = 'omg is it almost over?!?'
seg000:0012FE77 E8 00 00 00 00                          call    $+5
seg000:0012FE7C 8B 34 24                                mov     esi, [esp]      ; ESI = 0x12FE7C
seg000:0012FE7F 83 C6 2D                                add     esi, 2Dh        ; ESI = 0x12FEA9
seg000:0012FE82 89 F1                                   mov     ecx, esi        ; ECX = 0x12FEA9
seg000:0012FE84 81 C1 D6 00 00 00                       add     ecx, 0D6h       ; ECX = 0x12FF7F
seg000:0012FE8A 89 D8                                   mov     eax, ebx        ; EAX = 'omg is it almost over?!?'
seg000:0012FE8C 83 C0 18                                add     eax, 18h        ; Length of string 'omg is it almost over?!?'
seg000:0012FE8F                         loc_12FE8F:                             ; CODE XREF: seg000:0012FEA2↓j
seg000:0012FE8F 39 D8                                   cmp     eax, ebx        ; cycle thru each character of the key 'omg is it almost over?!?' to XOR bytes
seg000:0012FE91 75 05                                   jnz     short loc_12FE98
seg000:0012FE93 89 E3                                   mov     ebx, esp
seg000:0012FE95 83 C3 04                                add     ebx, 4
seg000:0012FE98                         loc_12FE98:                             ; CODE XREF: seg000:0012FE91↑j
seg000:0012FE98 39 CE                                   cmp     esi, ecx
seg000:0012FE9A 74 08                                   jz      short near ptr unk_12FEA4 ; Once decryption finished, jump to 0x12FEA4
seg000:0012FE9C 8A 13                                   mov     dl, [ebx]
seg000:0012FE9E 30 16                                   xor     [esi], dl       ; XOR bytes with characters of the key ('omg is it almost over?!?')
seg000:0012FEA0 43                                      inc     ebx
seg000:0012FEA1 46                                      inc     esi
seg000:0012FEA2 EB EB                                   jmp     short loc_12FE8F
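
The key strings are built with a series of push instructions, so the last value pushed ends up at the lowest address and therefore comes first in memory. The following standalone Python 3 sketch shows how the immediates map back to the key; the function name key_from_pushes is an assumption made for this example, and little-endian byte order is assumed as on any x86 target.

```python
import struct

def key_from_pushes(immediates):
    """Rebuild the string formed by 32-bit immediates given in push order."""
    # The stack grows downwards, so the last push is read first.
    raw = b"".join(struct.pack("<I", v) for v in reversed(immediates))
    # Strip the zero padding of a partially filled final DWORD.
    return raw.rstrip(b"\x00")

# Immediates copied from the two listings in this walkthrough.
final_key = key_from_pushes([
    0x3F213F72, 0x65766F20, 0x74736F6D,
    0x6C612074, 0x69207369, 0x20676D6F,
])
second_key = key_from_pushes([0x7375, 0x72756173, 0x61706F6E])
```

The lengths of the rebuilt keys (0x18 and 0x0A) match the values added to EAX in the respective routines.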

We can write a Python script as follows:

loc_start = 0x12FEA9
loc_end = 0x12FF7F
k = "omg is it almost over?!?"

c = 0
for loc in range(loc_start, loc_end):
    b = Byte(loc)
    x = k[c % len(k)] # cycle thru the characters in the key
    PatchByte(loc, b ^ ord(x))
    SetColor(loc, CIC_ITEM, 0xFCFCAC)
    c += 1 # move on to the next character of the key

This last decryption routine reveals the email address we're looking for. We're done:

Dynamic Analysis

The final loop leads to the solution which is: "[email protected]":


Keywords: reverse-engineering challenge flare fireeye xor shellcode