How does this milw0rm heap spraying exploit work?

Solution 1

The shellcode variable holds x86 machine code that performs the actual exploit. spray builds a long sequence of filler instructions to be placed in memory. Since we usually can't find out the exact location of our shellcode in memory, we put a lot of nop-like instructions before it and jump to somewhere among them. The memory array holds the actual x86 code along with this filler that leads to it. We then feed the crafted XML to the library that has the bug. While the XML is being parsed, the bug causes the instruction pointer register to be set to an address somewhere inside our sprayed data, leading to arbitrary code execution.

To understand it more deeply, you have to figure out what the x86 code actually does. unescape is used to turn the escape sequences in the string into the bytes they represent, and the result is stored in the spray variable. It's valid x86 code that fills a large chunk of the heap and leads execution to the start of the shellcode. The reason for the ending condition is the string length limitation of the scripting engine: you can't have strings larger than a specific length.
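To make the unescape step concrete, here is a minimal sketch in Python (not the JavaScript from the question) of how a "%uXXXX" string maps to bytes in memory. It assumes each %uXXXX escape is one 16-bit unit stored little-endian, as it would be on x86. The function name decode_percent_u and the second sample string are made up for illustration.

```python
import re
import struct


def decode_percent_u(escaped: str) -> bytes:
    """Return the bytes that a JavaScript-style "%uXXXX" string occupies in memory."""
    out = bytearray()
    for hex_unit in re.findall(r"%u([0-9a-fA-F]{4})", escaped):
        # Each %uXXXX is one 16-bit code unit; x86 stores it low byte first.
        out += struct.pack("<H", int(hex_unit, 16))
    return bytes(out)


# The filler pattern discussed in this answer: the byte order does not matter here.
print(decode_percent_u("%u0a0a%u0a0a").hex())   # 0a0a0a0a

# A hypothetical string where the byte order is visible.
print(decode_percent_u("%u9090%u1234").hex())   # 90903412
```

Note that for any escape other than a repeated byte, the two bytes come out swapped relative to how they are written in the string, which matters when feeding such data to a disassembler.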

In x86 machine code, the bytes 0a 0a decode to or cl, [edx]. For the purposes of our exploit, this is effectively equivalent to a nop instruction. Wherever we land in the spray, execution moves on to the next instruction, and the next, until it reaches the shellcode, which is the code we actually want to execute.
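If you want to check the 0a 0a decoding by hand, the following sketch splits the second byte into its ModRM fields. It is deliberately tiny: it only handles opcode 0x0A (OR r8, r/m8) with plain register-indirect addressing, and the function name is my own.

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_or_r8_rm8(code: bytes) -> str:
    """Decode opcode 0x0A (OR r8, r/m8) for the simple [reg32] addressing form only."""
    opcode, modrm = code[0], code[1]
    if opcode != 0x0A:
        raise ValueError("only opcode 0x0A is handled in this sketch")
    mod = modrm >> 6            # top two bits: addressing mode
    reg = (modrm >> 3) & 0b111  # middle three bits: 8-bit destination register
    rm = modrm & 0b111          # low three bits: base register of the memory operand
    if mod != 0b00 or rm in (0b100, 0b101):
        # SIB bytes, displacements and register-direct forms are out of scope here.
        raise ValueError("addressing form not covered by this sketch")
    return f"or {REG8[reg]}, [{REG32[rm]}]"


print(decode_or_r8_rm8(bytes.fromhex("0a0a")))  # or cl, [edx]
```

Because every byte of the filler is 0a, the same instruction is decoded no matter which byte of the filler execution starts on.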

If you look at the XML, you'll see 0x0a0a is there too. Describing exactly what happens requires specific knowledge of the exploit (you have to know where the bug is and how it's exploited, which I don't know). However, it seems that we force Internet Explorer to trigger the buggy code by setting innerHTML to that malicious XML string. Internet Explorer tries to parse it, and the buggy code somehow gives control to a memory location where the array exists (since it's a large chunk, the probability of jumping there is high). When we jump there, the CPU keeps executing or cl, [edx] instructions until it reaches the beginning of the shellcode that's been put in memory.
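The "probability of jumping there is high" argument can be illustrated with a toy model. This is only a sketch under loud assumptions: memory is a flat range of addresses, the sprayed region starts at address 0 and consists of identical blocks (filler followed by payload), and the landing address is uniformly random. None of these numbers come from the real exploit.

```python
import random


def sled_hit_probability(address_space: int, sprayed_bytes: int,
                         block_len: int, payload_len: int) -> float:
    """Chance that a uniformly random address lands on filler inside the sprayed region."""
    filler_fraction = (block_len - payload_len) / block_len
    return (sprayed_bytes / address_space) * filler_fraction


def simulate(address_space: int, sprayed_bytes: int, block_len: int,
             payload_len: int, trials: int = 100_000, seed: int = 1) -> float:
    """Estimate the same probability by sampling random landing addresses."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        addr = rng.randrange(address_space)
        in_spray = addr < sprayed_bytes
        # Within a block, the filler comes first and the payload sits at the end.
        in_filler = (addr % block_len) < (block_len - payload_len)
        if in_spray and in_filler:
            hits += 1
    return hits / trials


# Hypothetical sizes: half of the address range sprayed, payload is 10% of each block.
print(sled_hit_probability(1000, 500, 100, 10))
print(simulate(1000, 500, 100, 10))
```

The model shows the two levers: spraying a larger share of memory and keeping the payload small relative to the filler both raise the chance that a stray jump ends up sliding into the payload.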

I've disassembled the shellcode:

00000000  C9                leave
00000001  2B1F              sub ebx,[edi]
00000003  B10C              mov cl,0xc
00000005  BDC536DB9B        mov ebp,0x9bdb36c5
0000000A  D9C5              fld st5
0000000C  2474              and al,0x74
0000000E  5A                pop edx
0000000F  F4                hlt
00000010  EA8331FC0B6A6A    jmp 0x6a6a:0xbfc3183
00000017  03D4              add edx,esp
00000019  07                pop es
0000001A  67305CFF          xor [si-0x1],bl
0000001E  98                cwde
0000001F  BBD7FFA4FE        mov ebx,0xfea4ffd7
00000024  9B                wait
00000025  74AD              jz 0xffffffd4
00000027  058B8B028D        add eax,0x8d028b8b
0000002C  D893BCCD35A2      fcom dword [ebx+0xa235cdbc]
00000032  37                aaa
00000033  B84290A63A        mov eax,0x3aa69042
00000038  94                xchg eax,esp
00000039  E99AA4D58D        jmp 0x8dd5a4d8
0000003E  E5A3              in eax,0xa3
00000040  1F                pop ds
00000041  4C                dec esp
00000042  EB46              jmp short 0x8a
00000044  4B                dec ebx
00000045  8CD0              mov eax,ss
00000047  AD                lodsd
00000048  A844              test al,0x44
0000004A  52                push edx
0000004B  4A                dec edx
0000004C  3B81B80DD748      cmp eax,[ecx+0x48d70db8]
00000052  4B                dec ebx
00000053  D46C              aam 0x6c
00000055  46                inc esi
00000056  1392734A204F      adc edx,[edx+0x4f204a73]
0000005C  F8                clc
0000005D  6E                outsb
0000005E  DC8EA20726B4      fmul qword [esi+0xb42607a2]
00000064  04D4              add al,0xd4
00000066  D084ECBA978221    rol byte [esp+ebp*8+0x218297ba],1
0000006D  7CE8              jl 0x57
0000006F  C0CA8C            ror dl,0x8c
00000072  F4                hlt
00000073  A6                cmpsb
00000074  47                inc edi
00000075  210D2EA0B0CD      and [0xcdb0a02e],ecx
0000007B  2CA8              sub al,0xa8
0000007D  B05B              mov al,0x5b
0000007F  43                inc ebx
00000080  F4                hlt
00000081  24E8              and al,0xe8
00000083  7A9C              jpe 0x21
00000085  BB857DCBA0        mov ebx,0xa0cb7d85
0000008A  7DED              jnl 0x79
0000008C  92                xchg eax,edx
0000008D  09E1              or ecx,esp
0000008F  96                xchg eax,esi
00000090  315580            xor [ebp-0x80],edx

Understanding this shellcode requires knowledge of x86 assembly and of the problem in the MS library itself (to know what the system state is when we reach this point), not of JavaScript! This code will in turn execute calc.exe.
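One caveat about the listing above, offered as an assumption rather than a confirmed diagnosis: if the bytes were taken from a %uXXXX string in the order they are written, each pair would be swapped relative to memory. The sketch below swaps the first 16 bytes of the listing pairwise so the two orderings can be compared; the helper name is hypothetical.

```python
def swap_byte_pairs(data: bytes) -> bytes:
    """Swap every two adjacent bytes (AB CD -> BA DC)."""
    if len(data) % 2:
        raise ValueError("expected an even number of bytes")
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes((data[i + 1], data[i]))
    return bytes(out)


# First 16 bytes of the listing above (offsets 0x00 to 0x0F).
listing_prefix = bytes.fromhex("C92B1FB10CBDC536DB9BD9C524745AF4")

print(swap_byte_pairs(listing_prefix).hex(" "))
# 2b c9 b1 1f bd 0c 36 c5 9b db c5 d9 74 24 f4 5a
```

Read by hand, the swapped bytes start with 2B C9 (sub ecx, ecx) and B1 1F (mov cl, 0x1f), and contain D9 74 24 F4 (fnstenv [esp-0xc]) followed by 5A (pop edx). That looks like the start of a self-decoding stub rather than random data, which would also explain why the rest of the listing does not read as meaningful code. I have not run or verified any of this.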

Solution 2

This looks like an exploit of the recent Internet Explorer bug that Microsoft released the emergency patch for. It uses a flaw in the data binding feature of Microsoft's XML handler, which causes heap memory to be deallocated incorrectly.

Shellcode is machine code that will run when the bug occurs. Spray and memory are just some space allocated on the heap to help the exploitable condition occur.

Solution 3

Heap spraying is a common way to exploit browser vulnerabilities. If you are interested in it, you can find several posts like this one: http://sf-freedom.blogspot.com/2006/06/heap-spraying-introduction.html

Solution 4

Any time I see memory that doesn't get addressed in an exploit discussion, my first thought is that the exploit is some sort of buffer overflow, in which case the memory is either causing the buffer to overflow or is being accessed once the buffer overflows.

Author by

Gogo

Updated on August 19, 2020

Comments

  • Patrick Desjardins
    Patrick Desjardins about 15 years
    I appreciate the effort you put into this explanation. +25 reputation and all my respect. Thanks
  • Pyro
    Pyro about 15 years
    In this case it was not a heap corruption, heap-based buffer overrun or stack-based buffer overrun: blogs.msdn.com/sdl/archive/2008/12/18/ms08-078-and-the-sdl.aspx
  • username
    username over 14 years
    great answer but good lord - suddenly i am not good with computer ;-)
  • Martin
    Martin about 14 years
    I'm amazed by people who manage to come up with these kinds of exploits. If they're clever enough to hack someone's bank account with this, they deserve all the money they can steal ;)
  • San Jacinto
    San Jacinto about 14 years
    If there was a shrine of good answers for SO, this would be in it.
  • atk
    atk almost 14 years
    For those who want to learn about shell code, they might try "The Shellcoder's Handbook" from Wiley press.
  • Amarghosh
    Amarghosh about 13 years
    How does the code launch calc.exe? I don't know assembly language - does the disassembled shellcode contain instructions to launch calc.exe - does it make any sense to expect to find the name of executable file in the assembly code? I mean, how does that assembly code load the executable binary into the memory without referring to its name - or is the name hidden somewhere in the code?
  • Juho Östman
    Juho Östman about 13 years
    The disassembly seems nonsensical and completely random. That cannot be right. I tried to swap bytes, assuming that the characters in a string were stored in little-endian order, but it did not help.
  • mmx
    mmx about 13 years
    @Juho: Indeed. I simply used ndisasm to disassemble the bytes when I originally wrote this answer. I didn't try to read it. It doesn't make sense to me--probably the instructions don't start at the beginning of the string and there's some other non-executable data that's confusing the disassembler. Before that, we have to make sure the exploit, as posted, works :) I haven't personally tried it. -- Regardless, the general idea should be the same.
  • Behrooz
    Behrooz almost 12 years
    +1, and one unmentioned point, the assembly code should not contain a 0x00 byte.
  • Maël Nison
    Maël Nison over 10 years
    I had to disable Avast to see this answer, but totally worth it :)
  • bad_keypoints
    bad_keypoints over 10 years
    Assembly is effin' powerful. That's one of the deepest exploits I've ever seen. I have to delve into assembly.
  • bad_keypoints
    bad_keypoints over 10 years
    Do you think some such stuff could happen with Chrome's extensions?
  • sqykly
    sqykly about 10 years
    This asm is nonsense. It has to start on the first byte, too, if they just pepper the heap with it preceded by nops. They would need to already have ring 0 access to use the hlt instruction - unless they're trying to cause an exception. They do clobber ebp to an arbitrary value after discarding the contextual stack frame, but they shouldn't be able to get defined behavior out of doing so. They're either using layers of exceptions in the kernel to elevate their privileges, or this is plain gibberish.
  • mmx
    mmx about 10 years
    @sqykly I presume the exploit posted is gibberish (I never tested it) or I failed to disassemble it correctly, but I suspect the former. Perhaps the website posting the exploit did not want to publish a malicious program.
  • sqykly
    sqykly about 10 years
    @MehrdadAfshari you definitely did it right - I reassembled it and it's identical to the bit. Of course this is all curiosity and speculation at this point, but is it possible that some other transformation is done on the malicious code before spraying it, in order to evade detection by security software? Could it be JVM or .NET bytecode, rather than machine code?
  • mmx
    mmx about 10 years
    @sqykly I doubt it is JVM bytecode or CIL, but perhaps the failing routine transforms the code prior to jump.