Buffer Overrun
Updated:
12. Buffer Overrun
Objectives
- To consider the prevalence of low-level buffer errors in comparison with other types of vulnerability
- To explore how buffer overruns can occur in C code and what the consequences of this can be
Critical Vulnerabilities (1988-2012)
- SQL injection- 2%, code injection 5%, command injection – 3%. Only accounted for 10% of the vulnerability –
- In the analysis of what’s critical is low level buffer error – is sin 5 of the 24 deadly sins
Top 3 Vulnerabilities (Last 25 years)
- Buffer error was always present
- Apart from 2006, where the buffer error wasn’t present, and for 25 years it was always in top 3 vulnerabilities
- This kind of issue is what we should take close look about – SIN 5 buffer overrun
What is a Buffer Overrun?
- Buffer is a chunk of memory – the program allocates a contiguous chunk of memory, and before that contiguous chunk there are no gaps
- This buffer will have a fixed size, and it will store data up to that size, and not beyond.
- What happens if you tryna copy some data and you go beyond the
buffer size?– what happens if other memory gets overwritten? – for
programming languages that don’t step in to prevent these
overwriting
- When memories are being overwritten there could be a variety of harmful consequences
Where do they happen?
- In C/C++ - It doesn’t do bounds checked memory access by default mainly what we are talking
- The program is running as a server somewhere, it could be written in c for efficiency, and the server is running in HIGH privileges – and access over the internet - then serious problem could happen.
- Q- java or python, what happens if you try to go beyond the bound of
array?
- A: array index out of bound exception! – you will get an exception. Its crashing in a controlled way, not letting you overwrite and triggering exceptions
- Q- BUT does this mean that we don’t have low level buffer overruns?
- Nope. Coz they themselves, their runtime systems is implemented in C/C++ - runtime system itself could be vulnerable. But not in normal user mode
Simple Example
char buf [32];
strcpy (buf, input);
- Allocating a buf of size 32 bytes
- And strcpy (dest, source of data) – copy input to buf.
- Buf is a pointer to the memory essentially
- Strcpy - It’s a C function so no checking is done on size. If the input string is longer than 31 characters, and can be affected by the attacker – vulnerable.
- If the input is simply too long but the attacker can’t change – u’ll have normal bug and crash but it won’t be a security problem
- you can suspect strcpy as a problem of buffer overrun but can’t be for sure
Example
- Buffer overflow in D-link routers
- This was affecting d-link routers – stack-based buffer overflow – what we gonna look at.
- When we see these vulnerabilities report, key word to look at is: RCE – remote code execution
- If the buffer overrun allows you to inject the code remotely and execute it serious problem
- Q: but how is it possible that we inject it?
- A: in lower level, we are talking about things being copied into the memory. answer today!
- This example is the worst CVE – authenticated attacker to execute arbitrary code with privileges. you can inject the code into app running with max privileges – so you essentially took over that machine.
Program Memory Layout
- We can map out the address space of the program from low to high memory addresses, and its divided into sections
- Text – instructions of the program and read only. So can’t inject code that overwrites that bit of memory
- BSS, Data – the storage for static/ global variables defined in the program.
- Environment variables - Also dedicated storage for environment variables inherited from the shell which you run the program
-
Main to focus is stack and heap – are used for storage of the program while its running, and is dynamic areas the size changes as the prog executes
- Stack – holds info about the objects of fixed size known at
compile time.
- We can reserve the space with simple machine code instructions that move a pointer up and down in the memory.
- If we got a local variable as an integer and declare inside a function, the process of allocating mem for that variable just referring to what mem address it has and shifting a pointer by 4 bytes to make space for that int variable.
- Stack frame is used for :
- For local variables
- For function parameters
- For saved register information
- Return address for the call- after a function call, we need to know where we came from – when we make a function call we jump from one point of instruction to another – and we need to store where to jump back to – the return address
- All this control information is saved in stack frame along with the data cause of the problem coz if you overwrite the stack frame and damage the control information, we can take control
- Heap – allocation for when we don’t know the requirements at a
compile time
- Same problem of mixing data and control information
Stack Smashing
- When a function is called, a chunk of stack is reserved for use (stack frame) and we have pile of stack frames.
- Functions call a function, so we have to call a stack frame and below that we have a current stack frame
- Stack frame for currently active function is divided up to different
pieces
- Local variable and buffer, that is overrunable
- Saved frame pointer which links the stack frames together for when we leave the function.
- Return address – points back to instruction section. Tell us which instruction to jump back to when current function exits
- Storage for function parameters
- Buffer – we are overrunning, and what we wanna do is to trash the stuff in green (function params)
- Scenario: Attackers can overrun the buffer and they can influence
what’s copied into that buffer
- It first trashes the local variables – this itself causes potential problems. you can affect the program if you can change the value of variable
- Further – if you can overwrite return address – it will now point to an address specified by the attacker !! hence affect the instruction.
Demo
- Then the if statement to check if there’s anything in the buffer- But it doesn’t check that the size of input is not bigger than the max size of the buffer.
- There exists another function bar, which prints dumb shit but no way in main it calls bar.
-
We are using printf to show what’s in the stack – mem address of foo bar and main
- The result
- Thing to note is that the stack frame has sth that could be similar to the address of the function foo and bar – 08048474 and 080483f6 – maybe what could be return addresses!
- But we are gonna make the bar execute by making the pointer point to it. By pushing the ‘….’ Into the stack frame and now we managed to write the address of bar – and we want to push this down to the bottom until it’s the return address of the stack frame. Then it prints ive been cracked!!
- As we copy more, we are moving to sth that could be close to the return address of the program
- So what happened – we made program call a function that wasn’t called by redirecting the execution.
- Limitation – we are redirecting it to what already exists in the program!! But how do attacker make it point to what they specified?
Worst-Case Scenario
- Program is a server running with high privileges, and attacker want to overrun a buffer and supply all the instruction as part of the input and make it point to that instruction, and run malicious machine code.
Crafting an Exploit
- We are creating sth called shellcode – shell code bc one of the standard tasks is to inject a bit of code into vulnerable program that is runs on a shell, which allows to execute OS commands.
- Called Shellcode even if its not opening a shell !!
- How to trigger shellcode
- you write it in C usually, and it will do a sys call useful to the attacker and it compile object code
- Disassemble object code and hand edit the assembly language – tweaking it so that the code injection works - one of the thing is you have to remove NULLs – coz you are relying on strcpy, the function stops when it encounters NULL. not trivial.
Shellcode Example
- It uses execve() to run shell. What you do is compile and play around with the assembly language. Maybe you tryna reduce the size coz then its easer to inject.
- In x84, 3 registers are used as 1,2,3rd arguments. One of the first tasks is to put the right values in to the registers.
- One of the task is we gotta put correct arguments into right
registers then sys call
- But we can’t have NULL which is why we xor itself. Bc the result is 0 it puts 0 in the rdx and rsi
- And we copy the address we want run, into rdI- which is a commend to run a shell
- Then we push the values and set up our arguments and make a system call.
Crafting an Exploit
- And now we gotta take this code and inject to buffer and make sure it runs
- Problem is, that the attacker might flying blind – they can’t see
the source code. They won’t know:
- How bug buffer is
- How much space is occupied by other local variables btwn the
buffer and the return address.
- They dunno where to put the return address/ and they dunno what the return address should be
- So they are just tryna maximize their chances by creating this sandwich and filling up the middle bit with shellcode
- NOP sled- machine code to do nothing. If we have bunch of NOP in the beginning of shell code –it essentially a safe landing zone. Then we can jump within that zone and slide along till we hit the shell code. Then we can enter the shell of the machine
- And you repeat the return address as many times – to maximize the chances that one of those would overwrite the return address in the stack.
- you craft a script of some kind and submit that as an input of program you are attacking
Questions
- Very small stack frames can be susceptible to attack – bc other
areas of memory can be affected.
- Can put shellcode in the environment variable, and you need to redirect to that. you don’t need a big part
-
Overrunning by just one byte also has effects!! Off by one error
- E.g.
- Null byte beyond the end of the array – tried to be smart by limiting the size of the buffer by sizeof(buf) but they somehow put that null byte beyond the array
Off-By-One Error
- Logic error involving the discrete equivalent of a boundary condition, i.e. where an iterative loop iterates one time too many or too few. <= or >=, then the program is able to read or write beyond the bounds of allocated memory!!
- Above buffer is the Saved frame pointer – to point back to the stack frame of the caller, to be restored when currently executing function finishes.
- When it is overwritten by one byte, it points back to the buffer itself rather than the stack frame. When the function exits, restored stack frame will be in memory controlled by the attacker! If they can make return address in the fake stack frame, they can get the shell code to execute.
Heap Overruns
- Can overwrite application variables, just like in stack.
- Can overwrite meta data that is used to manage storage
- Heap is a linked list of allocated chunks of memory, and if you can overwrite the pointers of those links together, this may allow you to write to the memory location of your choice and possibly trigger execution of shell code
- Could happen whether you are using stack or heap to store your data
Defences
- Correct programming to check the bound of the buffer!!!!- quite hard to do
- If we can’t rely on correct programming, we need some sort of protection compiled into applications – special measures in terms of development tools and libraries
- To make the protection transparent so
that the programmer doesn’t need to do anything special, but the OS
supports making sure that these things aren’t happening
- i.e. to intercept dangerous library calls
Protection Via recompilation
- In C, strcpy is v dangerous – so we can use a safer version -
safestrcpy
- But problem is another sys won’t have this library, so the lib also has to be shipped- so not necessarily the solution
- you can have the compiler to add some code to the application for you
- Can get patched version of GCC to do this bound checking.
- But this relies on the compiler implementing the feature correctly
- Compiler stack protection support - you can put sth in the code that monitors the stack and look at the occurrences of the stack machine attack
Stack Protection Using a Canary
- One protection for stack – canary will die first !! – warning on stack
- New thing injected is canary (a randomly chosen integer)–
yellow! just before the return address.
- This could be checked in the end of function execution to see if it still has that random value.
- If the value has changed, then there’s a risk of return address being overwritten!! And is aborted hence does not point to a wrong value
- Generally, stack buffer overflow attacks will have to change the value of the canary as they write beyond the end of the buffer before they get to the return pointer.
- Not perfect – if local variable includes function pointers, you can overwrite those function pointers and make them point to attacker’s shell code and when the program tries to call function via the function pointer, it will call the function pointer and the canary is never overwritten. hence the way you organise the variables and structures in v imp
- So what you do is you rearrange the stack frame so that the buffer
is always in the high memory (diagram) – so that its guaranteed ur
buffer is only called after the canaries have been changed
- Its generally less dangerous if arrays are modified compared to variables holding flags.
- Or sometimes compilers randomize the order of stack variables and frame layout to make it harder to determine the right input with intended malicious input.
Transparent protection
- Using OS and hardware to help you
- Make the stack and heap not executable.
- For eg, on 64-bit CPUs, if bit 63 of a page table entry is set to 1, code in that page won’t be allowed to execute
- Can emulate this in software if you don’t have CPU support
- BUT may not work coz it might redirect to C library code – which must be allowed to execute
- Address Space Layout Randomization
-
Those mem addresses jump around everytime you run the program – positions of stack, heap libraries change randomly on every execution
-
Ultimately making it hard to run shellcode reproducibly
-
Summary
- Reviewed the vulnerability landscape and highlighted the significance of low-level buffer errors
- Explored how the classic ‘stack smashing’ attack works
- Discussed variations such as ‘off-by one’ and heap attacks
- Investigation various protections against buffer overruns, involving recompilation or hardware / OS
- Noted that correct programming is the best defence!
Leave a comment