Command injection & input validation

6 minute read

Updated: February 23, 2020

11. Command injection & input validation

Objectives

To begin looking at software vulnerabilities by considering ways of injecting malicious code into applications
To frame these issues as input validation problems
To examine a specific case study of input validation, involving URLs

Command injection

Where untrusted data which is coming from an outside source that is influenceable by the attacker, crucial thing, combined with trusted data – typically a string format.
Some command could be embedded in the source code, but some comes from an external source.
Assembled string is passed on to assembler/ compiler – that will do sth in response to that command.
If you do this, it’s v easy to make mistake in the sense that the input is always gonna be a data. And failing to recognize that you can craft the input in a way that it can change the meaning of the command that’s executed by a server.

Simple example

import os
import sys

os.system("lpq -P " + sys.argv[1])

A program to see what gets printed on the printer queue
We can do this on command line – lpq -p specifies the printer queue
Q: The question is how do you make this do sth else?
- A: you need to make first command to do the right thing – then you make second command to do sth else. Usually close the first command then run the second command.
If he does python lpq.py ‘Staff; ls -l’ and it lists all the file
- He could do sth malicious as well omg

Warning signs

If you see a command being constructed with simple string concatenation, there’s no checking done for the untrusted string – then you got a danger hazard.
How to look out for the possibility that it could happen in the first place. it can only happen if those stings are used to run sth.
Can use diff commands to run a cod that runs a shell or subprocess
- C/C++ : system(), popen(), execlp()
- Python: os.system(), os.popen() etc
Then you need to look at how the command is constructed – is it using anything from untrusted source and if so, is any kind of checking done?
Need to be extra careful if the code is running with HIGH privileges!!!- they can abs do anything on the system

SQL injection

Sin number 1 – SQL injection and to simply focus on databases
Instead of command being constructed from trusted/untrusted – we are constructing a SQL query
Risk – if no validation is done in the input – attackers can change the meaning of the query!
Again, what we are looking out here is, esp in web app – when you are querying the database, and when query constructed from an untrusted input using a simple string concatenation
There’s a form in the user application and the user input of the form is read by the application, construct a query and execute the query. i.e. Obama healthcare where they wanted to query via search box

Exercise

String q = "SELECT * from user WHERE name='" + name + "'";
Statement stmt = db.createStatement();
ResultSet results = stmt.executeQuery(q);

Name = “joe”
- Result: 0 or more rows return, from table user where the name column has name joe
Name – “joe’ OR 1=1 –”
- Everything from the user table is gonna be returned
What you tryna do is close off what developer expected, and add sth new to change the name.
The ‘ closes off the name but then we modify the query with OR clause – 1=1 all of them!

Examples

Healtcare.gov
- When Obama was a president – he set up healthcare.gov
- In common searches – some users were tryna do ‘; select * from users’ – sneaky!
Finnish patent site
- The trade name was set to Glow Finlad; DROP TABLE users; – Oy
What we can see here is that it’s not being checked!!

Fixing it

Manual filtering is an option – coz then you can make sure none of the bad stuff get through
Fix is not to catch all the bad stuff coz we won’t be able to
Problem was to do with the way we construct the query using concatenation.
Easy solution – to use prepared statements
- We simply leave a placeholder for the input – so no way you can modify!! E.g. name= ? is a placeholder
- All we need to do is to set the placeholder to the variable! - stmt.setString(1, name);

A generic problem

Generically what’s happening is we have untrusted source, and there’s too much trust in format and content, thinking that it won’t be malicious.
If the untrusted data is written into memory, or if we use it to construct a command – bad things happen
We can’t control the fact that the input is untrusted – coz it’s coming from outside but what we CAN do is to check it!! input validation

Handling malicious input

All input needs to be validated either:
- Validate on input
- Validate before we first use it

Trust boundary & choke points

We have a trust boundary in the application– and it’s implicitly trusted in here.
We have environment variables and config data as untrusted - coz it’s outside of the app
All the input goes through a choke point (blue polygon), where we do a checking of validity
We NEED to check it!! – coz if no checking is done, there’s a potential of vulnerability

Case study: URL validation

Some of the things that could go wrong with URL
Who’s responsible for the validation? User? Browser? Server?

Example: Phishing

Email that has a plausible looking link. E.g. web.auth.ox.ac.uk.e-du.ir
It has the same site – mockup of it but it fools the user using a diff or malicious link.
So you could say that it’s users fault to not realize, but sth can be done in the browser too
Problem is the URL cold be less obvious

Homograph Attacks

Some pairing of the domain – original and fake
micrоsoft.com / microsoft.com – it looks the SAME!!
There’s one extra bite – we are using Unicode that looks like o that requires a 16 bit, which is HARDER to tell

2015 example

Security researcher registered a website ‘IIoydsbank’ – I I not ll!!
It’s to do with fonts!! – there’s no visual difference between II and ll – hence could fool the user
He put a warning message and he got a TLS certificate for it.

Use of Punycode

Way of representing Unicode characters into an ascii form
This domain was registered as xn—e1awd7f.com
- The xn tells browser to use Punycode
- you could use code that decodes to sth that looks legitimate
- So punycode translates fake e1awd7f.com into epic.com

URL obfuscation in browsers

Sneaky - URLs can have a user@location format
Use %00 – null byte and old browsers – null bytes terminates the string, and it hits it and display trusted.com, but the latter evil.com isn’t shown!!
- But malicious one can for example have http://www.trusted.com%00@www.evil.com
- Omg it takes you to evil.com
QR code same concept to fooling

Directory Traversal

Imagine my URL contains ../ which moves one directory
Allows the attackers to go outside the root of the web server, and you can path down into other file systems
- ../or..\
- %2e%2e%2for%2e%2e%5c - Or sneakily to encode it
- ..%c1%1cor..%c0%afor..%c1%9c
if you see a lot of ../ etc, suggestion you might be getting directory traversal

Example

Cookie specifies which temp file - Cookie: TEMPLATE=../../../../etc/passwd
And you crafted the cookie value – and you step out of template down into sys password
What you hope for where is you get the sys pwd file

Canonicalization

Canonical = “In its simplest or standard form”
Canonicalization = process of converting the various equivalent forms of a name into a single standard name
How do we solve these problem we make everything to a standardized form!
We need to reduce URL to a standard form and get rid of character encoding, and then you can check for maliciousness

Summary

Explored the general idea of command injection and seen how it is frequently used in SQL injection attacks
Framed the vulnerabilities discussed in recent lectures as problems of input validation
Seen that phishing can bypass casual user validation of URLs (e.g. via a homograph attack)
Noted that browsers can mishandle malicious URLs, as can servers (e.g. a directory traversal)

11. Command injection & input validation

Leave a comment