Command injection & input validation
Updated:
11. Command injection & input validation
Objectives
- To begin looking at software vulnerabilities by considering ways of injecting malicious code into applications
- To frame these issues as input validation problems
- To examine a specific case study of input validation, involving URLs
Command injection
- Where untrusted data which is coming from an outside source that is influenceable by the attacker, crucial thing, combined with trusted data – typically a string format.
- Some command could be embedded in the source code, but some comes from an external source.
- Assembled string is passed on to assembler/ compiler – that will do sth in response to that command.
- If you do this, it’s v easy to make mistake in the sense that the input is always gonna be a data. And failing to recognize that you can craft the input in a way that it can change the meaning of the command that’s executed by a server.
Simple example
import os
import sys
os.system("lpq -P " + sys.argv[1])
- A program to see what gets printed on the printer queue
- We can do this on command line – lpq -p specifies the printer queue
- Q: The question is how do you make this do sth else?
- A: you need to make first command to do the right thing – then you make second command to do sth else. Usually close the first command then run the second command.
- If he does python lpq.py ‘Staff; ls -l’ and it lists all the file
- He could do sth malicious as well omg
Warning signs
- If you see a command being constructed with simple string concatenation, there’s no checking done for the untrusted string – then you got a danger hazard.
- How to look out for the possibility that it could happen in the first place. it can only happen if those stings are used to run sth.
- Can use diff commands to run a cod that runs a shell or subprocess
- C/C++ : system(), popen(), execlp()
- Python: os.system(), os.popen() etc
- Then you need to look at how the command is constructed – is it using anything from untrusted source and if so, is any kind of checking done?
- Need to be extra careful if the code is running with HIGH privileges!!!- they can abs do anything on the system
SQL injection
- Sin number 1 – SQL injection and to simply focus on databases
- Instead of command being constructed from trusted/untrusted – we are constructing a SQL query
- Risk – if no validation is done in the input – attackers can change the meaning of the query!
- Again, what we are looking out here is, esp in web app – when you are querying the database, and when query constructed from an untrusted input using a simple string concatenation
- There’s a form in the user application and the user input of the form is read by the application, construct a query and execute the query. i.e. Obama healthcare where they wanted to query via search box
Exercise
String q = "SELECT * from user WHERE name='" + name + "'";
Statement stmt = db.createStatement();
ResultSet results = stmt.executeQuery(q);
- Name = “joe”
- Result: 0 or more rows return, from table user where the name column has name joe
- Name – “joe’ OR 1=1 –”
- Everything from the user table is gonna be returned
- What you tryna do is close off what developer expected, and add sth new to change the name.
- The ‘ closes off the name but then we modify the query with OR clause – 1=1 all of them!
Examples
- Healtcare.gov
- When Obama was a president – he set up healthcare.gov
- In common searches – some users were tryna do ‘; select * from users’ – sneaky!
- Finnish patent site
- The trade name was set to Glow Finlad; DROP TABLE users; – Oy
- What we can see here is that it’s not being checked!!
Fixing it
- Manual filtering is an option – coz then you can make sure none of the bad stuff get through
- Fix is not to catch all the bad stuff coz we won’t be able to
- Problem was to do with the way we construct the query using concatenation.
- Easy solution – to use prepared statements
- We simply leave a placeholder for the input – so no way you can modify!! E.g. name= ? is a placeholder
- All we need to do is to set the placeholder to the variable! - stmt.setString(1, name);
A generic problem
- Generically what’s happening is we have untrusted source, and there’s too much trust in format and content, thinking that it won’t be malicious.
- If the untrusted data is written into memory, or if we use it to construct a command – bad things happen
- We can’t control the fact that the input is untrusted – coz it’s coming from outside but what we CAN do is to check it!! input validation
Handling malicious input
- All input needs to be validated either:
- Validate on input
- Validate before we first use it
Trust boundary & choke points
- We have a trust boundary in the application– and it’s implicitly trusted in here.
- We have environment variables and config data as untrusted - coz it’s outside of the app
- All the input goes through a choke point (blue polygon), where we do a checking of validity
- We NEED to check it!! – coz if no checking is done, there’s a potential of vulnerability
Case study: URL validation
- Some of the things that could go wrong with URL
- Who’s responsible for the validation? User? Browser? Server?
Example: Phishing
- Email that has a plausible looking link. E.g. web.auth.ox.ac.uk.e-du.ir
- It has the same site – mockup of it but it fools the user using a diff or malicious link.
- So you could say that it’s users fault to not realize, but sth can be done in the browser too
- Problem is the URL cold be less obvious
Homograph Attacks
- Some pairing of the domain – original and fake
- micrоsoft.com / microsoft.com – it looks the SAME!!
- There’s one extra bite – we are using Unicode that looks like o that requires a 16 bit, which is HARDER to tell
2015 example
- Security researcher registered a website ‘IIoydsbank’ – I I not ll!!
- It’s to do with fonts!! – there’s no visual difference between II and ll – hence could fool the user
- He put a warning message and he got a TLS certificate for it.
Use of Punycode
- Way of representing Unicode characters into an ascii form
- This domain was registered as xn—e1awd7f.com
- The xn tells browser to use Punycode
- you could use code that decodes to sth that looks legitimate
- So punycode translates fake e1awd7f.com into epic.com
URL obfuscation in browsers
- Sneaky - URLs can have a user@location format
- Use %00 – null byte and old browsers – null bytes terminates the
string, and it hits it and display trusted.com, but the latter
evil.com isn’t shown!!
- But malicious one can for example have http://www.trusted.com%00@www.evil.com
- Omg it takes you to evil.com
- QR code same concept to fooling
Directory Traversal
- Imagine my URL contains ../ which moves one directory
- Allows the attackers to go outside the root of the web server, and
you can path down into other file systems
- ../or..\
- %2e%2e%2for%2e%2e%5c - Or sneakily to encode it
- ..%c1%1cor..%c0%afor..%c1%9c
- if you see a lot of ../ etc, suggestion you might be getting directory traversal
Example
- Cookie specifies which temp file - Cookie: TEMPLATE=../../../../etc/passwd
- And you crafted the cookie value – and you step out of template down into sys password
- What you hope for where is you get the sys pwd file
Canonicalization
- Canonical = “In its simplest or standard form”
- Canonicalization = process of converting the various equivalent forms of a name into a single standard name
- How do we solve these problem we make everything to a standardized form!
- We need to reduce URL to a standard form and get rid of character encoding, and then you can check for maliciousness
Summary
- Explored the general idea of command injection and seen how it is frequently used in SQL injection attacks
- Framed the vulnerabilities discussed in recent lectures as problems of input validation
- Seen that phishing can bypass casual user validation of URLs (e.g. via a homograph attack)
- Noted that browsers can mishandle malicious URLs, as can servers (e.g. a directory traversal)
Leave a comment