Command injection & input validation

11. Command injection & input validation

Objectives

  • To begin looking at software vulnerabilities by considering ways of injecting malicious code into applications
  • To frame these issues as input validation problems
  • To examine a specific case study of input validation, involving URLs

Command injection

  • Command injection arises when untrusted data, coming from an outside source that the attacker can influence (the crucial point), is combined with trusted data, typically by building a string.
  • Part of the command is embedded in the source code, and part comes from the external source.
  • The assembled string is passed to an interpreter (such as a shell), which acts on the command.
  • The easy mistake is to assume the input will always be treated as data, and to fail to recognise that an attacker can craft input that changes the meaning of the command the server executes.

Simple example

import os
import sys

os.system("lpq -P " + sys.argv[1])
  • A program to show what is waiting in a printer queue
  • The same thing can be done on the command line: lpq -P specifies the printer queue
  • Q: How do you make this do something else?
    • A: Let the first command do the right thing, then make a second command do something else. Usually you terminate the first command and then run the second.
  • Running python lpq.py 'Staff; ls -l' shows the Staff queue and then lists all the files in the current directory
    • The second command could just as easily be something malicious
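
One way to close this hole can be sketched as follows. This is a minimal sketch, not the lecture's own fix: the function names and the allowlist pattern for queue names are assumptions made for illustration. It validates the queue name and passes the command as an argument list, so no shell ever interprets the input.

```python
import re
import subprocess

# Assumption: printer queue names are short and use only these characters.
_QUEUE_NAME = re.compile(r"[A-Za-z0-9_-]{1,32}")


def build_lpq_command(queue):
    """Return an argument list for lpq, or raise ValueError for a suspicious name."""
    if not _QUEUE_NAME.fullmatch(queue):
        raise ValueError(f"invalid printer queue name: {queue!r}")
    # A list of arguments is handed to the program directly, with no shell
    # in between, so ';' and similar characters have no special meaning.
    return ["lpq", "-P", queue]


def show_queue(queue):
    """Run lpq for a validated queue name (requires lpq to be installed)."""
    return subprocess.run(build_lpq_command(queue), check=False)
```

With this version, build_lpq_command("Staff") gives ["lpq", "-P", "Staff"], while the input 'Staff; ls -l' is rejected with a ValueError before anything is run.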

Warning signs

  • If you see a command being constructed by simple string concatenation, with no checking of the untrusted string, you have a hazard.
  • First look for whether it could happen at all: it can only happen if those strings are used to run something.
  • Different calls can be used to run a shell or subprocess:
    • C/C++: system(), popen(), execlp()
    • Python: os.system(), os.popen() etc.
  • Then look at how the command is constructed: does it use anything from an untrusted source and, if so, is any kind of checking done?
  • Be extra careful if the code runs with HIGH privileges: an injected command can then do absolutely anything on the system.
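
Looking for these calls can be partly automated. The following is a rough sketch only: find_risky_calls is a hypothetical helper, and it only spots the literal forms os.system(...) and os.popen(...), so it would miss aliased imports or other ways of starting a shell. It still shows the idea of first locating the places where a string is run as a command.

```python
import ast

# The Python calls named above that hand a string to a shell.
RISKY_CALLS = {("os", "system"), ("os", "popen")}


def find_risky_calls(source):
    """Return sorted (line, call) pairs for os.system()/os.popen() in Python source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and (func.value.id, func.attr) in RISKY_CALLS):
            hits.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(hits)


example = 'import os\nimport sys\n\nos.system("lpq -P " + sys.argv[1])\n'
print(find_risky_calls(example))  # [(4, 'os.system')]
```

Each hit is only a starting point: the next step is still to read how the command string at that line is built.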

SQL injection

  • SQL injection is "sin number 1"; here we focus simply on databases
  • Instead of a command being constructed from trusted and untrusted data, we are constructing an SQL query
  • Risk: if the input is not validated, attackers can change the meaning of the query!
  • Again, the thing to look out for, especially in web apps, is a database query constructed from untrusted input by simple string concatenation
  • Typical case: the application reads the user's input from a form, constructs a query from it and executes the query, e.g. the search box on the Obama-era healthcare site

Exercise

String q = "SELECT * from user WHERE name='" + name + "'";
Statement stmt = db.createStatement();
ResultSet results = stmt.executeQuery(q);
  • name = "joe"
    • Result: zero or more rows are returned from the table user, namely those where the name column is joe
  • name = "joe' OR 1=1 --"
    • Everything in the user table is returned
  • The idea is to close off what the developer expected and add something new to change the meaning of the query.
  • The ' closes off the name string, then the OR clause modifies the query: 1=1 is true for every row, so all of them match. The trailing -- starts an SQL comment, which hides the developer's own closing quote.
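
The exercise can be reproduced in a few lines. This sketch uses Python's sqlite3 module with an in-memory database rather than the Java code above; the table contents and the function name find_user_unsafe are made up for illustration.

```python
import sqlite3


def find_user_unsafe(db, name):
    # Vulnerable on purpose: the query is built by string concatenation,
    # mirroring the Java snippet above.
    q = "SELECT * FROM user WHERE name='" + name + "'"
    return db.execute(q).fetchall()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE user (name TEXT, email TEXT)")
db.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [("joe", "joe@example.org"), ("ann", "ann@example.org")],
)

normal = find_user_unsafe(db, "joe")               # only joe's row
injected = find_user_unsafe(db, "joe' OR 1=1 --")  # every row in the table
print(len(normal), len(injected))  # 1 2
```

The second call sends SELECT * FROM user WHERE name='joe' OR 1=1 --' to the database, which matches every row.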

Examples

  • Healthcare.gov
    • Set up while Obama was president
    • Among the common searches, some users had tried '; select * from users' (sneaky!)
  • Finnish patent site
    • A trade name was set to Glow Finlad; DROP TABLE users; -- Oy
  • What we can see in both cases is that the input was not being checked!

Fixing it

  • Manual filtering is an option, the idea being to make sure none of the bad stuff gets through
  • But the real fix is not to try to catch all the bad stuff, because we won't be able to
  • The underlying problem is the way the query is constructed by concatenation.
  • Easy solution: use prepared statements
    • We simply leave a placeholder for the input, so the input cannot modify the structure of the query. E.g. in name = ? the ? is a placeholder
    • All we need to do is bind the variable to the placeholder: stmt.setString(1, name);
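
The same idea can be sketched in Python, where sqlite3 accepts ? placeholders and a tuple of values to bind. The table contents and the function name find_user are illustrative assumptions, not part of the original exercise.

```python
import sqlite3


def find_user(db, name):
    # The ? placeholder keeps the query structure fixed; name is bound as data
    # and is never parsed as SQL.
    return db.execute("SELECT * FROM user WHERE name = ?", (name,)).fetchall()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE user (name TEXT, email TEXT)")
db.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [("joe", "joe@example.org"), ("ann", "ann@example.org")],
)

print(find_user(db, "joe"))             # [('joe', 'joe@example.org')]
print(find_user(db, "joe' OR 1=1 --"))  # []
```

The malicious string is now just an odd-looking name that matches no rows.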

A generic problem

  • Generically, we have data from an untrusted source, and too much trust is placed in its format and content, on the assumption that it won't be malicious.
  • If the untrusted data is written into memory, or used to construct a command, bad things happen
  • We can't change the fact that the input is untrusted, because it comes from outside, but what we CAN do is check it: input validation

Handling malicious input

  • All input needs to be validated either:
    • Validate on input
    • Validate before we first use it

Trust boundary & choke points

  • [Figure: trust boundary and choke point]
  • The application has a trust boundary: everything inside it is implicitly trusted.
  • Environment variables and config data are treated as untrusted, because they come from outside the app
  • All input goes through a choke point (the blue polygon in the figure), where its validity is checked
  • We NEED to check it: if no checking is done, there is a potential vulnerability
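
A choke point can be as simple as one function that every piece of external input must pass through before the rest of the code sees it. The sketch below is illustrative only: the allowlist pattern, validate_name and handle_request are all hypothetical names and rules, not something taken from the lecture.

```python
import re

# Assumption: a valid name starts with a letter and is at most 32 characters.
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,31}")


def validate_name(raw):
    """Choke point: every externally supplied name passes through here."""
    if not isinstance(raw, str) or not _NAME.fullmatch(raw):
        raise ValueError("rejected input")
    return raw


def handle_request(form):
    """Hypothetical request handler; code after the first line sees only checked data."""
    name = validate_name(form.get("name"))
    return f"Looking up {name}"
```

Here handle_request({"name": "joe"}) succeeds, while a missing name or a value such as "joe' OR 1=1 --" raises ValueError at the boundary.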

Case study: URL validation

  • Some of the things that can go wrong with URLs
  • Who’s responsible for the validation? User? Browser? Server?

Example: Phishing

  • An email contains a plausible-looking link, e.g. web.auth.ox.ac.uk.e-du.ir
  • The link leads to a mock-up of the real site: the page looks the same, but the user has been fooled by a different, malicious link.
  • You could say it is the user's fault for not noticing, but something can be done in the browser too
  • The problem is that the URL could be less obvious than this

Homograph Attacks

  • Pairs of domains: the original and a fake
  • micrоsoft.com / microsoft.com – it looks the SAME!!
  • There is one extra byte: the fake uses a Unicode character that looks like o but needs 16 bits, which makes the difference HARDER to spot
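
The difference is invisible on screen but easy to find in code. This sketch assumes the look-alike letter is Cyrillic о (U+043E), which is a common choice for this trick; the helper name non_ascii_chars is made up for illustration.

```python
import unicodedata


def non_ascii_chars(domain):
    """Return (position, Unicode name) for every non-ASCII character in a domain."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(domain)
        if ord(ch) > 127
    ]


fake = "micr\u043esoft.com"  # assumption: Cyrillic о in place of Latin o
real = "microsoft.com"

print(fake == real)            # False
print(non_ascii_chars(fake))   # [(4, 'CYRILLIC SMALL LETTER O')]
print(len(fake.encode("utf-8")) - len(real.encode("utf-8")))  # 1
```

The last line shows the one extra byte: in UTF-8 the Cyrillic letter takes two bytes where the Latin o takes one.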

2015 example

  • A security researcher registered the domain 'IIoydsbank': two capital Is, not two lowercase ls!
  • It comes down to fonts: in some fonts there is no visual difference between II and ll, so the user can be fooled
  • He put a warning message on the site, and he was able to get a TLS certificate for it.

Use of Punycode

  • A way of representing Unicode characters in ASCII form
  • One such domain was registered as xn--e1awd7f.com
    • The xn-- prefix tells the browser that the label is Punycode
    • An attacker can register a label that decodes to something that looks legitimate
    • Here, Punycode decoding turns the fake xn--e1awd7f.com into a name that is displayed as epic.com
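
The decoding step can be tried with Python's built-in punycode codec. This is a small sketch; decode_label is a hypothetical helper that handles a single label and does none of the extra checks a real IDNA implementation performs.

```python
def decode_label(label):
    """Decode one domain label; labels without the xn-- prefix are returned unchanged."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


decoded = decode_label("xn--e1awd7f")
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(code_points)  # ['U+0435', 'U+0440', 'U+0456', 'U+0441']
```

All four code points are Cyrillic letters (е, р, і, с), which is why the decoded name reads as "epic" on screen even though it shares no characters with the ASCII word.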

URL obfuscation in browsers

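  • Sneaky: URLs can have a user@location format, so in trusted.com@evil.com the real destination is evil.com
  • A %00 (null byte) can be added as well: in old browsers the null byte terminated the string, so the display stopped there and showed trusted.com, while the evil.com that followed wasn't shown!
  • QR codes can be used to fool users in the same way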
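
The user@location trick is easy to check with a URL parser. The sketch below uses Python's urllib.parse; the /login path is just an illustrative assumption.

```python
from urllib.parse import urlsplit


def real_host(url):
    """Return the host a URL actually points at, ignoring any user@ prefix."""
    return urlsplit(url).hostname


parts = urlsplit("http://trusted.com@evil.com/login")
print(parts.username)  # trusted.com
print(parts.hostname)  # evil.com
```

Everything before the @ is treated as a user name, so the request goes to evil.com even though trusted.com is what the eye sees first.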

Directory Traversal

  • Imagine a URL contains ../, which moves up one directory
  • This allows attackers to get outside the root of the web server and then work their way down into other parts of the file system
    • ../ or ..\
    • %2e%2e%2f or %2e%2e%5c (sneakier: the same thing, percent-encoded)
    • ..%c1%1c or ..%c0%af or ..%c1%9c
  • If you see a lot of ../ etc. in requests, that suggests you might be getting directory traversal attempts

Example

  • A cookie specifies which template file to use: Cookie: TEMPLATE=../../../../etc/passwd
  • The attacker crafts the cookie value so that it steps up out of the template directory and down into the system password file
  • The hope is that the server returns the system password file
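
A server can defend against this by resolving the requested name and checking that the result is still inside the intended directory. This is a minimal sketch under the assumption that templates live in a single base directory; resolve_inside is a hypothetical helper name.

```python
import os


def resolve_inside(base_dir, requested):
    """Resolve requested relative to base_dir, refusing anything that escapes it."""
    base = os.path.realpath(base_dir)
    # realpath collapses ../ sequences and follows symbolic links.
    target = os.path.realpath(os.path.join(base, requested))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"path escapes base directory: {requested!r}")
    return target
```

With this check, a cookie value of ../../../../etc/passwd raises ValueError, while an ordinary name such as page.html resolves to a path under the base directory.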

Canonicalization

  • Canonical = “In its simplest or standard form”
  • Canonicalization = process of converting the various equivalent forms of a name into a single standard name
  • How do we solve these problems? We convert everything to a standardised form!
  • We need to reduce the URL to a standard form, getting rid of character encodings, and only then check it for maliciousness
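
The order of operations (canonicalize first, check second) can be sketched as below. This is an illustration, not a complete defence: canonicalize and escapes_root are hypothetical names, the limit of five decoding rounds is an arbitrary assumption, and the sketch does not map overlong UTF-8 forms such as %c0%af back to a slash.

```python
import posixpath
from urllib.parse import unquote


def canonicalize(raw_path, max_rounds=5):
    """Reduce a URL path to one standard form before any checks are made."""
    path = raw_path
    # Decode repeatedly so that double encoding such as %252e is unwrapped too.
    for _ in range(max_rounds):
        decoded = unquote(path)
        if decoded == path:
            break
        path = decoded
    else:
        raise ValueError("too many layers of encoding")
    # Treat backslashes like slashes, then collapse ./ and ../ segments.
    path = path.replace("\\", "/").lstrip("/")
    return posixpath.normpath(path)


def escapes_root(raw_path):
    """True if the canonical form of the path climbs above the web root."""
    canonical = canonicalize(raw_path)
    return canonical == ".." or canonical.startswith("../")


print(canonicalize("%2e%2e%2fetc/passwd"))  # ../etc/passwd
print(escapes_root("images/logo.png"))      # False
```

Because the check runs on the canonical form, the plain, percent-encoded and backslash variants of ../ are all caught by the same test.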

Summary

  • Explored the general idea of command injection and seen how it is frequently used in SQL injection attacks
  • Framed the vulnerabilities discussed in recent lectures as problems of input validation
  • Seen that phishing can bypass casual user validation of URLs (e.g. via a homograph attack)
  • Noted that browsers can mishandle malicious URLs, as can servers (e.g. a directory traversal)
