Message authentication

7 minute read

Updated:

4. Message authentication

Source code for this post can be found at:
View Source code - MAC

Objectives

  • Review key points from last lecture on symmetric ciphers
  • Explore how cryptographic technology is used for message authentication

Last time

  • Symmetric ciphers perform encryption and decryption using a single shared key
  • We should use a recognized industry standard cipher: AES is a good choice
  • Security rests on secrecy of key (and its resistance to brute forcing)
    • Also its randomness – sooo important!

Brute-Forcing a Key

  • If we create a machine who could do 1012 decryption,
  • And DES is trivial to crack, smallest AES 128, is pass the age of the universe. you cannot brute force small AES in a realistic time

Classic Cipher Modes

  • ECB – encrypt given block of plain – never!
  • CBC – avoid making it dependent on the previous step, and encrypting the XOR of a block of plaintext and the prev ciphertext.
  • CTR – keystream of bit by encrypting the combination of nonce and a counter then XORed with the plain text

Putting things in perspective

  • Provide I use a sensible cipher with a big enough key that is kept secret, and in the correct mode – no need to worry about crypto being a weak point.
  • Something else might be more on risk to keep ur system secure!

Message authentication

  • To guarantee the integrity of the message (not edited) and authenticity (genuinely came from who sent it)
    • Sometimes we don’t want confidentiality! so maybe not encryption. So we might need a slightly different technology
  • How do we know that the code hasn’t been tampered by an attacker?
    • No need to encrypt the code, but would be nice if there’s a way Message Authentication Code

Message Authentication Codes (MAC)

  • We tryna see if the message has been tampered and if it came from who claims to send it

  • image

  1. We got a chunk of msg m, and we feed the shared key kab and the msg to the MAC algorithm (which encrypts it). Then we take the last n bits of the encryption and use it as a MAC.
  2. We stick the last n bit of MAC to the msg and transmit it
  3. Bob can take it, and split MAC from the msg, and bob needs to check it hasn’t been altered. How?
  4. He feeds the same shared key and the msg to the MAC algo which generates his own computed MAC.
  5. Then he compares the computed MAC and received MAC.
  • Drawback
    • There still a possibility of replay, but could solve it by implementing a message counter so that duplicate msg could be detected
  • Q: What encryption mode is not appropriate?
    • ECB – each block is processed independently. MAC is of size n bits. Bcoz the encryption of the final block doesn’t depend on previous; we are free to change anything except for the bits in the final block. even when the message changed, won’t know.

Avoiding Encryption

  • It’s a bit wasteful encrypting the cipher msg just to use the final bit of the MAC.
  • Encryption has a problem of it being slow
  • Also has political issues – restrictions of use of diff cipher coz of patents and export control
    • Banned export cryptographic technologies - relaxed now
  • We don’t need the transformation to be reversible!! If we are just tryna create MAC, we can use sth that’s faster and simpler hash function

Hash Functions

  • Transformation of the bits of the input msg produce the fixed length of output
  • Not a reversible, as you are throwing away information. Squeezing down the msg into small output is why its called message digest or hash
    • We can’t recreate a msg from a message digest
  • Any tampering w the msg changes the value of the msg digest so is very sensitive to change
  • Computationally infeasible to:
    • Given a hash, it should be hard to find the original message x. one-way property
    • Very hard to find y that is not the same as x, that produces the same hash as x
      • Collision is when 2 inputs produce the same hash.
    • Finding any pair of messages that has the same hash.

Using Hash Functions

  • We go back to Alice and Bob with MAC. Now we gonna use hash instead of cipher.
  1. Feed message to hash function and tap that h(MAC) on to the msg
  2. Is transmitted to bob
  3. Detaches the MAC, and feeds it to same hash function as Alice used, and now compare the computed hash and the received hash
  • Q; what if somebody modifies hash and the function
    • They could. So it is not safe. Hence it isn’t a solution

Possible Solution

  1. Take the msg and hash it
  2. Feed a key and hash to a cipher and encrypt it now the hash is encrypted, and we stick it to a msg
  3. Transmit it
  4. Detach the Encrypted hash and decrypt it using the key, to output a hash.
  5. Feed the same key to the hash function to produce a computed hash and compare it with a received hash.
  • Essentially, we brought encryption back. So what’s the advantage?
    • I’m only encrypting a small amt of data. My hash is only few numbers of bytes. less computation
    • Relatively expensive encryption of a small data, and cheaper hash functioning of a bigger message.

Avoiding Encryption - With the benefit of protecting us from attacker modifying hash and the function.

  1. Alice’s shared secret is stuck to the msg, and we compute the hash of all of this.
  2. This is fed to the hash function and is stuck to the msg
  3. Transmit it
  4. Detach the hash, and he adds his own shared secret, and feed it to the same hash function and compares the two
    • can’t attack, provided that attacker doesn’t have a copy of a shared secret
    • Issue is the need for shared secret – somehow should be exchanged in a secret way.
    • Hash Based Message Authentication (HMAC)

Standard Algorithms

  • Some of the standard hashing algorithms
  • Output of hash is of fixed size.
    image

  • As it goes from md5 ~~ others, there’s an Increase of output and block size more secure!!
  • Output size and level of security inversely proportional. So security is half the size of output size
  • RED – md5 and sha1 is not as secure as the theory would suggest.
  • We should be using the SHA -2 family
    • SHA 3 exists as a backup for SHA 2 failing at one point.

Merkle-Damgård Construction

  • image
  • Take input msg (multiple of 512n) so we have to pad it out. And store the actual length of the msg so that we can recover.
  • It is fed into the algorithm
  • Divide it into 512 bit chunks- B0, B1, … , Bn-1
  • For each, we feed to compression function, and we feed the output of processing the prev block together with the B0 B1 B2… So for the first block, we need an IV. Output of the function is a 160 bit value.
    • Compression function is like a cipher, and it takes 2 (IV, B0) as its input.
  • Keep doing it until we done, and the final thing is our HASH this has been affected by every single bit

Length extension Attacks \&HMAC

  • Problem with this is that its prone to length extension attack
  • If they know the length of the msg, attackers can extend it with their msg with the same length, and hash that which would look like the original
    • E.g. 2009 Flickr API vulnerability
  • How do we stop it?
    • image
    • HMAC = Nested hashing - We hash twice. We hash (the secret and the msg) and hash (THAT with another secret.
    • SIN , SOUT are derived from S by padding or hashing it to the size of a block and XORing with constants

The main problem – Collisions!

  • Collision is inevitable. It is when 2 diff input generates the same hash.
  • It’s a problem bc the attacker could supply a diff msg and the hash value would be the same. And we won’t be able to tell that the msg changed.
  • Our input could be anything any size. tremendous possible inputs we have, and are tryna squish that to a 20 bit output when the size of possible output is significantly less, collision is bide to happen.
  • Issue is COLLISION shouldn’t be easy to finndd!!!! so hash needs to get longer and longer

Current State of Play

  • MD5 was broke long ago, but can still be found.
    • E.g. we saw 2 files where 2 bites of the msg has changed, and when we do MD5, the hash is the SAME despite the fact that there was a difference!!!!! Its too vulnerable
  • Its too easy for SHA 1 – so was deprecated by NIST.
  • Now major browsers don’t allow you to use SHA1.
  • Should use SHA 2 fam )))
    • Finding collision in a 256 is way harder

SHA-1 Collision Research

  • The SHAppening (2015): Its been discovered that we can find SHA1 collision when you spend 100k.
  • Shatterd IO (2017): found a pracrical and realistic SHA1 collision for pdfs.
    • 2 pdf document – colour of the banner is different. And yet they have the same SHA 1 hash.

Summary

  • Discussed symmetric cipher modes in more detail
  • Explored how we can use message authentication codes (MACs) to ensure data integrity & authenticity
  • Seen that MACs can be derived from symmetric ciphers or hash functions
  • Noted that the MD5 and SHA-1 hashing algorithms can be brute-forced; use SHA-2 or SHA-3 instead

Leave a comment