Cryptographic Hash Functions
Debdeep Mukhopadhyay, IIT Kharagpur

Data Integrity
• Cryptographic hash function: provides assurance of data integrity.
• Let h be a hash function and x some data.
• The hash creates a fingerprint of the data, often referred to as the message digest.
• Typically, x is a large binary string.
• The digest is a fairly short binary string, say 160 bits.

Applications
• Say y = h(x), and y is stored in some secure place.
• If x is altered to, say, x’, and we assume that h(x) ≠ h(x’), then the alteration is readily caught by computing y’ = h(x’) and verifying that y ≠ y’.
• Used in digital signature schemes.
• Used for message authentication codes (MACs).

[Figure: Application: Data Integrity]
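The integrity check described in the Applications slide can be sketched as follows. This is a minimal illustration, assuming SHA-256 from Python's standard `hashlib` as the hash function h (so the digest is 256 bits rather than the 160 bits mentioned earlier); the message contents and the helper names `digest` and `is_unaltered` are made up for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return y = h(x) as a hex string, with SHA-256 standing in for h."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, stored_digest: str) -> bool:
    """Recompute the digest of the received data and compare it with the stored y."""
    return digest(data) == stored_digest


original = b"transfer 100 rupees to Bob"
stored = digest(original)            # y, kept in a secure place

tampered = b"transfer 900 rupees to Bob"
print(is_unaltered(original, stored))   # the unmodified message passes the check
print(is_unaltered(tampered, stored))   # the altered message is caught
```

The check only helps if y itself cannot be replaced by the party altering x, which is why the slide requires y to be stored securely.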

[Figure: Application: Digital Signatures]

A Keyed Hash Function
• Suppose we also use a key in the computation of the hash function.
• y = h_K(x), and the key K is kept secret.
– Alice and Bob share K.
– Alice computes y for x using K, and sends x and y to Bob.
– Bob receives x’ together with y and computes h_K(x’).
– If the hashes match, the message is unaltered.
– Note that here y is not required to be kept secret. Why?
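The Alice and Bob exchange can be sketched with a concrete keyed hash. The slide does not name one; HMAC-SHA-256 from Python's standard `hmac` module is used here purely as an assumption, and the key and messages are hypothetical.

```python
import hashlib
import hmac


def tag(key: bytes, message: bytes) -> bytes:
    """Compute y = h_K(x), with HMAC-SHA-256 standing in for the keyed hash."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Bob's check: recompute the tag on the received message and compare."""
    return hmac.compare_digest(tag(key, message), received_tag)


shared_key = b"hypothetical shared key K"   # known only to Alice and Bob
x = b"meet at noon"
y = tag(shared_key, x)                        # Alice sends x and y in the clear

print(verify(shared_key, x, y))                 # unaltered message
print(verify(shared_key, b"meet at nine", y))   # altered message
```

The tag y can travel in the clear because producing a matching tag for a different message requires the secret key K.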

What is a Cryptographic Hash Family?
• Note: X could be a finite or an infinite set, but Y is always finite.
• If |X| = N and |Y| = M, then $|\mathcal{F}^{X,Y}| = M^N$, the cardinality of the set of all functions from X to Y.
• Any hash family $\mathcal{F} \subseteq \mathcal{F}^{X,Y}$ is called an (N, M)-hash family.

Security of Hash Functions
• There are three important properties which a hash function must satisfy.
• The properties are required for the security of the applications.
– Preimage
– Second Preimage
– Collision
• We define them one by one.

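Preimage
• Instance: a hash function h: X → Y and an element y ∈ Y.
• Find: x ∈ X such that h(x) = y.
• If Preimage can be solved, then (x, y) is a valid pair, that is, h(x) = y.
• A hash function for which Preimage cannot be efficiently solved is said to be preimage resistant.

Second Preimage
• Instance: a hash function h: X → Y and an element x ∈ X.
• Find: x’ ∈ X such that x’ ≠ x and h(x’) = h(x).
• If this problem is solved, then the pair (x’, h(x)) is valid.
• If it cannot be solved efficiently, the hash function is said to be second preimage resistant.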

Collision
• Instance: a hash function h: X → Y.
• Find: x, x’ ∈ X such that x’ ≠ x and h(x’) = h(x).
• Note that if this is solved, then if (x, y) is a valid pair, so is (x’, y).
• If Collision cannot be solved efficiently, the hash function is called collision resistant.

The Random Oracle Model
• Captures the concept of an ideal hash function.
• If a hash function h is ideal, then the only way to learn the hash of a given value is by actually computing it, even if many previously computed values are known.

A Non-Ideal Hash Function
• Consider a hash function $h: \mathbb{Z}_n \times \mathbb{Z}_n \to \mathbb{Z}_n$ which is a linear function, say
– $h(x, y) = ax + by \bmod n$, where $a, b \in \mathbb{Z}_n$ and n ≥ 2 is an integer.
– Suppose $h(x_1, y_1) = ax_1 + by_1 \bmod n$ and $h(x_2, y_2) = ax_2 + by_2 \bmod n$. Then
$$h(rx_1 + sx_2 \bmod n,\; ry_1 + sy_2 \bmod n) = a(rx_1 + sx_2) + b(ry_1 + sy_2) \bmod n = r\,h(x_1, y_1) + s\,h(x_2, y_2) \bmod n.$$
• Thus we can obtain the hash of a value other than $(x_1, y_1)$ and $(x_2, y_2)$ without actually evaluating h on it: the new hash value is computed from the precomputed values.
• Note that we do not even need to know a and b.
• This is not how an ideal hash function behaves in the RO model.

What is an Oracle?
• It is not an algorithm,
• nor a formula.
• Imagine it as a giant book of random numbers: each page corresponds to a value x, and the number written on that page is h(x).
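The linearity argument for the non-ideal hash can be checked numerically. The sketch below picks arbitrary illustrative values for n, a, b, r and s (none of them come from the slides) and confirms that the hash of the combined input is predicted from two known hash values without using a or b.

```python
def make_linear_hash(a: int, b: int, n: int):
    """Return h(x, y) = a*x + b*y mod n; a and b play the role of hidden constants."""
    def h(x: int, y: int) -> int:
        return (a * x + b * y) % n
    return h


n = 101
h = make_linear_hash(a=37, b=58, n=n)   # the attacker never looks at a or b

# Two hash values the attacker has already observed.
x1, y1 = 5, 11
x2, y2 = 23, 42
h1, h2 = h(x1, y1), h(x2, y2)

# Predict the hash of a new input from h1 and h2 alone.
r, s = 3, 7
predicted = (r * h1 + s * h2) % n

# Evaluate h on the new input only to confirm the prediction.
actual = h((r * x1 + s * x2) % n, (r * y1 + s * y2) % n)
print(predicted == actual)
```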

An Independence Theorem
• Suppose $h \in \mathcal{F}^{X,Y}$ is chosen at random and let $X_0 \subseteq X$. Suppose the values h(x) have been determined (by querying an oracle for h) if and only if $x \in X_0$. Then $\Pr[h(x) = y] = 1/M$ for all $x \in X \setminus X_0$ and all $y \in Y$.
• Note that the above is a conditional probability.
• It states that knowledge of the previously computed values does not give any advantage in future computations of h(x).
• This assumption in the RO model will be used in the complexity proofs that follow.

Algorithms in the RO model
• These algorithms are applicable to all hash functions, since they do not depend on the details of the hashing method.
• These algorithms are randomized, in the sense that they make random choices.
• In particular they can fail, but if they succeed they are correct: Las Vegas algorithms.
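One common way to picture a random oracle, and the independence property it satisfies, is lazy sampling: the "book" is filled in one page at a time, and each new page gets a fresh uniform value regardless of what is already written. The class below is a minimal sketch of that idea; the name `RandomOracle` and the use of `secrets.randbelow` are choices made for this illustration.

```python
import secrets


class RandomOracle:
    """Lazily sampled random function from arbitrary hashable inputs to {0, ..., M-1}."""

    def __init__(self, M: int):
        self.M = M
        self.book = {}   # pages written so far: x -> h(x)

    def query(self, x):
        # A page that was never opened gets a fresh uniform value,
        # chosen independently of every value already in the book.
        if x not in self.book:
            self.book[x] = secrets.randbelow(self.M)
        # A page that was opened before always shows the same value.
        return self.book[x]


oracle = RandomOracle(M=2 ** 16)
first = oracle.query("some message")
second = oracle.query("some message")
print(first == second)   # repeated queries are consistent
```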

Algorithms in the RO model
• Worst-case success probability ε: for every problem instance, the randomized algorithm returns a correct answer with probability at least ε.
• Average-case success probability ε: the probability that the algorithm returns a correct answer, averaged over all problem instances, is at least ε.
• The average-case success probability is averaged over all possible random choices of $h \in \mathcal{F}^{X,Y}$, and all possible random choices of x ∈ X and/or y ∈ Y, if x and/or y are specified as part of the problem instance.

Algorithm Find-Preimage
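A generic preimage search in this style can be sketched as below: evaluate h on Q distinct, randomly chosen inputs and return one that hits the target, or report failure. This follows the usual textbook formulation and is not necessarily the exact listing from the slide. The toy hash (the first 16 bits of SHA-256) and all parameter values are assumptions for the demonstration. In the random oracle model such a search has average-case success probability 1 - (1 - 1/M)^Q.

```python
import hashlib
import random

M = 2 ** 16   # size of the toy digest space Y (assumption for the demo)


def toy_hash(x: int) -> int:
    """Hypothetical stand-in for h: the first 16 bits of SHA-256 of x."""
    digest = hashlib.sha256(x.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:2], "big")


def find_preimage(h, y, domain_size, Q, rng):
    """Evaluate h on Q distinct random inputs; return a preimage of y, or None on failure."""
    for x in rng.sample(range(domain_size), Q):
        if h(x) == y:
            return x
    return None


def success_probability(M: int, Q: int) -> float:
    """Average-case success probability of the search in the random oracle model."""
    return 1 - (1 - 1 / M) ** Q


rng = random.Random(2024)
target = toy_hash(777)
# With Q equal to the domain size, every input is tried, so the search must succeed.
found = find_preimage(toy_hash, target, domain_size=2 ** 16, Q=2 ** 16, rng=rng)
print(found is not None and toy_hash(found) == target)
```

The algorithm is Las Vegas in the sense used on the slides: it may return `None`, but any value it does return is a genuine preimage.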

Algorithm Find-Second-Preimage

Algorithm Find-Collision
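The two searches named above can be sketched in the same generic style. These follow the usual textbook formulations and may differ in detail from the slide listings; the toy hash and parameters are assumptions for the demonstration.

```python
import hashlib
import random

M = 2 ** 16   # size of the toy digest space Y (assumption for the demo)


def toy_hash(x: int) -> int:
    """Hypothetical stand-in for h: the first 16 bits of SHA-256 of x."""
    digest = hashlib.sha256(x.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:2], "big")


def find_second_preimage(h, x, domain_size, Q, rng):
    """Spend Q evaluations of h looking for x0 != x with h(x0) == h(x)."""
    y = h(x)                                            # one evaluation
    candidates = [c for c in rng.sample(range(domain_size), Q) if c != x]
    for x0 in candidates[:Q - 1]:                       # Q - 1 further evaluations
        if h(x0) == y:
            return x0
    return None


def find_collision(h, domain_size, Q, rng):
    """Hash Q distinct random inputs and return two that share a digest, or None."""
    seen = {}                                           # digest -> input that produced it
    for x in rng.sample(range(domain_size), Q):
        y = h(x)
        if y in seen:
            return seen[y], x
        seen[y] = x
    return None


rng = random.Random(7)
second = find_second_preimage(toy_hash, 777, domain_size=2 ** 16, Q=2 ** 16, rng=rng)
# Hashing M + 1 distinct inputs forces a collision by the pigeonhole principle.
collision = find_collision(toy_hash, domain_size=2 ** 17, Q=M + 1, rng=rng)
print(collision is not None)
```

The collision search compares every pair of hashed values at once through the dictionary, which is why it needs far fewer evaluations than the two targeted searches.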

Relating Q and ε
• So, if we hash a little over sqrt(M) values, we have a 50% chance of finding a collision.
• Thus our algorithm is a (1/2, O(sqrt(M))) algorithm.

Comparison of Security Criteria
• Solving Collision is easier than solving Preimage or 2nd Preimage.
• Can we reduce one problem to the other?
• We shall study two reductions:
– Collision to 2nd Preimage
– Collision to Preimage
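The relation between Q and ε quoted under "Relating Q and ε" can be derived as follows. This is the standard birthday-bound calculation, assuming h behaves as a random oracle, Q distinct inputs are hashed, and the approximation $1 - x \approx e^{-x}$ is used for small x.

```latex
\begin{align*}
\varepsilon
  &= 1 - \prod_{i=1}^{Q-1}\left(1 - \frac{i}{M}\right)
   \approx 1 - \prod_{i=1}^{Q-1} e^{-i/M}
   = 1 - e^{-Q(Q-1)/(2M)} \\[4pt]
\Rightarrow\quad Q^2 - Q
  &\approx 2M \ln\frac{1}{1-\varepsilon} \\[4pt]
\Rightarrow\quad Q
  &\approx \sqrt{2M \ln\frac{1}{1-\varepsilon}}
   \qquad\text{(ignoring the lower-order term } Q\text{)} \\[4pt]
\varepsilon = \tfrac{1}{2}:\quad Q
  &\approx \sqrt{2\ln 2}\,\sqrt{M} \approx 1.17\sqrt{M}
\end{align*}
```

For a 160-bit digest, $M = 2^{160}$, so roughly $2^{80}$ hash evaluations give an even chance of a collision.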

Proof Method
• Assume that Preimage can be solved using a randomized algorithm.
• Show that Collision can then be solved.
• Collision hardness << Preimage hardness
• Resistance against Collision => Preimage resistance

The first reduction
• Oracle-2nd-Preimage is an (ε, Q) algorithm.
• Since it is a Las Vegas algorithm, if it gives an answer it will be correct. Thus x ≠ x’ and h(x) = h(x’), so a collision is also found.
• Thus Collision-to-Second-Preimage is also an (ε, Q) Las Vegas algorithm.
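The first reduction can be sketched as a thin wrapper around a second-preimage oracle: pick x at random, ask the oracle for a second preimage, and output the pair. The toy hash, the small domain, and the exhaustive-search oracle below are assumptions made so the example runs; the reduction itself only relies on the oracle's interface.

```python
import random

DOMAIN = range(64)   # toy message space X (assumption for the demo)


def h(x: int) -> int:
    """Toy hash into a 17-element digest space, used only for illustration."""
    return (x * x + 1) % 17


def exhaustive_second_preimage(h, x):
    """Stand-in for Oracle-2nd-Preimage: return x' != x with h(x') == h(x), or None."""
    for candidate in DOMAIN:
        if candidate != x and h(candidate) == h(x):
            return candidate
    return None


def collision_to_second_preimage(h, oracle, rng):
    """Turn a second-preimage oracle into a collision finder."""
    x = rng.choice(DOMAIN)          # choose x uniformly at random
    x_prime = oracle(h, x)
    if x_prime is None:
        return None                 # the oracle failed, so the reduction fails too
    return x, x_prime               # x != x' and h(x) == h(x'): a collision


pair = collision_to_second_preimage(h, exhaustive_second_preimage, random.Random(1))
print(pair is not None and h(pair[0]) == h(pair[1]))
```

The wrapper makes exactly one oracle call and adds no hash evaluations of its own, which is why it inherits the oracle's (ε, Q) parameters.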

The second reduction
• Assume that Oracle-Preimage is a (1, Q) Las Vegas algorithm.
• We make a weak assumption on the sizes of X and Y: |X| ≥ 2|Y|.

Reduction
• Proof discussed in class.
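The second reduction can be sketched in the same way: pick x at random, hash it, ask the preimage oracle to invert the digest, and hope it returns a different input. The sketch below follows the usual textbook form of this reduction; the toy hash and the oracle that always returns the smallest preimage are assumptions. With X finite and |X| ≥ 2|Y|, the standard analysis gives an average-case success probability of at least 1/2 using Q + 1 hash evaluations.

```python
import random

DOMAIN = range(64)   # |X| = 64 (assumption for the demo)


def h(x: int) -> int:
    """Toy hash with |Y| = 17, so |X| >= 2|Y| holds."""
    return (x * x + 1) % 17


def oracle_preimage(h, y):
    """Stand-in for a (1, Q) Oracle-Preimage: always returns the smallest preimage of y."""
    for candidate in DOMAIN:
        if h(candidate) == y:
            return candidate
    return None


def attempt(h, x, oracle):
    """Core of the reduction for a fixed x: succeed iff the oracle returns x' != x."""
    x_prime = oracle(h, h(x))
    if x_prime is not None and x_prime != x:
        return x, x_prime
    return None


def collision_to_preimage(h, oracle, rng):
    """Collision finder built from a preimage oracle, with x chosen uniformly at random."""
    return attempt(h, rng.choice(DOMAIN), oracle)


# Exact success rate over all possible choices of x in the toy domain.
successes = sum(attempt(h, x, oracle_preimage) is not None for x in DOMAIN)
success_rate = successes / len(DOMAIN)
print(success_rate >= 0.5)
```

The attempt fails only when x happens to be the one preimage the oracle returns for its digest. There is at most one such x per digest value, hence at most |Y| failures out of |X| choices.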

Construction of Iterated Hash Functions
• Extending a compression function to a hash function with an infinite domain.
• A hash function created in this fashion is called an iterated hash function.
• Consider hash functions whose inputs and outputs are bit strings.
• |x|: length of a bit string x
• x || y: concatenation of strings x and y

Outline of the construction
• Given compress: $\{0,1\}^{m+t} \to \{0,1\}^m$, t ≥ 1
• Preprocessing:
– given an input string x, where |x| ≥ m+t+1,
– construct an output string y such that |y| ≡ 0 (mod t),
– $y = y_1 \,\|\, y_2 \,\|\, y_3 \,\|\, \dots \,\|\, y_r$, where $|y_i| = t$ for 1 ≤ i ≤ r.

Processing
• $z_0 = \mathrm{IV}$ (a public value called the initialization vector, with |IV| = m)
• $z_1 = \mathrm{compress}(z_0 \,\|\, y_1)$
• $z_2 = \mathrm{compress}(z_1 \,\|\, y_2)$
• …
• $z_r = \mathrm{compress}(z_{r-1} \,\|\, y_r)$

Optional Output Transformation
• $g: \{0,1\}^m \to \{0,1\}^l$ is a public function.
• Define $h(x) = g(z_r)$.
• Sometimes the transformation is omitted and $h(x) = z_r$.
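The processing chain z_0, z_1, …, z_r can be sketched as a simple loop. For readability the sketch works on bytes rather than bits, with m = 64 bits and t = 32 bits, and uses a truncated SHA-256 as a hypothetical compress function; all of these are assumptions for the illustration. The input y is assumed to be already preprocessed to a multiple of t, and no output transformation is applied, so h(x) = z_r.

```python
import hashlib

M_BYTES = 8   # m = 64 bits (assumption)
T_BYTES = 4   # t = 32 bits (assumption)


def compress(block: bytes) -> bytes:
    """Hypothetical compress: {0,1}^(m+t) -> {0,1}^m, built from truncated SHA-256."""
    assert len(block) == M_BYTES + T_BYTES
    return hashlib.sha256(block).digest()[:M_BYTES]


def iterate(y: bytes, iv: bytes = bytes(M_BYTES)) -> bytes:
    """Compute z_r from preprocessed input y = y_1 || ... || y_r and a public IV."""
    assert len(y) % T_BYTES == 0
    z = iv                                      # z_0 = IV
    for i in range(0, len(y), T_BYTES):
        z = compress(z + y[i:i + T_BYTES])      # z_i = compress(z_{i-1} || y_i)
    return z


print(iterate(b"abcdefgh").hex())
```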

A typical preprocessing
• y = x || pad(x)
– pad(x) is a padding function
– it generally contains the value of |x|, padded to the left with additional zeros (so that the total length |y| is a multiple of t)
• Note that the preprocessing step has to be injective.
– |y| = rt ≥ |x|

Merkle-Damgård Construction
• Uses compress: $\{0,1\}^{m+t} \to \{0,1\}^m$, which is collision resistant, to construct a collision resistant hash function $h: \{0,1\}^* \to \{0,1\}^m$.
– The construction yields a proof for this result.
• Typically, we take |x| ≥ m+t+1 (perhaps because we wish to keep the message length more than double that of the hash value).

The Preprocessing
• $x = x_1 \,\|\, x_2 \,\|\, \dots \,\|\, x_k$,
– where $|x_1| = |x_2| = \dots = |x_{k-1}| = t-1$ and $|x_k| = t-1-d$, where 0 ≤ d ≤ t-2.
– Thus, with n = |x|, $k = \left\lceil \dfrac{n}{t-1} \right\rceil = \dfrac{n+d}{t-1}$.

The Algorithm
• Appending the extra block $y_{k+1}$, which encodes d, is known as MD strengthening.
• Note that $y_{k+1}$ is also padded to the left with zeros so that $|y_{k+1}| = t-1$.
• MD strengthening helps to make the preprocessing step injective.
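The preprocessing and MD strengthening can be put together in a short sketch. It follows the usual textbook form of the Merkle-Damgård algorithm for t ≥ 2, which may differ in detail from the slide listing: the first call to compress is prefixed with m+1 zeros, and later calls insert a single 1 bit between the chaining value and the next block. Bit strings are represented as Python strings of '0' and '1'; the values m = 16, t = 9 and the SHA-256-based compress are assumptions for the illustration.

```python
import hashlib
from math import ceil

M, T = 16, 9   # m and t in bits (assumptions); message blocks have t - 1 = 8 bits


def compress(bits: str) -> str:
    """Hypothetical compress: {0,1}^(m+t) -> {0,1}^m, from the top m bits of SHA-256."""
    assert len(bits) == M + T
    value = int.from_bytes(hashlib.sha256(bits.encode("ascii")).digest(), "big")
    return format(value >> (256 - M), f"0{M}b")


def merkle_damgard(x: str) -> str:
    """Hash a bit string x with |x| >= m + t + 1 using MD strengthening."""
    n = len(x)
    assert n >= M + T + 1 and set(x) <= {"0", "1"}
    k = ceil(n / (T - 1))
    d = k * (T - 1) - n                      # number of zero bits padded onto x_k
    padded = x + "0" * d
    y = [padded[i:i + T - 1] for i in range(0, len(padded), T - 1)]
    y.append(format(d, f"0{T - 1}b"))        # y_{k+1}: d in binary, left-padded with zeros
    g = compress("0" * (M + 1) + y[0])       # first block is prefixed with m + 1 zeros
    for block in y[1:]:
        g = compress(g + "1" + block)        # later blocks are separated by a 1 bit
    return g


message = "1" * 30                           # n = 30, so k = 4 and d = 2
print(merkle_damgard(message))
```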

[Figure: A picture is better than a thousand words]

The Proof
• compress is collision resistant => the hash is collision resistant
• Equivalently: not(hash collision resistant) => not(compress collision resistant)
• If you can find a collision in the hash function efficiently, then you can find a collision in the compression function efficiently.

When t=1
• Here the encoding f is done in a special way.
– f(0) = 0, f(1) = 01
• The encoding is injective.
• There do not exist two strings x ≠ x’ and a string z such that y(x) = z || y(x’); that is, no encoding is a postfix of another encoding.

Theorems
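The two properties claimed for the t = 1 encoding can be checked by brute force on short strings. The slide only gives f; the sketch assumes the usual textbook form of the full encoding, y(x) = 11 || f(x_1) || … || f(x_n), where the leading 11 is that assumption.

```python
from itertools import product

F = {"0": "0", "1": "01"}   # f(0) = 0, f(1) = 01


def encode(x: str) -> str:
    """y(x) = 11 || f(x_1) || ... || f(x_n); the '11' prefix is an assumed convention."""
    return "11" + "".join(F[bit] for bit in x)


# All bit strings of length 1 to 6.
strings = ["".join(bits) for n in range(1, 7) for bits in product("01", repeat=n)]
encodings = {x: encode(x) for x in strings}

# Injective: distinct inputs give distinct encodings.
injective = len(set(encodings.values())) == len(strings)

# Postfix-free: no encoding is a suffix of the encoding of a different string.
postfix_free = all(
    not encodings[a].endswith(encodings[b])
    for a in strings
    for b in strings
    if a != b
)
print(injective, postfix_free)
```

The reason it works is visible in the encoding: every 1 produced by f is preceded by a 0, so the pattern 11 can only occur at the very start of y(x).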

Attacks: Is an Iterated Hash Ideal?
• We shall discuss some attacks against schemes that use Merkle-Damgård based hashing.
• The pitfall lies in treating the hash as a black box.
• We know that a double data type represents a real number, but there is a finite precision involved.
– The conclusion is that we have to know the limits well.

Attacks: Is an Iterated Hash Ideal?
• In our analysis of hash functions (to aid the proofs) we have assumed that the hash function is ideal.
– One important requirement was that the only way to learn the hash of a value is by actually computing it!
– This is violated in the Merkle-Damgård construction.
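One well-known way an iterated hash violates the "only by computing it" requirement is length extension: the digest is the final chaining value, so anyone who knows h(y) can continue the iteration and obtain h(y || y’) without knowing y. The sketch below shows this on a deliberately simplified iterated hash with no padding and no output transformation. The byte-oriented parameters and the SHA-256-based compress are assumptions, and real constructions with length padding require the attacker to account for the padding block as well.

```python
import hashlib

M_BYTES = 8   # m = 64 bits (assumption)
T_BYTES = 4   # t = 32 bits (assumption)


def compress(block: bytes) -> bytes:
    """Hypothetical compress function built from truncated SHA-256."""
    assert len(block) == M_BYTES + T_BYTES
    return hashlib.sha256(block).digest()[:M_BYTES]


def iterated_hash(y: bytes, iv: bytes = bytes(M_BYTES)) -> bytes:
    """Plain iterated hash: chain compress over t-byte blocks, output the last state."""
    assert len(y) % T_BYTES == 0
    z = iv
    for i in range(0, len(y), T_BYTES):
        z = compress(z + y[i:i + T_BYTES])
    return z


secret_input = b"key!data"                    # never seen by the attacker
known_digest = iterated_hash(secret_input)    # the only thing the attacker learns

# The attacker resumes the chain from the known digest with blocks of their choice.
extension = b"evil"
forged = iterated_hash(extension, iv=known_digest)

# The forged value equals the true hash of the extended input.
print(forged == iterated_hash(secret_input + extension))
```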
