
Cryptographic Hash Functions

Debdeep Mukhopadhyay IIT Kharagpur

Data Integrity

  • Cryptographic Hash Function: provides assurance of data integrity.
  • Let h be a hash function and x some data.
  • The hash creates a fingerprint of the data, often referred to as the message digest.
  • Typically, x is a large binary string.
  • The digest is a fairly short binary string, say 160 bits.


Applications

  • Say y=h(x), and y is stored in some secure place.
  • If x is altered to, say, x’, and we assume that h(x)≠h(x’), then the alteration of the message is readily caught by computing y’=h(x’) and checking that y≠y’.
  • Used in digital signature schemes
  • Used for message authentication codes (MAC)

Application: Data Integrity


Application: Digital Signatures

A Keyed Hash Function

  • Suppose we also have a key in the computation of the hash function.
  • $y = h_K(x)$, and the key K is kept secret.
    – Alice and Bob share K.
    – Alice computes y for x using K, and sends x and y to Bob.
    – Bob receives x’ and computes the hash value $h_K(x’)$.
    – If the hashes match, the message is unaltered.
    – Note that here y is not required to be kept secret. Why?
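The exchange between Alice and Bob can be sketched as follows. This is only an illustration: the slides do not name a concrete keyed hash, so HMAC-SHA256 from the Python standard library is assumed here as a stand-in for $h_K$, and the key and messages are made up.

```python
import hashlib
import hmac


def keyed_hash(key: bytes, message: bytes) -> bytes:
    """Stand-in for h_K(x); HMAC-SHA256 is an assumption, not the slides' choice."""
    return hmac.new(key, message, hashlib.sha256).digest()


def bob_accepts(key: bytes, received_message: bytes, received_tag: bytes) -> bool:
    """Bob recomputes the keyed hash and compares it with the tag he received."""
    return hmac.compare_digest(keyed_hash(key, received_message), received_tag)


shared_key = b"key shared by Alice and Bob"   # hypothetical key
x = b"pay 10 rupees to Carol"                 # hypothetical message
y = keyed_hash(shared_key, x)                 # Alice sends (x, y) in the clear

print(bob_accepts(shared_key, x, y))                          # unaltered message
print(bob_accepts(shared_key, b"pay 99 rupees to Carol", y))  # altered message
```

The tag y can travel in the clear because recomputing it for a different message requires K.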


What is a Cryptographic Hash Family?

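  • A hash family is any set of functions $\mathcal{F} \subseteq \mathcal{F}^{X,Y}$, where $\mathcal{F}^{X,Y}$ is the set of all functions from X to Y.
  • Note: X could be a finite or infinite set, but Y is always finite.
  • If |X|=N and |Y|=M, then there are $M^N$ functions in $\mathcal{F}^{X,Y}$ (the cardinality of the set of all functions from X to Y).
  • Any such hash family is called an (N,M) hash family.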

Security of Hash Functions

  • There are three important properties which a hash function must satisfy.
  • The properties are required for the security of the applications.
    – Preimage
    – Second Preimage
    – Collision
  • We define them one by one.

Preimage

  • Problem: given h: X → Y and y ∈ Y, find x ∈ X such that h(x)=y.
  • If Preimage can be solved, then (x,y) is a valid pair.
  • A hash function for which Preimage cannot be efficiently solved is said to be preimage resistant.

Second Preimage

  • Problem: given h: X → Y and x ∈ X, find x’ ∈ X such that x’≠x and h(x’)=h(x).
  • If this problem is solved, then the pair (x’,h(x)) is valid.
  • If it cannot be done efficiently, then the hash is second preimage resistant.


Collision

  • Problem: given h: X → Y, find x, x’ ∈ X such that x’≠x and h(x’)=h(x).
  • Note that if this is solved and (x,y) is a valid pair, then so is (x’,y).
  • If the problem is not efficiently solvable, the hash function is called collision resistant.

The Random Oracle Model

  • Captures the concept of an ideal hash function.
  • If a hash function h is ideal, then the only way to learn the hash of a given value is by actually computing it, i.e., even if many previously computed values are known.


A Non-Ideal Hash Function

  • Consider a hash function $h: Z_n \times Z_n \to Z_n$ which is a linear function, say
    – h(x,y) = ax + by mod n, where a, b ∈ $Z_n$ and n≥2 is a positive integer.
    – Suppose $h(x_1,y_1) = ax_1 + by_1 \bmod n$ and $h(x_2,y_2) = ax_2 + by_2 \bmod n$. Then
      $h(rx_1 + sx_2 \bmod n,\; ry_1 + sy_2 \bmod n) = r\,h(x_1,y_1) + s\,h(x_2,y_2) \bmod n$.
  • Thus we can obtain the hash of a value other than $(x_1,y_1)$ and $(x_2,y_2)$ without actually evaluating h on it: the new hash value is derived from previously computed values.
  • Note that we do not even require knowledge of a and b.
  • This is not how an ideal hash function behaves in the RO model.
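The linearity argument can be checked numerically. The sketch below uses arbitrary illustrative values for n, a, b, r and s (none of them come from the slides). The predictor only sees the two previously computed hash values, never a or b.

```python
def make_linear_hash(a: int, b: int, n: int):
    """Return h(x, y) = a*x + b*y mod n; a and b play the role of hidden parameters."""
    def h(x: int, y: int) -> int:
        return (a * x + b * y) % n
    return h


def predict_hash(h1: int, h2: int, r: int, s: int, n: int) -> int:
    """Predict h(r*x1 + s*x2, r*y1 + s*y2) from h1 = h(x1, y1) and h2 = h(x2, y2) only."""
    return (r * h1 + s * h2) % n


n = 101                                  # illustrative modulus
h = make_linear_hash(a=37, b=58, n=n)    # illustrative secret coefficients
(x1, y1), (x2, y2) = (5, 9), (12, 40)
h1, h2 = h(x1, y1), h(x2, y2)            # previously computed values

r, s = 3, 7
predicted = predict_hash(h1, h2, r, s, n)
actual = h((r * x1 + s * x2) % n, (r * y1 + s * y2) % n)
print(predicted == actual)
```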

What is an Oracle?

  • It is not an algorithm,
  • nor a formula.
  • Imagine it as a giant book of random numbers: each page corresponds to a value x, and the number written on that page is h(x).


An Independence Theorem

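  • Theorem: suppose h is chosen at random from $\mathcal{F}^{X,Y}$ and the values h(x) have been determined only for x in a subset $X_0 \subseteq X$. Then $\Pr[h(x)=y] = 1/M$ for all $x \in X \setminus X_0$ and all y ∈ Y.
  • Note that the above is a conditional probability.
  • It states that knowledge of previously computed values does not give any advantage in future computations of h(x).
  • This assumption in the RO model will be used in the complexity proofs that follow.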

Algorithms in the RO model

  • These algorithms are applicable to all hash functions, since they do not depend on the details of the hashing method.
  • These algorithms are randomized, in the sense that they make random choices.
  • In particular they can fail, but if they succeed they are correct: they are Las Vegas algorithms.


Algorithms in the RO model

  • Worst-case success probability ε: for every problem instance, the randomized algorithm returns a correct answer with probability at least ε.
  • Average-case success probability ε: the probability that the algorithm returns a correct answer, averaged over all problem instances, is at least ε.
  • The average success probability is taken over all possible random choices of $h \in \mathcal{F}^{X,Y}$, and over all possible random choices of x ∈ X and/or y ∈ Y, if x and/or y are specified as part of the problem instance.

Algorithm Find-Preimage


Algorithm Find-Second Preimage

Algorithm FindCollision


Relating Q and ε

  • So, if we hash a little over sqrt(M) values, we have a 50% chance of finding a collision.
  • Thus our algorithm is a (1/2, O(sqrt(M))) algorithm.
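The sqrt(M) behaviour can be observed on a deliberately small digest. The sketch below is an illustration under assumptions: it truncates SHA-256 to 24 bits so that M = 2^24, and it hashes a simple counter instead of randomly chosen inputs, which differs from the randomized FindCollision of the slides but exercises the same birthday effect.

```python
import hashlib
import math


def toy_hash(data: bytes, bits: int = 24) -> int:
    """SHA-256 truncated to `bits` bits, so that M = 2**bits is small (assumed toy hash)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int = 24):
    """Hash successive counter values until two of them share a digest."""
    seen = {}                      # digest -> input that produced it
    counter = 0
    while True:
        x = counter.to_bytes(8, "big")
        y = toy_hash(x, bits)
        if y in seen:
            return seen[y], x, counter + 1
        seen[y] = x
        counter += 1


x_a, x_b, queries = find_collision(24)
print("queries used:", queries)
print("1.17 * sqrt(M):", round(1.17 * math.sqrt(2 ** 24)))
```

The number of queries varies with the inputs tried, but it is typically of the same order as 1.17·sqrt(M), far below M itself.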

Comparison of Security Criteria

  • Solving Collision is easier than solving Preimage or 2nd Preimage.
  • Can we reduce one problem to the other?
  • We shall study two reductions:
    – Collision to 2nd Preimage
    – Collision to Preimage


Proof Method

  • Assume that Preimage can be solved using a randomized algorithm.
  • Show that Collision can then be solved.
  • CollisionHardness ≤ PreimageHardness
  • Resistance against Collision => Preimage Resistance

The first reduction

  • Oracle-2nd-Preimage is an (ε,q) algorithm.
  • Since it is a Las Vegas algorithm, if it gives an answer it will be correct: x≠x’ and h(x)=h(x’). Thus a collision is also found.
  • Thus Collision-to-second-preimage is also an (ε,q) Las Vegas algorithm.
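A minimal sketch of this reduction follows. Everything concrete in it is an assumption for illustration: `toy_h` is an arbitrary small function, and `oracle_second_preimage` is a brute-force stand-in for the hypothetical Oracle-2nd-Preimage, which in the slides is an abstract (ε,q) algorithm.

```python
import random

DOMAIN = list(range(64))          # toy domain X


def toy_h(x: int) -> int:
    """Arbitrary small hash used only for illustration."""
    return (x * x + 3) % 17


def oracle_second_preimage(h, x, domain):
    """Stand-in oracle: return some x' != x with h(x') == h(x), or None on failure."""
    for candidate in domain:
        if candidate != x and h(candidate) == h(x):
            return candidate
    return None


def collision_to_second_preimage(h, domain, oracle, rng):
    """Pick a random x, ask the oracle for a second preimage, output the collision."""
    x = rng.choice(domain)
    x_prime = oracle(h, x, domain)
    if x_prime is None:
        return None               # Las Vegas: may fail, never answers wrongly
    return x, x_prime


result = collision_to_second_preimage(toy_h, DOMAIN, oracle_second_preimage, random.Random(0))
print(result)
```

The reduction makes exactly one oracle call and adds no failure of its own, which is why the success probability and query count carry over unchanged.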


The second reduction

  • Assume that Oracle-Preimage is a (1,Q) Las Vegas algorithm.
  • We will make a weak assumption on the sizes of X and Y: |X|≥2|Y|.

Reduction

  • Proof discussed in class.
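The reduction itself is not reproduced in this transcript, so the following is a sketch of the usual construction under stated assumptions: pick a random x, hand h(x) to the preimage oracle, and output a collision if the oracle returns a different preimage. `toy_h`, the domain and the brute-force `oracle_preimage` are hypothetical stand-ins; the toy sizes satisfy |X| ≥ 2|Y| (64 ≥ 2·17).

```python
import random

DOMAIN = list(range(64))          # toy domain X, with Y = Z_17


def toy_h(x: int) -> int:
    """Arbitrary small hash used only for illustration."""
    return (x * x + 3) % 17


def oracle_preimage(h, y, domain):
    """Stand-in for a (1,Q) oracle: always returns some preimage of y if one exists."""
    for candidate in domain:
        if h(candidate) == y:
            return candidate
    return None


def collision_to_preimage(h, domain, oracle, rng):
    """Random x, then a preimage of h(x); succeed when the oracle returns x' != x."""
    x = rng.choice(domain)
    x_prime = oracle(h, h(x), domain)
    if x_prime is not None and x_prime != x:
        return x, x_prime
    return None


rng = random.Random(0)
trials = [collision_to_preimage(toy_h, DOMAIN, oracle_preimage, rng) for _ in range(200)]
successes = [t for t in trials if t is not None]
print(len(successes), "collisions found in 200 trials")
```

The attempt fails only when the oracle happens to return the very x that was chosen; the size assumption on X and Y is what keeps that event from being too likely.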

Construction of Iterated Hash Functions

  • Extending a compression function to a hash function with an infinite domain.
  • A hash function created in this fashion is called an iterated hash function.
  • Consider hash functions whose inputs and outputs are bit strings.
  • |x|: length of a bit string x
  • x||y: concatenation of strings x and y

Outline of the construction

  • Given compress: $\{0,1\}^{m+t} \to \{0,1\}^m$, t≥1
  • Preprocessing:
    – given an input string x, where |x|≥m+t+1
    – construct a string y such that |y|≡0 (mod t)
    – $y = y_1 \| y_2 \| y_3 \| \cdots \| y_r$, where $|y_i| = t$ for 1≤i≤r


Optional Output Transformation

  • g: $\{0,1\}^m \to \{0,1\}^l$
  • Define $h(x) = g(z_r)$, where g is a public function.
  • Sometimes, $h(x) = z_r$.

Processing

  • $z_0$ = IV (a public value, called the Initialization Vector, |IV|=m)
    $z_1 = \mathrm{compress}(z_0 \| y_1)$
    $z_2 = \mathrm{compress}(z_1 \| y_2)$
    …
    $z_r = \mathrm{compress}(z_{r-1} \| y_r)$


A typical preprocessing

  • y=x||pad(x)
    – pad(x) is a padding function
    – it generally contains the value of |x|, padded to the left with additional zeros so that the total length |y| is a multiple of t
  • Note that the preprocessing step has to be injective; otherwise two different inputs mapped to the same y would collide trivially.
    – |y|=rt ≥ |x|
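Preprocessing and processing together can be sketched as below. The concrete choices are assumptions made for the example: m = t = 128 bits, byte-aligned inputs, an 8-byte length field, an all-zero IV, no output transformation, and a compression function faked by truncating SHA-256.

```python
import hashlib

M_BYTES = 16                      # m = 128 bits (assumed)
T_BYTES = 16                      # t = 128 bits (assumed)
IV = bytes(M_BYTES)               # public initialization vector (assumed all zero)


def compress(block: bytes) -> bytes:
    """Toy stand-in for compress: {0,1}^(m+t) -> {0,1}^m."""
    assert len(block) == M_BYTES + T_BYTES
    return hashlib.sha256(block).digest()[:M_BYTES]


def pad(x: bytes) -> bytes:
    """|x| as an 8-byte value, preceded by enough zero bytes that |x || pad(x)| is a multiple of t."""
    length_field = len(x).to_bytes(8, "big")
    zeros = (-(len(x) + len(length_field))) % T_BYTES
    return bytes(zeros) + length_field


def iterated_hash(x: bytes) -> bytes:
    y = x + pad(x)                               # preprocessing
    z = IV
    for i in range(0, len(y), T_BYTES):          # z_i = compress(z_{i-1} || y_i)
        z = compress(z + y[i:i + T_BYTES])
    return z                                     # h(x) = z_r


print(iterated_hash(b"message digest").hex())
```

Because the length field records where x ends, the inputs b"abc" and b"abc\x00" produce different padded strings, which is the injectivity requirement.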

Merkle Damgård Construction

  • Uses compress: $\{0,1\}^{m+t} \to \{0,1\}^m$, which is collision resistant, to construct a collision-resistant hash function h: $\{0,1\}^* \to \{0,1\}^m$.
    – The construction yields a proof of this result.
  • Typically, we take |x|≥m+t+1 (perhaps because we wish to keep the message length more than double that of the hash value).


The Preprocessing

  • $x = x_1 \| x_2 \| \cdots \| x_k$, with n=|x|,
    – where $|x_1| = |x_2| = \cdots = |x_{k-1}| = t-1$ and $|x_k| = t-1-d$, where 0≤d≤t-2
    – Thus,

$$k = \left\lceil \frac{n}{t-1} \right\rceil = \frac{n+d}{t-1}$$

The Algorithm

  • The step that appends the extra block $y_{k+1}$, which encodes d, is known as MD strengthening.
  • Note that $y_{k+1}$ is also padded to the left with zeros so that $|y_{k+1}| = t-1$.
  • MD strengthening helps to make the preprocessing step injective.
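The algorithm slide itself is missing from this transcript, so the sketch below follows the usual textbook presentation of the construction (as in Stinson, which the slides cite later) and should be read as an assumption: the first chaining input is 0^(m+1) || y1, later inputs are g || 1 || y, and the last block carries d. The sizes m = 16, t = 8 and the SHA-256-based `compress` are toy choices.

```python
import hashlib

M, T = 16, 8                      # toy sizes in bits (assumed); the construction needs t >= 2


def compress(bits: str) -> str:
    """Toy stand-in for compress: {0,1}^(m+t) -> {0,1}^m, on strings of '0'/'1'."""
    assert len(bits) == M + T
    digest = hashlib.sha256(bits.encode("ascii")).digest()
    return format(int.from_bytes(digest, "big") >> (256 - M), f"0{M}b")


def merkle_damgard(x: str) -> str:
    n = len(x)
    assert n >= M + T + 1
    k = -(-n // (T - 1))                          # k = ceil(n / (t-1))
    d = k * (T - 1) - n                           # number of zero bits added to x_k
    blocks = [x[i:i + T - 1] for i in range(0, n, T - 1)]
    blocks[-1] += "0" * d                         # y_k = x_k || 0^d
    blocks.append(format(d, f"0{T - 1}b"))        # y_{k+1}: d in binary (MD strengthening)
    g = compress("0" * (M + 1) + blocks[0])       # g_1 = compress(0^(m+1) || y_1)
    for y in blocks[1:]:
        g = compress(g + "1" + y)                 # g_{i+1} = compress(g_i || 1 || y_{i+1})
    return g


message = "1011" * 8                              # 32 bits, n >= m + t + 1
print(merkle_damgard(message))
```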


A Picture Is Better than a Thousand Words

The Proof

  • CompressCollision-res => HashCollision-res
  • not(HashCollision-res) => not(CompressCollision-res)
  • If you can find a collision in the hash function efficiently, then you can find a collision in the compression function efficiently.


When t=1

  • Here the encoding f is done in a special way.
    – f(0)=0, f(1)=01
  • The encoding is injective.
  • There do not exist two strings x≠x’ and a string z such that y(x)=z||y(x’); that is, no encoding is a postfix (suffix) of another encoding.
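Both properties of the encoding can be checked exhaustively for short inputs. One assumption is needed: the slide does not show how y(x) is assembled from f, so the sketch uses the textbook form y(x) = 11 || f(x1) || … || f(xn); the leading "11" is not taken from the slides.

```python
from itertools import product

F = {"0": "0", "1": "01"}         # f(0) = 0, f(1) = 01


def encode(x: str) -> str:
    """y(x) = 11 || f(x_1) || ... || f(x_n); the '11' prefix is an assumed convention."""
    return "11" + "".join(F[bit] for bit in x)


def all_strings(max_len: int):
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def check_encoding(max_len: int = 8):
    codes = {x: encode(x) for x in all_strings(max_len)}
    code_set = set(codes.values())
    injective = len(code_set) == len(codes)
    # No proper suffix of an encoding may itself be an encoding.
    suffix_free = all(
        code[i:] not in code_set
        for code in code_set
        for i in range(1, len(code))
    )
    return injective, suffix_free


print(check_encoding(8))
```

The check works because f never produces two consecutive 1s, so the marker "11" can only appear at the very start of an encoding.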

Theorems


Attacks: Is an Iterated Hash Ideal?

  • We shall discuss some attacks against schemes that use Merkle-Damgård based hashing.
  • The pitfall lies in abstracting the hash as a black box.
  • We know that a double data type represents a real number, but only to a limited precision.
    – The conclusion is that we have to know the limits well.

Attacks: Is an Iterated Hash Ideal?

  • In our design of hash functions (to aid the proofs) we have assumed that the hash function is ideal.
    – One important requirement was that the only way to learn the hash of a value is by actually computing it!
    – This is violated in the Merkle-Damgård construction.


Commitment Scheme

  • Consider an auction, where the parties submit their bids.
  • The lowest bid gets the deal.
  • If we wish to develop a digital method without a central trusted authority, we can try to use a one-way function.
  • The function binds the bidder to a value, the commitment, which leaks no information about the original bid.

Commitment Scheme

  • Hash functions are engaged for this.
  • Security argument:
    – If f is one-way, recovering x from f(x) is hard.
    – If f is collision-resistant, the commitment can only be opened to x; otherwise the cheater is creating a collision.
  • But an attacker can hash all possible bid values and compare the results with the committed value.
  • A cheater can also simply copy another party's committed value.
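The first weakness is easy to demonstrate when bids come from a small range. In this sketch the commitment is assumed to be a plain SHA-256 of the decimal bid, and the bid range of 0 to 9999 is made up for the example.

```python
import hashlib


def commit(bid: int) -> str:
    """Naive commitment: hash of the bid alone, with no random blinding value (assumed scheme)."""
    return hashlib.sha256(str(bid).encode("ascii")).hexdigest()


def recover_bid(commitment: str, candidate_bids):
    """Hash every plausible bid and compare with the published commitment."""
    for candidate in candidate_bids:
        if commit(candidate) == commitment:
            return candidate
    return None


published = commit(4200)                      # what an honest bidder publishes
print(recover_bid(published, range(10_000)))  # what an eavesdropper learns
```

One-wayness does not help here because the set of plausible inputs is small enough to enumerate.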

MAC Construction

  • A Message Authentication Code (MAC) is a keyed hash function.
  • Used to verify the integrity and authenticity of information.
  • Alice sends $M \| H_K(M)$.
  • Bob receives it and checks the validity of the pair.
  • Prevents an adversary from tampering with the message (integrity) and from making Bob believe that a message came from Alice when it did not (authentication).

MAC Construction

  • A MAC's security is thus based on the following property: the attacker can query the function on inputs of his own choice, but is not able to evaluate the function on any other input with non-negligible probability.
  • It is easy to make such a function when we have an ideal hash function.
  • But problems arise when we replace the ideal function by an iterated hash, like SHA-1.


The Attack

  • Key length: 80 bits
  • Adversary intercepts a message M of length 256 bits with a valid 160-bit tag t:

t=SHA1(k||M)=C(k||M||10…0||0…0101010000)

  • To forge, construct M’, which consists of M, 112 bits of padding, the 64-bit encoding of 336, followed by arbitrary text T: M’=M||10…0||0…0101010000||T

The Forging of MAC

  • Note that:

SHA1(k||M’)=C(C(k||M||10…0||0…0101010000), T||padding + length)=C(t, T||padding + length)

  • In other words, we may compute a valid tag on M’ by applying the compression function to t, T and some padding, all of which are known!
  • Thus the naive MAC construction based on an iterated hash function is totally broken if the length of a previous message is known (violating the RO assumption)!
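The forgery can be reproduced end to end with a small pure-Python SHA-1 compression function, because the standard library does not expose the internal state. The sketch below is illustrative: the key, message and appended text are made up, and `hashlib.sha1` is used only to play the role of the honest MAC holder and to check the forged tag.

```python
import hashlib
import struct


def _rotl(value: int, count: int) -> int:
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def sha1_compress(state, block: bytes):
    """One application of the SHA-1 compression function C to a 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(_rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        a, b, c, d, e = (_rotl(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF, a, _rotl(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))


def sha1_padding(length_bytes: int) -> bytes:
    """The 10...0 padding followed by the 64-bit message length in bits."""
    return b"\x80" + b"\x00" * ((55 - length_bytes) % 64) + struct.pack(">Q", length_bytes * 8)


def extend(tag: bytes, key_plus_msg_len: int, suffix: bytes):
    """Given t = SHA1(k || M) and |k || M|, return (glue || T, SHA1(k || M || glue || T))."""
    glue = sha1_padding(key_plus_msg_len)
    state = struct.unpack(">5I", tag)             # resume hashing from the tag
    total = key_plus_msg_len + len(glue) + len(suffix)
    data = suffix + sha1_padding(total)
    for i in range(0, len(data), 64):
        state = sha1_compress(state, data[i:i + 64])
    return glue + suffix, struct.pack(">5I", *state)


key = b"0123456789"                                # 80-bit key, unknown to the attacker
message = b"A" * 32                                # 256-bit message M
tag = hashlib.sha1(key + message).digest()         # intercepted tag t

appended, forged_tag = extend(tag, len(key) + len(message), b";admin=true")
print(forged_tag == hashlib.sha1(key + message + appended).digest())
```

With an 80-bit key and a 256-bit message, the glue is 22 bytes: 112 bits of padding plus the 64-bit encoding of 336, as on the slide.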


Security of cascaded hash functions: Joux’s Attack

  • A generic collision-finding algorithm takes time of order $2^{n/2}$, where n is the length of the hash output.
  • Suppose we have two functions G, H: $\{0,1\}^* \to \{0,1\}^n$, each having ideal security $2^{n/2}$.
  • Can we construct a collision-resistant hash with ideal security $2^n$?
    – Define F(M)=G(M)||H(M).
  • This indeed works if both hashes are ideal; we claim that it fails if one of the hash functions is iterated.
  • Assume that one of the functions, say G, is based on the MD construction.
    – Let C be its compression function.
    – Let us find a collision on C; we can do that in time of order $2^{n/2}$. Let the messages be $M_1^{(0)}$ and $M_1^{(1)}$.
    – Continuing this k times, in time $O(k\,2^{n/2})$ we have:

$$C(h_i, M_{i+1}^{(0)}) = C(h_i, M_{i+1}^{(1)}) = h_{i+1}, \quad \text{where } 1 \le i \le k$$


  • We thus now have a treasure of collisions.
  • Any message that has the form

$$M_1^{(b_1)} \,\|\, M_2^{(b_2)} \,\|\, \cdots \,\|\, M_k^{(b_k)}, \quad \text{where } b_1, \ldots, b_k \in \{0,1\}$$

collides.

  • There are thus $2^k$ such messages, many times more than what one would have found in time $k\,2^{n/2}$, had G been ideal!
  • By the birthday paradox, even if H is ideal there is a high probability that there is a collision under H among these $2^k = 2^{n/2}$ messages (set k=n/2).
  • Thus we have a collision of the hash F in time $O(n\,2^{n/2})$.
    – This is less than $O(2^n)$, proving that we do not get the security of a hash with a 2n-bit output.
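The multicollision step can be tried on a toy iterated hash. In this sketch the compression function C is SHA-256 truncated to 16 bits, the IV is all zeros, and padding is omitted (all candidate messages have the same length, so identical padding would not change the outcome); these are assumptions chosen so that each block collision takes only a few hundred trials.

```python
import hashlib
from itertools import product

N_BYTES = 2                        # n = 16 bits (toy size, assumed)
IV = bytes(N_BYTES)


def C(h: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of chaining value || block."""
    return hashlib.sha256(h + block).digest()[:N_BYTES]


def block_collision(h: bytes):
    """Find two distinct blocks that C maps to the same value from chaining value h."""
    seen = {}
    counter = 0
    while True:
        block = counter.to_bytes(4, "big")
        out = C(h, block)
        if out in seen:
            return seen[out], block, out
        seen[out] = block
        counter += 1


def multicollision(k: int):
    """Chain k block collisions; returns the k pairs and the common final chaining value."""
    h, pairs = IV, []
    for _ in range(k):
        m0, m1, h = block_collision(h)
        pairs.append((m0, m1))
    return pairs, h


def G(blocks) -> bytes:
    """The iterated hash without padding, applied to a sequence of blocks."""
    h = IV
    for block in blocks:
        h = C(h, block)
    return h


pairs, final = multicollision(4)
messages = list(product(*pairs))   # all 2^k choices of b_1, ..., b_k
print(len(messages), all(G(m) == final for m in messages))
```

Four block collisions yield 16 messages with the same digest, for roughly four times the work of a single collision.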

Meaningful Collisions

  • Theoreticians said this does not work in practice,
  • as the colliding string is almost always meaningless, and hence detectable.
  • But we shall see that this attack is very much practical.

  • First a demo…

Meaningful Collisions

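  • Consider a message $M = M_1 \| M_2 \| \cdots \| M_k$.
  • Find blocks N≠N’ such that $C(h_{j-1}, N) = C(h_{j-1}, N’)$, where $h_{j-1}$ is the chaining value after $M_1 \| \cdots \| M_{j-1}$.
    – Thus two messages that differ only in the jth block will collide.
  • $M = M_1 \| M_2 \| \cdots \| M_{j-1} \| N \| M_{j+1} \| \cdots \| M_k$
  • $M’ = M_1 \| M_2 \| \cdots \| M_{j-1} \| N’ \| M_{j+1} \| \cdots \| M_k$
  • Thus, although N and N’ may be complete gibberish, they are now part of a longer text, which may be carefully constructed to accommodate them!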

Message Authentication Codes

  • Keyed hash functions
  • One common way would be to make the IV secret.
  • Consider, for simplicity, a hash which has neither the preprocessing step nor the final output transformation.
  • Given x and $h_k(x)$ (the MAC), we have to construct another valid pair. Can we do that?


MAC

  • Consider x||x’, where x and x’ are of t bits.
  • $h_k(x \| x’) = \mathrm{compress}(h_k(x) \| x’)$
    – which can always be computed, even though the key is secret!
    – the attack also extends to cases where padding is required and there is a preprocessing step (refer to Stinson)
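A minimal sketch of this forgery follows, assuming a secret-IV MAC with no padding and no output transformation, m = t = 128 bits, and a compression function faked by truncating SHA-256. The key and blocks are made up.

```python
import hashlib

M_BYTES = 16                       # m = 128 bits (assumed)
T_BYTES = 16                       # t = 128 bits (assumed)


def compress(block: bytes) -> bytes:
    """Toy stand-in for compress: {0,1}^(m+t) -> {0,1}^m."""
    assert len(block) == M_BYTES + T_BYTES
    return hashlib.sha256(block).digest()[:M_BYTES]


def secret_iv_mac(key: bytes, x: bytes) -> bytes:
    """h_k(x): iterate compress with the key as IV; no padding, no output transformation."""
    assert len(key) == M_BYTES and len(x) % T_BYTES == 0
    z = key
    for i in range(0, len(x), T_BYTES):
        z = compress(z + x[i:i + T_BYTES])
    return z


def forge(x: bytes, tag: bytes, extra_block: bytes):
    """Extend a known (x, h_k(x)) pair by one block without using the key."""
    assert len(extra_block) == T_BYTES
    return x + extra_block, compress(tag + extra_block)


key = bytes(range(16))             # secret; used only by the honest parties
x = b"A" * 16
tag = secret_iv_mac(key, x)        # observed by the attacker

forged_message, forged_tag = forge(x, tag, b"B" * 16)
print(secret_iv_mac(key, forged_message) == forged_tag)
```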

What is security of MAC?

  • The attacker is allowed to request q valid MACs, on $x_1, x_2, \ldots, x_q$.
  • Thus he obtains the list $((x_1,y_1), (x_2,y_2), \ldots, (x_q,y_q))$.
  • If he is able to output a valid pair (x,y), where x is not among the q values queried, then we say there is a forgery.
  • If the probability of this is ε, then we say he is an (ε,q) forger.


Nested MAC (NMAC)

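  • Suppose that $(X, Y, K, G)$ and $(Y, Z, L, H)$ are hash families.
  • We compose them to make $(X, Z, M, G \circ H)$, in which $M = K \times L$ and $G \circ H = \{ g \circ h : g \in G, h \in H \}$, where $(g \circ h)_{(K,L)}(x) = h_L(g_K(x))$ for all $x \in X$.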

A Result

  • The nested MAC is secure provided that the following two conditions hold:
    – H is a secure MAC, given a fixed unknown key.
    – G is collision-resistant, given a fixed unknown key.


Adversaries

  • Three kinds of adversaries:

    – forger for the nested MAC (big MAC attack)
    – forger for the little MAC (small MAC attack)
    – collision finder for the hash, when the key is secret (unknown-key collision attack)

Theorem

  • Result Proved in the class…

CBC-MAC

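  • $e_K$ is an endomorphic block cipher: plaintexts and ciphertexts are both bit strings of length t.
  • The message is $x = x_1 \| \cdots \| x_n$, where each $x_i$ is of block length t.
  • $y_0 = IV = 00\ldots0$ and $y_i = e_K(y_{i-1} \oplus x_i)$ for 1≤i≤n; the MAC is $y_n$.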

Attack on CBC-MAC

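  • Set q ≈ 1.17 × $2^{t/2}$, an integer.
  • Choose q distinct bit strings of length t: $x_1^1, \ldots, x_1^q$.
  • Choose q random bit strings of length t: $x_2^1, \ldots, x_2^q$.
  • Let $x_3, \ldots, x_n$ be fixed bit strings of length t.
  • Construct $x^i = x_1^i \| x_2^i \| x_3 \| \cdots \| x_n$ for 1≤i≤q. Note that $x^i \ne x^j$ for i≠j, since $x_1^i \ne x_1^j$.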


Attack on CBC-MAC

  • The attacker now queries the MAC values of the q strings $x^i$.
  • Due to the birthday paradox, a collision occurs with probability about ½.
  • Let $h_K(x^i) = h_K(x^j)$. This happens if and only if $y_2^i = y_2^j$, which happens if and only if $y_1^i \oplus x_2^i = y_1^j \oplus x_2^j$.

Attack on CBC-MAC

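  • Let $x_\delta$ be a non-zero bit string of length t.
  • Define:

$$v = x_1^i \,\|\, (x_2^i \oplus x_\delta) \,\|\, x_3 \,\|\, \cdots \,\|\, x_n \quad \text{and} \quad w = x_1^j \,\|\, (x_2^j \oplus x_\delta) \,\|\, x_3 \,\|\, \cdots \,\|\, x_n$$

  • The attacker now requests the MAC of v.
  • The MAC of v is also the MAC of w.
  • So he publishes w together with the MAC of v as a valid pair.
  • Thus, we have a (1/2, O($2^{t/2}$)) forger.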
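The whole attack fits in a short simulation when the block length is tiny. The sketch below assumes t = 16 bits, a zero IV, two fixed trailing blocks, and a keyed random permutation table standing in for the block cipher; none of these values come from the slides. Since a single batch of q queries collides only about half the time, the sketch simply repeats batches until a collision appears.

```python
import random

T = 16                             # toy block length in bits (assumed)


def make_cipher(key: int):
    """Toy endomorphic block cipher: a keyed random permutation of {0, ..., 2^T - 1}."""
    table = list(range(1 << T))
    random.Random(key).shuffle(table)
    return table.__getitem__


def cbc_mac(enc, blocks) -> int:
    """y_0 = 0, y_i = e_K(y_{i-1} XOR x_i); the MAC is the last y."""
    y = 0
    for x in blocks:
        y = enc(y ^ x)
    return y


def forge(mac_oracle, rng, fixed_tail=(0x1234, 0xBEEF), delta=1):
    """Return (w, tag) where w was never queried but tag is its valid CBC-MAC."""
    q = int(1.17 * 2 ** (T / 2)) + 1
    queried = set()
    while True:
        seen = {}                                  # tag -> (x1, x2) within this batch
        for x1 in range(q):                        # distinct first blocks
            x2 = rng.randrange(1 << T)             # random second blocks
            msg = (x1, x2) + fixed_tail
            queried.add(msg)
            tag = mac_oracle(msg)
            if tag in seen:
                x1_j, x2_j = seen[tag]
                v = (x1, x2 ^ delta) + fixed_tail
                w = (x1_j, x2_j ^ delta) + fixed_tail
                if w not in queried:
                    return w, mac_oracle(v)        # MAC of v is also the MAC of w
            seen[tag] = (x1, x2)


enc = make_cipher(key=2024)                        # secret key, hidden inside the oracle
w, forged_tag = forge(lambda blocks: cbc_mac(enc, blocks), random.Random(1))
print(cbc_mac(enc, w) == forged_tag)
```

The forgery is always valid once a collision is found, because the cipher is a permutation: equal final tags with identical trailing blocks force the values after the second block to be equal as well.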