Proofs of Retrievability via Fountain Code Sumanta Sarkar and - - PowerPoint PPT Presentation


Proofs of Retrievability via Fountain Code Sumanta Sarkar and Reihaneh Safavi-Naini Department of Computer Science, University of Calgary, Canada Foundations and Practice of Security October 25, 2012 Outsourcing Data into Cloud Storage


slide-1
SLIDE 1

Proofs of Retrievability via Fountain Code

Sumanta Sarkar and Reihaneh Safavi-Naini

Department of Computer Science, University of Calgary, Canada

Foundations and Practice of Security October 25, 2012

slide-2
SLIDE 2

Outsourcing Data into Cloud Storage

◮ Suppose a user generates lots of electronic data: videos, photos, emails, text documents.

◮ He also has many devices: desktop, laptop, tablet, smartphone. But none of them is capable of storing a huge amount of data.

◮ Cloud storage offers a solution:

  ◮ Outsource the data to the cloud.

  ◮ Access all data from all devices and from anywhere.

  ◮ The cloud keeps the whole data intact as long as the client wants.

slide-3
SLIDE 3

Risk of Outsourcing Data into Cloud Storage

◮ Completely rely on the cloud for the integrity of the data.

◮ No control over the infrastructure of the cloud.

◮ Device failure may erase some portions of the data.

◮ A dishonest cloud may erase some portions of the data to reduce its own storage cost.

slide-4
SLIDE 4

Checking the Integrity of the Data

◮ Store a MAC of the data locally.

◮ To check, download the whole file, compute its MAC, and compare it with the previously stored one.

◮ Not a practical solution when the data is big.
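The naive check above can be sketched as follows. This is a minimal illustration, assuming HMAC-SHA256 as the MAC (the slides do not name one); the function names are hypothetical. It shows why the approach does not scale: the verifier must hold the entire downloaded file to recompute the tag.

```python
import hashlib
import hmac
import os


def mac_of_file(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the whole file (assumed MAC)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def check_downloaded_file(key: bytes, downloaded: bytes, stored_tag: bytes) -> bool:
    """Recompute the tag on the downloaded copy and compare in constant time."""
    return hmac.compare_digest(mac_of_file(key, downloaded), stored_tag)


key = os.urandom(32)                        # kept by the client
original = b"example file contents" * 1000  # stands in for a large file
tag = mac_of_file(key, original)            # stored locally before upload

intact_ok = check_downloaded_file(key, original, tag)
tampered_ok = check_downloaded_file(key, original[:-1] + b"X", tag)
```

Every check costs a full download, which is the cost a PoR audit is designed to avoid.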

slide-8
SLIDE 8

Proofs of Retrievability (PoR)

◮ Juels and Kaliski (2007) introduced the Proofs of Retrievability (PoR) protocol, which verifies the integrity of the data through an audit protocol.

◮ The client applies an erasure code to the file M and stores the encoded file M′ in the cloud.

◮ M can be decoded from a fraction, say ρ, of M′.

◮ Along with M′, the client also stores some extra information ∆(M), which will be used in the audit.

◮ An audit is a challenge-response protocol. In the audit, the client (verifier) challenges some random locations of the file, and the cloud's (prover's) correct response proves that the file blocks at those locations are intact.

◮ The security of a PoR scheme is formalized by showing the existence of an extractor which retrieves the file with very high probability from an erasing adversary that can pass the audit protocol with some reasonable probability.

slide-10
SLIDE 10

Efficiency of PoR System

◮ The computational cost of preparing a file for storage in the cloud and of calculating the response.

◮ The communication cost required during an audit.

◮ The extra storage (overhead) needed for storing the file M.

◮ So a small challenge improves the communication cost of the protocol, and also the computation cost of the prover, as fewer blocks will be involved in the computation of the response.

slide-11
SLIDE 11

Bounded/Unbounded-use PoR and Private/Public Verifiability

◮ A PoR that allows an "unlimited" number of challenge-response interactions is unbounded-use; otherwise it is bounded-use.

◮ In a privately verifiable PoR, only the owner who stored the file can run the challenge-response protocol, whereas in a publicly verifiable PoR, anyone who knows the appropriate public key can perform the verification.

slide-12
SLIDE 12

Main Contribution

◮ We present an unbounded-use private PoR scheme that improves the cost of response computation and the cost of communication of challenges in the average case.

◮ Our construction closely follows that of Shacham and Waters (2008) and uses a Fountain code.

slide-13
SLIDE 13

Related Work on PoR

◮ PoR was introduced by Juels and Kaliski (2007) and has subsequently been extended and improved by Shacham and Waters (2008); Bowers, Juels and Oprea (2009); and Dodis, Vadhan and Wichs (2009).

◮ The JK07 scheme has quadratic communication complexity (in terms of the security parameter) for the response.

◮ This was improved to linear complexity in SW08 by using homomorphic linear authenticators.

◮ Dodis et al. viewed the set of all correct responses corresponding to the file M′ = Enc(M) stored in the cloud as a codeword C, which is a challenge-response encoding of M.

◮ The set of all responses from the prover for the same file M′ forms a word C′, which may differ from C. The extractor decodes M from C′.

slide-14
SLIDE 14

Background on PoR

We follow SW08.

◮ Kg(): This randomized algorithm generates a secret key sk and a public key pk.

◮ St(sk, M): This randomized algorithm takes the secret key sk and the client file M ∈ {0, 1}∗. It then processes M and outputs M∗, which is stored in the cloud.

◮ P, V: The randomized algorithms that correspond to the prover and the verifier. At the end of the prover-verifier interaction, the verifier outputs a bit: $\{0, 1\} \xleftarrow{R} (\mathcal{V}(pk, sk, t) \rightleftharpoons \mathcal{P}(pk, t, M^*))$.

slide-15
SLIDE 15

PoR properties: Correctness and Soundness

◮ Correctness means that if the prover is honest, then (V(pk, sk, t) ⇋ P(pk, t, M∗)) = 1.

◮ A PoR is sound if any prover that convinces the verifier actually holds the file.

slide-16
SLIDE 16

ε-adversary and the Extractor

◮ The adversary is assumed to erase some portion of the file with probability bounded by a fixed value.

◮ A prover is ε-admissible if it convincingly answers an ε fraction of challenges.

◮ A PoR scheme is ε-sound if there exists an extraction algorithm (Extractor) which, by interacting (challenge-response) with the ε-admissible adversary, can recover the file except with negligible probability.

slide-17
SLIDE 17

Fountain Codes

◮ In Fountain codes, the sender generates a potentially limitless string of encoded symbols. The receiver can recover the message from sufficiently many encoded symbols.

◮ Examples: the LT code [Luby 2002] and the Raptor code [Shokrollahi 2006] are two well-known Fountain codes.

slide-18
SLIDE 18

Raptor Code: Encoding

Precoding

◮ The message is $(x_1, \ldots, x_k)$, where each $x_i$ is an $\ell$-bit symbol.

◮ First, $(x_1, \ldots, x_k)$ is encoded to $(y_1, \ldots, y_n)$ by an erasure code $C_n$ which can recover $(x_1, \ldots, x_k)$ from any $\rho n$ symbols.

LT coding

To generate Raptor encoding symbols, an LT code is applied to $(y_1, \ldots, y_n)$. For that, a degree distribution is chosen, defined by a polynomial $w(x) = \sum_{i=1}^{n} w_i x^i$, where $w_i$ is the probability of choosing degree $i$, $i \in \{1, \ldots, n\}$.

◮ Randomly choose a degree, say $j$, using $w(x)$.

◮ Choose $j$ symbols uniformly at random from the set $\{y_1, \ldots, y_n\}$, and XOR them to produce the encoded symbol (output symbol) $r_i = y_{i_1} \oplus \cdots \oplus y_{i_j}$.
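The two LT coding steps can be sketched as below. This is an illustrative sketch only: symbols are modelled as Python integers holding ℓ-bit values, the chosen symbols are assumed to be distinct, and the function name `lt_output_symbol` and the toy weights are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def lt_output_symbol(
    y: Sequence[int], weights: Sequence[float], rng: random.Random
) -> Tuple[List[int], int]:
    """Produce one LT output symbol from the precoded symbols y.

    weights[d - 1] is the probability w_d of choosing degree d.
    Returns the chosen (0-based) positions and the XOR of those symbols.
    """
    n = len(y)
    degree = rng.choices(range(1, n + 1), weights=weights, k=1)[0]
    indices = rng.sample(range(n), degree)  # distinct positions, uniform
    symbol = 0
    for i in indices:
        symbol ^= y[i]                      # XOR of the chosen symbols
    return indices, symbol


# Toy example: n = 4 precoded 4-bit symbols and an arbitrary distribution.
rng = random.Random(0)
y = [0b1010, 0b0110, 0b1111, 0b0001]
indices, symbol = lt_output_symbol(y, [0.4, 0.3, 0.2, 0.1], rng)
```

Calling the function repeatedly yields the potentially limitless stream of output symbols that characterizes a Fountain code.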

slide-19
SLIDE 19

Raptor Code structure

slide-20
SLIDE 20

Raptor Code: Decoding

◮ After collecting slightly more than $k$ output symbols $r_i$, apply BP decoding to obtain a $\rho$ fraction of $\{y_1, \ldots, y_n\}$; then, by applying the decoder of $C_n$, the receiver can recover $(x_1, \ldots, x_k)$.

slide-21
SLIDE 21

Raptor Code parameters

◮ The following are from the Raptor code construction given in [Shokrollahi 2006].

◮ Let $\alpha > 0$ be a real number, set $D = \lceil 4(1 + \alpha)/\alpha \rceil$ and define

$$w_D(x) = \frac{1}{\mu + 1}\left(\mu x + \sum_{i=2}^{D} \frac{x^i}{(i-1)\,i} + \frac{x^{D+1}}{D}\right), \qquad (1)$$

where $\mu = (\alpha/2) + (\alpha/2)^2$.

◮ The average of $w_D$ is

$$\ln(1/\alpha) + \beta + O(\alpha), \qquad (2)$$

where $1 < \beta < 1 + \gamma + \ln(9)$ and $\gamma$ is Euler's constant.
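As a sanity check on equation (1), the coefficients can be computed exactly with rational arithmetic. The sketch below (function names are hypothetical) builds the distribution for a given α, which allows checking that the coefficients sum to 1 and computing the mean degree numerically.

```python
from fractions import Fraction
from math import ceil
from typing import Dict


def raptor_degree_distribution(alpha: Fraction) -> Dict[int, Fraction]:
    """Return {degree: probability} for w_D(x) as defined in equation (1)."""
    D = ceil(4 * (1 + alpha) / alpha)
    mu = alpha / 2 + (alpha / 2) ** 2
    norm = mu + 1
    dist = {1: mu / norm}                      # coefficient of x
    for i in range(2, D + 1):
        dist[i] = Fraction(1, (i - 1) * i) / norm
    dist[D + 1] = Fraction(1, D) / norm        # coefficient of x^(D+1)
    return dist


def mean_degree(dist: Dict[int, Fraction]) -> Fraction:
    """Average degree, i.e. the derivative of w_D at x = 1."""
    return sum(d * p for d, p in dist.items())


dist = raptor_degree_distribution(Fraction(1, 8))
total = sum(dist.values())       # exact sum of all coefficients
average = float(mean_degree(dist))
```

The sum is exactly 1 because the middle terms telescope to $1 - 1/D$, which together with $\mu$ and $1/D$ gives $\mu + 1$ before normalization.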

slide-22
SLIDE 22

Results on decoding Raptor Code

Lemma (Shokrollahi 2006)

There exists a positive real number $c$ (depending on $\alpha$) such that, with an error probability of at most $e^{-cn}$, any set of $(1 + \alpha/2)n + 1$ output symbols of the LT code with distribution $w_D$ and $n$ input symbols $y_1, \ldots, y_n$ is sufficient to recover at least $\rho n$ input symbols from $\{y_1, \ldots, y_n\}$ via belief propagation decoding, where $\rho = 1 - \frac{\alpha/4}{1+\alpha}$.

Theorem (Shokrollahi 2006)

Let $\alpha > 0$ be a real number, $k$ an integer, $D = \lceil 4(1 + \alpha)/\alpha \rceil$, $R = (1 + \alpha/2)/(1 + \alpha)$, and $n = \lceil k/R \rceil$. Let $C_n$ be an erasure code which can decode from a $(1 - R)/2$ fraction of erasures. Then the Raptor code with precode $C_n$ and the LT code with distribution $w_D(x)$, which encodes $k$ symbols, can decode from $(1 + \alpha)k$ output symbols.

slide-23
SLIDE 23

PoR of SW08

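◮ Suppose $F' = (m_1, \ldots, m_n)$ is the erasure-encoded file of the client file $F$, with each $m_i \in \mathbb{Z}_p$.

◮ Choose $\theta \in \mathbb{Z}_p$ randomly and create authenticators $\sigma_i = \mathrm{PRF}(i) + \theta m_i$.

◮ Challenge: $Q = \{(i_1, v_1), \ldots, (i_w, v_w)\}$, where each $i_j$ is chosen randomly from $\{1, \ldots, n\}$ and each $v_j$ is chosen randomly from $\mathbb{Z}_p$.

◮ Response: $r = \sum_{(i, v_i) \in Q} v_i m_i$ and $\sigma = \sum_{(i, v_i) \in Q} v_i \sigma_i$.

◮ Verify: $\sigma \stackrel{?}{=} \sum_{(i, v_i) \in Q} v_i\,\mathrm{PRF}(i) + \theta r$.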
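The SW08 steps above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the PRF is instantiated with HMAC-SHA256 reduced modulo a small Mersenne prime, challenge indices are taken to be distinct, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import List, Sequence, Tuple

P = 2**61 - 1  # toy prime modulus (assumption; a real scheme uses a larger p)


def prf(key: bytes, i: int) -> int:
    """PRF into Z_p, instantiated here with HMAC-SHA256 (an assumption)."""
    digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P


def authenticate(key: bytes, theta: int, blocks: Sequence[int]) -> List[int]:
    """sigma_i = PRF(i) + theta * m_i over Z_p, with 1-based block indices."""
    return [(prf(key, i) + theta * m) % P for i, m in enumerate(blocks, start=1)]


def challenge(n: int, w: int) -> List[Tuple[int, int]]:
    """Q = {(i_j, v_j)}: w distinct random indices with random coefficients."""
    rng = secrets.SystemRandom()
    return [(i, rng.randrange(P)) for i in rng.sample(range(1, n + 1), w)]


def respond(blocks: Sequence[int], sigmas: Sequence[int], Q) -> Tuple[int, int]:
    """Prover: r = sum v_i m_i and sigma = sum v_i sigma_i over Z_p."""
    r = sum(v * blocks[i - 1] for i, v in Q) % P
    sigma = sum(v * sigmas[i - 1] for i, v in Q) % P
    return r, sigma


def verify(key: bytes, theta: int, Q, r: int, sigma: int) -> bool:
    """Verifier: check sigma == sum v_i PRF(i) + theta * r over Z_p."""
    return sigma == (sum(v * prf(key, i) for i, v in Q) + theta * r) % P
```

The check passes for an honest prover because the authenticators are linear in the blocks: summing $v_i \sigma_i$ gives $\sum v_i\,\mathrm{PRF}(i) + \theta \sum v_i m_i$.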
slide-24
SLIDE 24

RAPTOR-PoR: Choosing Key

Kg(): A random symmetric encryption key $k_{enc} \xleftarrow{R} \mathcal{K}_{enc}$ and a random MAC key $k_{mac} \xleftarrow{R} \mathcal{K}_{mac}$ are chosen. The secret key is $sk = (k_{enc}, k_{mac})$. Since this is private verification, there is no public key $pk$.

slide-28
SLIDE 28

RAPTOR-PoR: Preparing File for Storing

◮ First, $M = (x_1, \ldots, x_k)$, where each $x_i$ is an $\ell$-bit symbol, is encoded by an erasure code $C_n$ to obtain $M' = (y_1, \ldots, y_n)$, where $C_n$ is such that any $\rho n$ symbols from $(y_1, \ldots, y_n)$ are enough to reconstruct $M$.

◮ Choose a PRF key $k_{prf} \xleftarrow{R} \mathcal{K}_{prf}$ and a random binary $\ell \times \ell$ matrix $A = [A_1, \ldots, A_\ell]^T$, where each $A_i$ is an $\ell$-bit row vector.

◮ Let $t_0 = n \,\|\, \mathrm{Enc}_{k_{enc}}(k_{prf} \,\|\, A_1 \,\|\, \cdots \,\|\, A_\ell)$, and let $t = t_0 \,\|\, \mathrm{MAC}_{k_{mac}}(t_0)$ be the file tag.

◮ For each $i$ with $1 \le i \le n$, create the authenticator $\sigma_i = \mathrm{PRF}_{k_{prf}}(i) \oplus y_i A$. Each $\sigma_i$ is also an $\ell$-bit symbol.

◮ Then $M^* = (y_1, \ldots, y_n, \sigma_1, \ldots, \sigma_n)$ is the processed file. Send $M^*$ and $t$ to the cloud.
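The authenticator step can be sketched as follows. This is a toy illustration under several assumptions: ℓ is fixed to 16 bits, symbols and matrix rows are stored as Python integers, the PRF is HMAC-SHA256 truncated to ℓ bits, and the erasure precode and the tag $t$ (encryption and MAC) are omitted. All names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import List, Sequence

ELL = 16  # toy symbol length in bits (assumption)


def prf(key: bytes, i: int) -> int:
    """ELL-bit PRF output, instantiated with truncated HMAC-SHA256 (assumption)."""
    digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << ELL) - 1)


def random_matrix() -> List[int]:
    """Random binary ELL x ELL matrix; entry j is row A_(j+1) packed into an int."""
    return [secrets.randbits(ELL) for _ in range(ELL)]


def vec_mat(y: int, A: Sequence[int]) -> int:
    """Row vector y times A over GF(2): XOR the rows selected by the bits of y."""
    out = 0
    for j, row in enumerate(A):
        if (y >> j) & 1:
            out ^= row
    return out


def make_authenticators(key: bytes, A: Sequence[int], ys: Sequence[int]) -> List[int]:
    """sigma_i = PRF(i) XOR y_i A for i = 1..n."""
    return [prf(key, i) ^ vec_mat(y, A) for i, y in enumerate(ys, start=1)]
```

Because `vec_mat` is linear over GF(2), the XOR of several authenticators equals the XOR of the PRF values combined with the product of the XORed symbols and $A$, which is the property the audit relies on.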

slide-30
SLIDE 30

RAPTOR-PoR: Audit (1)

V.Tagcheck(sk, t):

◮ Obtain $k_{mac}$ and $k_{enc}$ from the secret key $sk$.

◮ Recall that $t_0 = n \,\|\, \mathrm{Enc}_{k_{enc}}(k_{prf} \,\|\, A_1 \,\|\, \cdots \,\|\, A_\ell)$ and $t = t_0 \,\|\, \mathrm{MAC}_{k_{mac}}(t_0)$. Receive the tag $t$ from the prover and verify it using $k_{mac}$; if the MAC does not match, quit the audit. Otherwise, using the symmetric key $k_{enc}$, decrypt $\mathrm{Enc}_{k_{enc}}(k_{prf} \,\|\, A_1 \,\|\, \cdots \,\|\, A_\ell)$ and recover $n$, $k_{prf}$ and the matrix $A$.

V.Chal(n):

◮ Choose an integer $w$ using the degree distribution with generator polynomial $w_D(x) = \sum_{i=1}^{n} w_i x^i$. Then choose $w$ indices, say $\{i_1, \ldots, i_w\}$, uniformly from $\{1, \ldots, n\}$, and choose one index, say $c$, uniformly at random from $\{i_1, \ldots, i_w\}$. Send $Q = (\{i_1, \ldots, i_w\}, \{c\})$ to the prover.
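A challenge of this shape can be generated as in the sketch below. The function name `make_challenge` and the toy distribution are hypothetical, indices are assumed to be distinct, and a real verifier would pass a cryptographically secure generator such as `random.SystemRandom()`.

```python
import random
from typing import Dict, List, Tuple


def make_challenge(
    n: int, dist: Dict[int, float], rng: random.Random
) -> Tuple[List[int], int]:
    """Return Q = (indices, c) for a file of n symbols.

    dist maps each degree to its probability (the coefficients of w_D);
    every degree in dist is assumed to be at most n.
    """
    degrees = list(dist)
    w = rng.choices(degrees, weights=[dist[d] for d in degrees], k=1)[0]
    indices = rng.sample(range(1, n + 1), w)  # w distinct 1-based indices
    c = rng.choice(indices)                   # the index checked individually
    return indices, c


# Toy usage with a small made-up distribution and a seeded generator.
indices, c = make_challenge(10, {1: 0.1, 2: 0.5, 3: 0.3, 4: 0.1}, random.Random(1))
```

The challenge length varies from run to run, which is what makes the average communication cost depend on the mean of the degree distribution rather than on its maximum.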

slide-32
SLIDE 32

RAPTOR-PoR: Audit (2)

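◮ P(Q, M∗): In response to the challenge $Q$, compute

$$r = y_{i_1} \oplus \cdots \oplus y_{i_w}, \qquad \sigma = \sigma_{i_1} \oplus \cdots \oplus \sigma_{i_w}. \qquad (3)$$

Send $\mathrm{resp} = (r, \sigma, y_c, \sigma_c)$ to the verifier.

◮ V.Ver(A, k_prf, resp): After receiving the prover's response, check whether

$$\sigma \stackrel{?}{=} rA \oplus \bigoplus_{i \in \{i_1, \ldots, i_w\}} \mathrm{PRF}_{k_{prf}}(i), \qquad \sigma_c \stackrel{?}{=} \mathrm{PRF}_{k_{prf}}(c) \oplus y_c A.$$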
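The response and the two verification checks can be sketched as below. As before, this is a toy model under stated assumptions: ℓ = 16, symbols and matrix rows are Python integers, the PRF is truncated HMAC-SHA256, and the function names are hypothetical.

```python
import hashlib
import hmac
from functools import reduce
from operator import xor
from typing import Sequence, Tuple

ELL = 16  # toy symbol length in bits (assumption)


def prf(key: bytes, i: int) -> int:
    """ELL-bit PRF output, instantiated with truncated HMAC-SHA256 (assumption)."""
    digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << ELL) - 1)


def vec_mat(y: int, A: Sequence[int]) -> int:
    """Row vector y times binary matrix A over GF(2); A[j] is row j as an int."""
    return reduce(xor, (row for j, row in enumerate(A) if (y >> j) & 1), 0)


def respond(ys, sigmas, indices, c) -> Tuple[int, int, int, int]:
    """Prover: XOR the challenged symbols and authenticators (equation (3))."""
    r = reduce(xor, (ys[i - 1] for i in indices), 0)
    sigma = reduce(xor, (sigmas[i - 1] for i in indices), 0)
    return r, sigma, ys[c - 1], sigmas[c - 1]


def verify(key: bytes, A: Sequence[int], indices, c, resp) -> bool:
    """Verifier: check the aggregate relation and the single-index relation."""
    r, sigma, y_c, sigma_c = resp
    prf_sum = reduce(xor, (prf(key, i) for i in indices), 0)
    aggregate_ok = sigma == vec_mat(r, A) ^ prf_sum
    single_ok = sigma_c == prf(key, c) ^ vec_mat(y_c, A)
    return aggregate_ok and single_ok
```

An honest response passes the first check because $\bigoplus_j \sigma_{i_j} = \bigoplus_j \mathrm{PRF}_{k_{prf}}(i_j) \oplus \left(\bigoplus_j y_{i_j}\right) A = \bigoplus_j \mathrm{PRF}_{k_{prf}}(i_j) \oplus rA$, by linearity of multiplication by $A$ over GF(2).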

slide-33
SLIDE 33

Parameters for RAPTOR-PoR

◮ Refer to the Raptor code parameters: we take $\alpha = 1/\ell$, where $\ell$ is the security parameter.

◮ The rate of the precode $C_n$ is $R = \frac{2\ell+1}{2\ell+2}$. Then $n = \mathrm{poly}(\ell)$ if $k = \mathrm{poly}(\ell)$.

◮ The erasure probability that $C_n$ can handle is $1 - \rho = \frac{1}{4(\ell+1)}$.

◮ $D = 4(\ell + 1)$, $\mu = \frac{1}{2\ell} + \frac{1}{4\ell^2}$, and the degree distribution is

$$w_D(x) = \frac{2\ell+1}{4\ell^2+2\ell+1}\,x + \frac{4\ell^2}{4\ell^2+2\ell+1}\left(\frac{x^2}{1\cdot 2} + \frac{x^3}{2\cdot 3} + \cdots + \frac{x^{4(\ell+1)}}{(4\ell+3)(4\ell+4)} + \frac{x^{4\ell+5}}{4\ell+4}\right).$$

The mean of this distribution is $\ln(\ell) + \beta + O(1/\ell) = O(\log \ell)$, where $1 < \beta < 1 + \gamma + \ln(9)$ and $\gamma$ is Euler's constant.
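These closed forms follow from substituting $\alpha = 1/\ell$ into the general Raptor parameters. The sketch below (the function name and dictionary keys are hypothetical) performs the substitution with exact rational arithmetic and compares the results with the closed forms for one toy value of $\ell$.

```python
from fractions import Fraction
from math import ceil


def raptor_por_parameters(ell: int) -> dict:
    """General Raptor parameters evaluated at alpha = 1/ell."""
    alpha = Fraction(1, ell)
    D = ceil(4 * (1 + alpha) / alpha)
    mu = alpha / 2 + (alpha / 2) ** 2
    return {
        "R": (1 + alpha / 2) / (1 + alpha),             # precode rate
        "erasure_fraction": (alpha / 4) / (1 + alpha),  # 1 - rho
        "D": D,
        "mu": mu,
        "coeff_x": mu / (mu + 1),      # coefficient of x in w_D
        "coeff_rest": 1 / (mu + 1),    # common factor of the other terms
        "max_degree": D + 1,           # largest possible challenge size
    }


ell = 8  # toy security parameter
p = raptor_por_parameters(ell)
closed_forms_match = (
    p["R"] == Fraction(2 * ell + 1, 2 * ell + 2)
    and p["erasure_fraction"] == Fraction(1, 4 * (ell + 1))
    and p["D"] == 4 * (ell + 1)
    and p["coeff_x"] == Fraction(2 * ell + 1, 4 * ell**2 + 2 * ell + 1)
    and p["coeff_rest"] == Fraction(4 * ell**2, 4 * ell**2 + 2 * ell + 1)
)
```

The value `max_degree` equals $4\ell + 5$, the upper end of the range from which the challenge size is drawn.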

slide-34
SLIDE 34

RAPTOR-PoR: Result on Extractor

Theorem

If the prover is ε-admissible, then by running the Audit protocol for $\frac{(1 + 1/\ell)k}{\epsilon}$ iterations, the extractor will be able to retrieve the file with error probability $e^{-\mathrm{poly}(\ell)}$.

slide-35
SLIDE 35

RAPTOR-PoR: Comparison with the other PoR

◮ All previous schemes challenge a fixed number of blocks, of order $O(\ell)$.

◮ In our scheme, the size of the challenge set is chosen from the interval $[1, 4\ell + 5]$ according to the probability distribution $w_D(x)$.

◮ So in the worst case it is $O(\ell)$. However, in the average case it is $\ln(\ell) + \beta + O(1/\ell)$, where $1 < \beta < 1 + \gamma + \ln(9)$, i.e., $O(\log \ell)$.

◮ This also means that, in the average case, the cloud has to consider only $O(\log \ell)$ blocks while computing a response.

◮ In RAPTOR-PoR, the response is formed just by XORing $w$ $\ell$-bit elements, whereas forming a response to a challenge on $w$ elements in the SW08 scheme requires $w$ multiplications and $w - 1$ additions over $\mathbb{Z}_p$.

slide-36
SLIDE 36

Conclusion

◮ We have proposed a PoR construction based on the SW08 PoR and improved the response computation.

◮ Notably, we use challenges of variable length, where the length is chosen probabilistically.

◮ The next task is an efficient implementation of our scheme, which requires additional measures. For instance, applying erasure encoding to a big file is not practical, so the file should be divided into stripes, with erasure encoding applied to each stripe. This requires a completely new analysis of the scheme.

slide-37
SLIDE 37

THANK YOU