Hybrid scheme Kerberos protocol Public-key: nice solution for key - - PowerPoint PPT Presentation


Hybrid scheme

Public-key: a nice solution for key distribution, but computationally expensive.

Secret-key: efficient, but with one requirement: the two parties must already share a secret key.

In applications (particularly those with huge amounts of data), a hybrid scheme is used:

  • Ease of key distribution
  • Efficiency

Question: what are the advantages of a hybrid scheme?


Kerberos protocol

Question in Homework 4

Two Key Distribution Centers (KDC): the Authentication Server (AS) and the Ticket-Granting Server (TGS).

Two types of tickets: ticket-granting ticket (TGT), service-granting ticket (SGT).

AS Exchange

  • 1. Client C requests a TGT (on behalf of the user U) by sending the user’s ID and the TGS ID to the AS.
  • 2. The AS replies with an encrypted TGT, which the client C uses later in a TGS Exchange.
  • 2.1 When the message arrives, C asks U for the password, generates the key from it, and decrypts the incoming message.

The TGT has two parts: one part is for the client; the other part is for the TGS.

Each part contains the session key to be shared between C and the TGS, plus a timestamp and a lifetime.


Kerberos protocol

  • AS Exchange
  • TGS Exchange
  • 3. C requests an SGT (on behalf of the user U) by sending the user’s ID, the ID of server S, and the TGT to the TGS.
  • 4. The TGS decrypts the TGT and verifies it (ID, lifetime), then issues an encrypted SGT to C.
  • The SGT has the same structure as the TGT.
  • Each part contains another session key, to be shared between C and S, plus a timestamp and a lifetime.
  • AP Exchange
  • 5. C requests access to a service (on behalf of the user U) by sending the user’s ID and the SGT to S.
  • Why two Key Distribution Centers, AS and TGS?
  • The user doesn’t need to re-enter the password for different services (the password is bound to a TGT).
  • Application servers belong to different network domains, organized by a different TGS in each domain. Similarly, a given user may use one fixed AS.

In this protocol, the user can be served by many TGSs and, as a result, by a large number of application servers.
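
The three exchanges can be traced with a toy model. This is only a sketch under strong assumptions: `seal`/`unseal` stand in for real encryption (a box simply remembers which key may open it), the key names and lifetimes are made up, and the authenticators that real Kerberos adds are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sealed:
    """Toy stand-in for ciphertext: readable only with the matching key."""
    key: str
    payload: dict

def seal(key, payload):
    return Sealed(key, dict(payload))

def unseal(key, box):
    if box.key != key:
        raise PermissionError("wrong key")
    return dict(box.payload)

# Hypothetical long-term keys; K_U is the key generated from U's password.
K_U, K_TGS, K_S = "key-from-password", "tgs-secret", "server-secret"

def as_exchange(user_id, tgs_id, now):
    """Steps 1-2: the AS returns a TGT with a client part and a TGS part."""
    body = {"user": user_id, "session_key": "K_c_tgs",
            "timestamp": now, "lifetime": 600}
    return seal(K_U, {**body, "tgs": tgs_id}), seal(K_TGS, body)

def tgs_exchange(user_id, server_id, tgt_for_tgs, now):
    """Steps 3-4: the TGS verifies the TGT (ID, lifetime) and issues an SGT."""
    tgt = unseal(K_TGS, tgt_for_tgs)
    if tgt["user"] != user_id or now > tgt["timestamp"] + tgt["lifetime"]:
        raise PermissionError("invalid or expired TGT")
    body = {"user": user_id, "session_key": "K_c_s",
            "timestamp": now, "lifetime": 300}
    # The client part is sealed under the C-TGS session key from the TGT.
    return seal(tgt["session_key"], {**body, "server": server_id}), seal(K_S, body)

def ap_exchange(user_id, sgt_for_server, now):
    """Step 5: server S checks the SGT before granting access."""
    sgt = unseal(K_S, sgt_for_server)
    return sgt["user"] == user_id and now <= sgt["timestamp"] + sgt["lifetime"]
```

Note that the password-derived key is used only once, to open the client part of the TGT; every later step relies on session keys, which is why the user does not re-enter the password.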


Transport Layer Security (TLS)

Two layers

  • TLS Record Protocol
  • TLS Handshake Protocol


TLS Record Protocol

Runs on top of a connection-oriented protocol: TCP.

Provides two services for TLS connections: confidentiality and integrity.

Keys for symmetric encryption and keys used to form the MAC are generated by the TLS Handshake Protocol.

Input: a message to be transmitted.

Its operations:

  • fragment the data into blocks;
  • compress the data (optionally);
  • apply a MAC for data integrity;
  • encrypt for confidentiality;
  • add the record header;
  • transmit the result to the receiving process.

At the receiving side, it receives the encrypted blocks, decrypts them, verifies the MAC, optionally decompresses them, reassembles the blocks, and delivers the result to higher-level application processes.


The TLS Handshake Protocol

Its operations:

  • allows the server and client to authenticate each other;
  • negotiates encryption and MAC algorithms;
  • agrees on keys for the TLS Record Protocol.

May be invoked to change the specification of a secure channel.

The Handshake Protocol is used before transmitting application data.

During the Handshake Protocol, pending states are created. After one Handshake Protocol run executes successfully, the pending states become the current states.

Message flow between client and server:

  • Client → Server: ClientHello. Server → Client: ServerHello. Establish protocol version, session ID, cipher suite, compression method; exchange random values.
  • Server → Client: Certificate, Certificate Request, ServerHelloDone. Optionally send the server certificate and request a client certificate.
  • Client → Server: Certificate, Certificate Verify. Send the client certificate response if requested.
  • Client → Server: Change Cipher Spec, Finished. Server → Client: Change Cipher Spec, Finished. Change cipher suite and finish the handshake.


Domain Name System (DNS)

When the Internet was small:

  • a host file with two columns (name and address);
  • every host stores one copy and updates it periodically from a master host file.

This is impossible for today’s Internet.

One simple solution: a single server.

  • Disadvantages: inefficient; unreliable.

Another solution: distribution & replication.

  • client/server group model

Two ways to organize the name space

Flat: a name is a sequence of characters without structure.

  • cannot be used in a large system such as the Internet.

Hierarchical: each name is composed of several parts.

  • called the domain name space
  • each organization can choose the prefix name for its hosts independently.


Domain Name System (DNS)

Each node in the tree has a label and a domain name.

  • The root label is an empty string.
  • Children of a node have different labels.
  • A domain name is the sequence of labels from the current node up to the root, separated by dots.

Fully Qualified Domain Name (FQDN): a complete domain name, ending at the root.

Partially Qualified Domain Name (PQDN): a domain name that ends at some node other than the root.


DNS in the Internet

  • generic domains
  • country domains
  • inverse domain: maps an address to a name

Example: a server has a list of authorized clients, but only the IP address from the packet.

  • the server may ask its resolver to send a query to the DNS server and ask for a mapping of address to name.
  • inverse query (or pointer query)
  • “inverse-IP.in-addr.arpa”
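
As a small illustration of the “inverse-IP.in-addr.arpa” form, Python’s standard `ipaddress` module can build the name used in a pointer query. The helper name and the sample address are only for illustration.

```python
import ipaddress

def inverse_domain_name(ip: str) -> str:
    """Return the inverse-domain name under which a pointer query looks up `ip`."""
    # The address bytes are reversed, so the most specific label comes first,
    # just as in an ordinary domain name.
    return ipaddress.ip_address(ip).reverse_pointer

print(inverse_domain_name("132.34.45.121"))  # 121.45.34.132.in-addr.arpa
```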


Domain Name System (DNS)

Two approaches to resolution

Recursive resolution: the resolver expects the server to supply the final answer.

Iterative resolution:

  • if the server cannot resolve the query, it returns to the client the IP address of the server that it thinks can resolve the query.
  • The client is responsible for repeating the query to this second server.

Caching technique in DNS

  • In recursive resolution, a server stores the mapping in its cache before sending it to the client.
  • One problem: if a mapping is cached for a long time, the client may receive an out-of-date mapping.
  • Two simple techniques, both based on “time-to-live” (TTL):

The original server binds a mapping with a TTL value.

  • It defines the time in seconds that the other servers can cache the mapping information.

The receiving server keeps a TTL for each mapping in its cache.
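
The TTL idea can be sketched as a small cache that discards a mapping once its time-to-live has passed. The class name, the injectable clock, and the sample name and address are assumptions made for the example, not part of any DNS implementation.

```python
import time

class DnsCache:
    """Cache of name -> address mappings, each kept only for its TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the cache can be tested
        self._entries = {}           # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        # The TTL comes from the server that supplied the mapping.
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # out of date: force a fresh query
            return None
        return address
```

A miss (`None`) tells the caller to resolve the name again, which is how an out-of-date mapping is avoided.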


Time

Synchronizing physical clocks

External synchronization: the difference between each computer clock and an external time source is bounded by some constant.

  • Time server: Cristian’s method, the Network Time Protocol

Internal synchronization: the difference between any two computer clocks is bounded by some constant.

  • Master/slaves: the Berkeley algorithm


Cristian’s method: time server

  • 1. The client process sends a time request to the time server.
  • 2. After receiving the request, the server replies with the time according to its clock.

Analysis

  • There is no upper bound on message transmission delays.
  • Its success depends on the round-trip times for message exchanges being short compared with the required accuracy.

Improvement: a group of synchronized time servers

  • The client multicasts its request to all the time servers in the LAN and uses the first reply.
  • Better performance:

– tolerates server failure and reply-message omission failure;
– the first reply has the smallest delay, so its time is closer to the perfect time.
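
A minimal sketch of the client-side estimate, assuming the usual T + (round-trip time/2) rule. The function and parameter names are illustrative, and times are plain numbers in milliseconds.

```python
def cristian_estimate(t_sent, t_received, server_time):
    """Estimate the current time from one request/reply exchange.

    t_sent and t_received are read from the client's own clock;
    server_time is the value T carried in the server's reply.
    """
    round_trip = t_received - t_sent
    # Assume the reply took about half of the round trip to arrive.
    return server_time + round_trip / 2

print(cristian_estimate(t_sent=1000, t_received=1020, server_time=5000))  # 5010.0
```

The shorter the round trip, the smaller the possible error, which is why the method depends on short round-trip times.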


The Berkeley algorithm

One computer is chosen to be the master.

The master periodically polls the other computers, called slaves, whose clocks are to be synchronized.

The slaves send back their clock values to the master.

The master estimates their clock times, using T + (round-trip time/2) for each reading, and computes the average of all the clock times.

The master sends each individual slave the amount by which to adjust its clock.

  • The reason for not sending the updated current time: to avoid the further uncertainty introduced by message transmission time.

One possible problem: readings from faulty clocks.

One simple fix: select a subset of clocks whose mutual differences are bounded by some specified value.
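
The averaging step can be sketched as follows. This is a simplification: the "subset of clocks whose mutual difference is bounded" is approximated here by keeping only readings within `max_diff` of the master’s clock, and all names and values are hypothetical.

```python
def berkeley_adjustments(master_time, slave_estimates, max_diff):
    """Return the adjustment each machine should apply to its clock.

    slave_estimates maps a slave's name to the master's estimate of that
    slave's clock (already corrected by half the round-trip time).
    """
    readings = {"master": master_time, **slave_estimates}
    # Ignore readings that look faulty (assumption: too far from the master).
    trusted = [t for t in readings.values() if abs(t - master_time) <= max_diff]
    average = sum(trusted) / len(trusted)
    # Send each machine an adjustment amount, not the new time itself.
    return {name: average - t for name, t in readings.items()}

adjust = berkeley_adjustments(180, {"a": 185, "b": 175, "faulty": 1000}, max_diff=20)
print(adjust)  # {'master': 0.0, 'a': -5.0, 'b': 5.0, 'faulty': -820.0}
```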


The Network Time Protocol (NTP)

Understand its basic ideas, especially the ideas on the accuracy of NTP.

Logical time and logical clocks

Knowing the ordering of events is important; physical time is not enough.

Two simple points [Lamport 1978]:

  • HB1: the order of two events in the same process;
  • HB2: the event of sending a message always happens before the event of receiving that message.

Happened-before relation (→): a partial order defined by HB1, HB2, and HB3, where HB3 means that the happened-before relation is transitive.

[Figure: events a, b at p1; c, d at p2; e, f at p3, plotted against physical time, with messages m1 and m2.]

  • a → b (at p1); c → d (at p2); b → c (m1); also d → f (m2).
  • Not all events are related by →. For example, neither a → e nor e → a holds; such events are said to be concurrent, written a || e.


Lamport’s logical clocks

A logical clock is a monotonically increasing software counter. It need not relate to a physical clock.

Each process pi has a logical clock Li.

  • LC1: Li is incremented by 1 before each event at process pi.
  • LC2: (a) when process pi sends a message m, it piggybacks t = Li on m; (b) when pj receives (m, t), it sets Lj := max(Lj, t) and applies LC1 before timestamping the event receive(m).

e → e’ ⇒ L(e) < L(e’), but not vice versa.

  • Example: events b and e. L(e) = 1 < L(b) = 2, yet b and e are concurrent. This is the shortcoming of Lamport’s clock.

[Figure: the same events with Lamport timestamps: a = 1, b = 2 at p1; c = 3, d = 4 at p2; e = 1, f = 5 at p3.]
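
Rules LC1 and LC2 can be sketched in a few lines. The class and method names are illustrative; the replay below follows the events a to f, with m1 sent at b and m2 sent at d.

```python
class LamportClock:
    """Logical clock L_i of one process."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # LC1: increment before each event.
        self.time += 1
        return self.time

    def send(self):
        # LC2(a): the send event's timestamp is piggybacked on the message.
        return self.tick()

    def receive(self, t):
        # LC2(b): take the maximum, then apply LC1 for the receive event.
        self.time = max(self.time, t)
        return self.tick()

p1, p2, p3 = LamportClock(), LamportClock(), LamportClock()
a = p1.tick()        # 1
b = p1.send()        # 2, carried by m1
c = p2.receive(b)    # 3
d = p2.send()        # 4, carried by m2
e = p3.tick()        # 1
f = p3.receive(d)    # 5
```

Here L(e) < L(b) even though e did not happen before b, which is the shortcoming noted above.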


Vector clocks (Mattern [1989] and Fidge [1991])

Vector clocks fix the problem in Lamport’s clock.

Vector clock: an array of N integers for a system with N processes. Each process pi has its own local vector clock Vi.

Rules for updating clocks:

  • VC1: initially Vi[j] = 0 for i, j = 1, 2, …, N.
  • VC2: just before pi timestamps an event, it sets Vi[i] := Vi[i] + 1.
  • VC3: pi piggybacks t = Vi on every message it sends.
  • VC4: when pi receives (m, t), it sets Vi[j] := max(Vi[j], t[j]) for j = 1, 2, …, N (then adds 1 to its own element using VC2).
  • VC4 is a merge operation.

E.g. at p2: (0, 0, 0) -> (0, 1, 0) -> (0, 2, 0) -> (0, 3, 0). Now p2 receives a message from p3 that piggybacks t = (1, 0, 3), so (0, 3, 0) -> (1, 4, 3).

Vi[i] is precise information; Vi[j] (j ≠ i) is updated from received messages.

Comparison with RIP: RIP has periodic updates and triggered updates; here there are only updates triggered by received messages.


Compare vector timestamps

Meaning of =, ≤, < for vector timestamps:

(1) V = V’ iff V[j] = V’[j] for j = 1, 2, …, N

(2) V ≤ V’ iff V[j] ≤ V’[j] for j = 1, 2, …, N

(3) V < V’ iff V ≤ V’ and V ≠ V’

Examples: (1, 3, 2) < (1, 3, 3); (1, 3, 2) || (2, 3, 1).

Note that e → e’ implies V(e) < V(e’). The converse is also true.

[Figure: the same events with vector timestamps: a = (1,0,0), b = (2,0,0) at p1; c = (2,1,0), d = (2,2,0) at p2; e = (0,0,1), f = (2,2,2) at p3.]
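
Rules VC1 to VC4 and the comparison of timestamps can be sketched together. The names are illustrative, and processes are indexed from 0 rather than 1.

```python
class VectorClock:
    """Vector clock V_i of process i in a system of n processes."""

    def __init__(self, n, i):
        self.v = [0] * n              # VC1
        self.i = i

    def event(self):
        self.v[self.i] += 1           # VC2
        return tuple(self.v)

    def send(self):
        return self.event()           # VC3: timestamp piggybacked on the message

    def receive(self, t):
        self.v = [max(a, b) for a, b in zip(self.v, t)]   # VC4: merge
        return self.event()           # then count the receive event itself

def leq(v, w):
    return all(a <= b for a, b in zip(v, w))

def happened_before(v, w):
    return leq(v, w) and v != w       # V < V'

def concurrent(v, w):
    return not leq(v, w) and not leq(w, v)

p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
a = p1.event()       # (1, 0, 0)
b = p1.send()        # (2, 0, 0)
c = p2.receive(b)    # (2, 1, 0)
d = p2.send()        # (2, 2, 0)
e = p3.event()       # (0, 0, 1)
f = p3.receive(d)    # (2, 2, 2)
```

Unlike Lamport timestamps, these values show directly that b and e are concurrent: `concurrent(b, e)` is true.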


Global states

It is hard to obtain a global state of a distributed system:

  • it consists of the states of multiple processes and the channel states;
  • concurrency, independent failures, no global clock;
  • the state of each process (data and variables) is private information, obtainable only by message passing.

If all processes agreed on the time, the states recorded at the processes at an agreed time would form a global state of the system.

  • But there is no perfect clock synchronization.

How to obtain a meaningful global state from local states recorded at different real times?

Some definitions

  • A history hi of process pi is the series of events that happened at process pi.
  • The state of process pi just before the k-th event is denoted by s_i^k.
  • A global history H is the union of the N process histories.
  • A cut is a subset of the global history that is a union of prefixes of process histories.
  • The global state of a cut is the set of states S = (s1, …, sN), where si is the state of pi just after the last event of pi in the cut.


Cut

A cut C divides all events into PC (those that happened before C) and FC (future events).

A cut C is consistent if there is no message whose sending event is in FC and whose receiving event is in PC.

  • Inconsistent cut: an ‘effect’ without a ‘cause’.
  • It’s enough to check message sending and receiving events in the cut.

Consistent/inconsistent states.

[Figure: events of p1 and p2 against physical time, with messages m1 and m2, showing an inconsistent cut and a consistent cut.]


Check if one global state is consistent

Let S = (s1, …, sN) be a global state assembled from the received state messages.

Let V(si) be the vector timestamp of state si, received from pi.

S is a consistent global state if and only if:

V(si)[i] >= V(sj)[i] for i,j=1,…,N.
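
The condition translates directly into a check over the vector timestamps. The function name is illustrative, and indices start at 0.

```python
def is_consistent(timestamps):
    """timestamps[i] is V(s_i), the vector timestamp of p_i's recorded state.

    The state is consistent iff no process has seen more events of p_i
    than p_i itself had performed when its state was recorded.
    """
    n = len(timestamps)
    return all(timestamps[i][i] >= timestamps[j][i]
               for i in range(n) for j in range(n))

print(is_consistent([(2, 0), (2, 1)]))  # True
print(is_consistent([(1, 0), (2, 1)]))  # False: p2 saw a message p1 has not yet sent
```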

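[Figure: lattice of global states for two processes p1 and p2 exchanging messages m1 and m2, where Sij = global state after i events at process 1 and j events at process 2. States shown: S00, S10, S20, S21, S30, S31, S32, S22, S23, S33, S43, arranged in levels 0 to 7. Vector timestamps shown: (1,0), (2,0), (3,0), (4,3), (2,1), (2,2), (2,3). Variable values shown: x1 = 1, 100, 105, 90; x2 = 100, 95, 90. Cuts C1 and C2 are marked.]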

Transactions

Atomicity of transactions

  • Isolation: they are not affected by operations being performed for other concurrent clients.
  • “All or nothing” effect: either all of the operations are completed successfully, or they have no effect at all, even in the presence of server crashes.


Transactions

Isolation

  • Synchronize operations at the server side.
  • One way: perform the transactions serially.
  • This is not suitable for servers whose resources are shared by multiple users.
  • The aim for any server that supports transactions is to maximize concurrency.
  • Hence: concurrency control.

“All or nothing”

  • The objects must be recoverable.
  • When a server acknowledges the completion of a client’s transaction, it must have recorded the objects in permanent storage.

How to add transaction capabilities to servers?

  • Each transaction is created and managed by a coordinator.
  • A transaction is a cooperation between a client program, some recoverable objects, and a coordinator.
  • The client invokes “openTransaction” to introduce a new transaction; the returned TID (transaction identifier) is passed to later operations, e.g. deposit(trans, amount).
  • The client invokes “closeTransaction” to indicate its end.

openTransaction() -> trans;
closeTransaction(trans) -> (commit, abort);
abortTransaction(trans);


Concurrency control

‘Lost update’ problem

  • two transactions both read the old value of a variable and use it to calculate a new value.

‘Inconsistent retrieval’ problem

  • a retrieval transaction runs concurrently with an update transaction.

There is no such problem if transactions are done one at a time.

Serially equivalent interleaving

  • An interleaving of the operations of transactions such that its effect is the same as if the transactions were performed one at a time.
  • It avoids these problems.

“The same effect” means:

  • the read operations return the same values;
  • the instance variables of the objects have the same values at the end.
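
The lost update problem can be reproduced with a toy schedule runner. Everything here is hypothetical: one shared balance, two transactions T and U that each read it and then write back the value they read plus an amount.

```python
def run(schedule, balance):
    """Apply (transaction, operation, amount) steps to one shared balance."""
    seen = {}  # the value each transaction read
    for txn, op, amount in schedule:
        if op == "read":
            seen[txn] = balance
        elif op == "write":
            # The new value is computed from what this transaction read earlier.
            balance = seen[txn] + amount
    return balance

serial = [("T", "read", None), ("T", "write", 10),
          ("U", "read", None), ("U", "write", 20)]
interleaved = [("T", "read", None), ("U", "read", None),
               ("T", "write", 10), ("U", "write", 20)]

print(run(serial, 100))       # 130
print(run(interleaved, 100))  # 120: T's update is lost
```

The interleaved schedule is not serially equivalent: U’s read returns a different value than in any serial order, and the final balance differs too.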


Recoverability from aborts

Dirty reads

  • caused by the interaction between a read operation in one transaction U and an earlier write operation in another transaction T on the same object, where T aborts after U has committed.
  • a transaction that committed with a ‘dirty read’ is not recoverable.
  • Fix: delay the commit operation of U until T has committed or aborted.

Cascading aborts: the aborting of a transaction may cause other transactions to be aborted.

  • To avoid them, transactions are only allowed to read objects that were written by committed transactions.
  • Avoidance of cascading aborts is a stronger condition than recoverability.

Premature writes

  • caused by the interaction between ‘write’ operations on the same object in different transactions.

Strict executions of transactions

  • to avoid both ‘dirty reads’ and ‘premature writes’;
  • delay both read and write operations;
  • executions of transactions are called strict if both read and write operations on an object are delayed until all transactions that previously wrote that object have either committed or aborted.


Concurrency control approaches

Understand the basic steps and main ideas of the following three techniques.

Locking

  • Used by most practical systems.
  • Set a lock on each object just before it is accessed, and remove these locks when the transaction has completed.
  • The lock is labeled with the transaction ID.
  • Only the corresponding transaction can access the locked object. Other transactions may wait or, in some cases, share the lock (such as sharing read locks).
  • Problem: deadlock

Optimistic concurrency control

  • A transaction proceeds until it asks to commit.
  • Before it is allowed to commit, the server checks whether this transaction has performed operations on objects that conflict with the operations of other concurrent transactions.

Timestamp ordering

  • For each object, the server records the most recent time of the read and write operations on it.
  • For each operation, the timestamp of the transaction is compared with the timestamp of the object to determine whether the operation can be done, delayed, or rejected.


Nested Transactions

Flat transactions vs. nested transactions

  • Structured as an inverted-root tree.
  • The outermost transaction is the top-level transaction. The others are sub-transactions.
  • Sub-transactions at the same level can run concurrently.
  • Each sub-transaction can fail independently of its parent and of the other sub-transactions.

Main advantages of nested transactions

  • Additional concurrency in a transaction: sub-transactions at one level may run concurrently with other sub-transactions at the same level in the hierarchy.
  • More robust: sub-transactions can commit or abort independently.
  • For example, a transaction to deliver a mail message to a list of recipients.


Nested Transactions

  • The rules for committing of nested transactions:
  • A transaction commits or aborts only after its child transactions have completed.
  • When a sub-transaction completes, it makes an independent decision either to commit provisionally or to abort. Its decision to abort is final.
  • When a parent aborts, all of its sub-transactions are aborted, even though some of them may have provisionally committed.
  • When a sub-transaction aborts, the parent can decide whether to abort or not.
  • When the top-level transaction commits, all of the sub-transactions that have provisionally committed can commit too, provided that none of their ancestors has aborted.

[Figure: nested transaction tree. T is the top-level transaction; it opens T1 and T2 with openSubTransaction, and further openSubTransaction calls create T11, T12, T21, and T211. Outcomes shown: five provisional commits, one abort, and the final commit of T.]


The coordinator of a distributed transaction

A client starts a transaction by sending an openTransaction request to a coordinator in any server.

  • The transaction ID must be unique within the distributed system.
  • A simple way: TID = <server ID, a number unique to the server>.
  • The coordinator that opened the transaction becomes the coordinator of the distributed transaction; all the servers involved are participants.

During the progress of the transaction, the coordinator records a list of references to the participants, and each participant records a reference to the coordinator.

Join(Trans, reference to participant): informs a coordinator that a new participant has joined the transaction Trans.


Atomic commit protocols

Two-phase commit protocol

  • allows any participant to abort its part of a transaction;
  • if one part of a transaction is aborted, then the whole transaction must be aborted.

General idea

  • In the first phase, each participant votes for the transaction to be committed or aborted.
  • Once a participant has voted to commit a transaction, it is not allowed to abort it. It is in a prepared state.
  • In the second phase of the protocol, every participant in the transaction carries out the joint decision.

The problem is to ensure that all the participants vote and that they all reach the same decision, even in the presence of server failures and lost messages.


Two-phase commit protocol

When the client requests “abortTransaction”, or a participant aborts, the coordinator informs the participants immediately.

The two-phase commit protocol is used when the client asks the coordinator to commit the transaction.
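
The two phases can be sketched as below. This is a failure-free illustration only: there are no timeouts, no lost messages, and no permanent storage, and the class and function names are assumptions for the example.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self._can_commit = can_commit
        self.state = "active"

    def vote(self):
        """Phase 1: answer the coordinator's canCommit? request."""
        if self._can_commit:
            self.state = "prepared"   # after voting Yes it may no longer abort on its own
            return True
        self.state = "aborted"
        return False

    def decide(self, commit):
        """Phase 2: carry out the joint decision."""
        if self.state != "aborted":
            self.state = "committed" if commit else "aborted"

def two_phase_commit(participants):
    votes = [p.vote() for p in participants]   # phase 1: collect every vote
    decision = all(votes)                      # a single No aborts the whole transaction
    for p in participants:                     # phase 2: announce the decision
        p.decide(decision)
    return decision
```

For example, `two_phase_commit([Participant("X"), Participant("Y", can_commit=False)])` returns `False` and leaves both participants in the aborted state.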


Failures in two-phase commit protocol

Server failure

  • Each server saves information about the two-phase commit protocol in its permanent storage.

Communication failure

  • There are several stages at which the coordinator or a participant cannot progress until it receives another request or reply message from the others.
  • Timeouts: to avoid blocking a process that is waiting for reply or request messages.

First example: after a participant has voted “Yes”, it waits for the coordinator to report the vote result.

  • On timeout, it sends a “getDecision” request to the coordinator to determine the result.
  • Problem: if the coordinator has failed, the participant may wait for a long time.
  • Fix: obtain the vote result by contacting other participants instead of only contacting the coordinator.

Second example: a participant hasn’t received a “canCommit?” call from the coordinator after it has carried out all the client requests in the transaction.

  • Detected when there has been no request for a particular transaction for a while. The participant aborts.

Third example: the coordinator is waiting for votes from the participants.

  • It aborts the transaction after a timeout.

Two-phase commit protocol for nested transactions

Each sub-transaction starts after its parent and finishes before it.

When a sub-transaction completes, it makes an independent decision either to commit provisionally or to abort.

Difference between provisional commit and prepared to commit

  • Provisional commit: nothing is saved on permanent storage. It only means that the sub-transaction has finished correctly and will agree to commit when it is asked to.
  • Prepared to commit: guarantees that a sub-transaction will be able to commit.

After all sub-transactions have completed, the provisionally committed sub-transactions without aborted ancestors participate in a two-phase commit protocol.

  • When a top-level transaction completes, its coordinator performs a two-phase commit protocol.

A sub-transaction’s ID is an extension of its parent’s ID.

  • From it, one can get the IDs of all its ancestors.


Two-phase commit protocol for nested transactions

The coordinator of a parent transaction has a list of its child sub-transactions.

When a sub-transaction provisionally commits, it reports its status and the status of its descendants to its parent.

When a sub-transaction aborts, it just reports the abort to its parent.

The client completes a set of nested transactions by invoking the “closeTransaction” or “abortTransaction” operation on the coordinator of the top-level transaction (the coordinator of this set of nested transactions).

Participants: the coordinators of all the sub-transactions in the tree that have provisionally committed but do not have aborted ancestors.

The two-phase commit protocol may be performed in a hierarchical manner or in a flat manner.


Hierarchical two-phase commit protocol

The coordinator of the top-level transaction communicates with the coordinators of its child sub-transactions, and so on down the tree.

“canCommit?” call

  • The second argument is the TID of the participant making the “canCommit?” call.
  • When a participant receives the call, it looks in its transaction list for any provisionally committed transaction that matches the TID in the second argument.
  • Example: the coordinators of T12 and T21.
  • If the participant finds any such sub-transactions, it prepares the objects and replies with a Yes vote.
  • If it fails to find any, it replies with a No vote.

Each participant collects the replies from its descendants before replying to its parent.


Flat two-phase commit protocol

The coordinator of the top-level transaction sends “canCommit?” messages to the coordinators of all the provisionally committed sub-transactions.

Why is there an “abortList” in the “canCommit?” call?

  • Example: T12 and T21 are both provisionally committed.
  • The “abortList” is a list of aborted sub-transactions.
  • A participant can commit only sub-transactions with no aborted ancestors.

When a participant receives a “canCommit?” request:

If the participant has some provisionally committed sub-transactions:

  • Check that they do not have aborted ancestors in the “abortList”, then prepare to commit them;
  • Those with aborted ancestors are aborted.
  • Send a Yes vote to the coordinator.

If it has no provisionally committed sub-transactions, it sends a No vote to the coordinator.

Compared with the hierarchical protocol

  • In the hierarchical protocol, at each stage, the participant only needs to look for sub-transactions according to the information in the second argument.
  • The flat protocol needs to use the abort list to eliminate transactions whose ancestors have aborted.
  • The advantage of the flat protocol: the coordinator of the top-level transaction can communicate directly with all the participants.
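
A participant’s handling of a flat “canCommit?” request can be sketched as follows. The representation is an assumption: TIDs are tuples in which a sub-transaction’s ID extends its parent’s ID, so ("T", 1, 2) stands for T12, and the function names are illustrative.

```python
def ancestors(tid):
    """All proper prefixes of a TID, i.e. the IDs of its ancestors."""
    return [tid[:k] for k in range(1, len(tid))]

def can_commit(provisional, abort_list):
    """Return (vote, to_prepare, to_abort) for one participant.

    provisional: TIDs of this participant's provisionally committed
    sub-transactions; abort_list: the abortList sent by the coordinator.
    """
    aborted = set(abort_list)
    to_prepare, to_abort = [], []
    for tid in provisional:
        if any(a in aborted for a in ancestors(tid)):
            to_abort.append(tid)      # an ancestor aborted, so this one must too
        else:
            to_prepare.append(tid)    # prepare to commit
    vote = bool(provisional)          # No vote only if nothing was found
    return vote, to_prepare, to_abort

# T12 and T21 are provisionally committed, but T2 is in the abort list.
print(can_commit([("T", 1, 2), ("T", 2, 1)], abort_list=[("T", 2)]))
# (True, [('T', 1, 2)], [('T', 2, 1)])
```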


Distributed deadlocks

Detection: find a cycle in the global wait-for graph.

Simple approach: centralized deadlock detection

  • One server is selected as the global deadlock detector.
  • Each server sends the latest copy of its local wait-for graph to this distinguished server.

Problems:

  • poor availability, lack of fault tolerance, no ability to scale, and high traffic;
  • Phantom deadlock: a deadlock that is detected but is not really a deadlock.
  • It takes time to transmit local wait-for graphs. During that time, some locks may be released, so that there is no longer a cycle in the new global wait-for graph.
  • Fix for phantom deadlock: in a two-phase locking scheme, transactions cannot release objects before committing or aborting, so a phantom deadlock can only happen when some transaction aborts. A phantom deadlock can therefore be detected by informing the detector of aborted transactions.
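
The detector’s job, finding a cycle in the merged wait-for graph, can be sketched with a depth-first search. The graph representation (a dict from each transaction to the transactions it waits for) and the function names are assumptions for the example.

```python
def merge(local_graphs):
    """Union of the local wait-for graphs sent by the servers."""
    merged = {}
    for graph in local_graphs:
        for txn, waits in graph.items():
            merged.setdefault(txn, set()).update(waits)
    return merged

def find_cycle(wait_for):
    """Return one cycle as a list of transactions, or None if there is none."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in wait_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

# Local graphs at three servers: W waits for U, U waits for V, V waits for W.
graph = merge([{"W": {"U"}}, {"U": {"V"}}, {"V": {"W"}}])
print(find_cycle(graph))  # ['W', 'U', 'V', 'W']
```

If one of the local graphs is stale by the time it arrives, the cycle reported here may be a phantom deadlock.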


Distributed deadlock detection: edge chasing

A distributed approach

  • No global wait-for graph.
  • Servers try to find cycles by forwarding probe messages.

A probe message contains transaction wait-for relationships representing a path in the global wait-for graph.

When does a server send out a probe message?

  • Answer: when a new edge is inserted and this insertion may cause a distributed deadlock.

Example

  • If server X adds the edge W → U and, at this moment, U is waiting to access object B at server Y, then X sends a probe message to server Y.
  • Otherwise, X doesn’t need to send a probe message.

How does X know whether U is waiting or not?

  • The coordinator of U knows whether U is active or is waiting for an object at some server.


Distributed deadlock detection: edge chasing

Initiation:

  • When a server X finds that T starts waiting for U, and U is waiting to access an object at another server Y, X initiates detection by sending a probe message containing the edge T → U to Y.

Detection:

  • consists of receiving probe messages, deciding whether deadlock has happened, and deciding whether to forward the probe messages.
  • For example, Y first finds that U is waiting for V, so it appends the edge U → V and checks whether there is a cycle. If there is no cycle and transaction V is waiting for an object at another server, the new probe message is forwarded.
  • The path in the probe message grows one edge at a time.

Resolution: a transaction in the cycle is selected to abort.

Example:

  • Server X initiates detection by sending the probe message <W → U> to server Y;
  • Y appends V to produce <W → U → V> and forwards it to Z;
  • Z appends W to produce <W → U → V → W>. A cycle is detected.
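
The example can be replayed with a small simulation. It is a sketch under simplifying assumptions: each transaction waits for at most one other transaction, `waiting[t] = (u, server)` means that t waits for u and is blocked at `server`, and all names are hypothetical.

```python
def edge_chase(waiting, start):
    """Follow probe messages from the edge start -> waiting[start][0].

    Returns (cycle, messages): cycle is the probe path that closed a loop,
    or None; messages lists each (destination server, probe path) sent.
    """
    holder, _ = waiting[start]
    probe = [start, holder]                  # initial probe <T -> U>
    messages = []
    while probe[-1] in waiting:              # the last transaction is itself waiting
        nxt, server = waiting[probe[-1]]
        messages.append((server, tuple(probe)))   # forward probe to that server
        probe.append(nxt)                    # the server appends one more edge
        if nxt in probe[:-1]:
            return probe, messages           # the path closed: deadlock
    return None, messages                    # reached an active transaction

waiting = {"W": ("U", "X"), "U": ("V", "Y"), "V": ("W", "Z")}
cycle, messages = edge_chase(waiting, "W")
print(cycle)          # ['W', 'U', 'V', 'W']
print(len(messages))  # 2 probes for a cycle of 3 transactions
```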


Distributed deadlock detection: edge chasing

Another version: after T → U is inserted, let the coordinator of U decide whether to forward this probe message.

There is no phantom deadlock in edge chasing. Why?

One problem of the edge-chasing algorithm is that, in theory, it needs to forward N-1 messages to detect a cycle involving N transactions.

  • Fortunately, in practice, most deadlocks involve only two transactions.

Another problem: in a deadlock cycle, every transaction can cause the initiation of deadlock detection, which may result in more than one transaction being aborted.

Fix: transaction priorities

  • Timestamps: always abort the transaction with the lowest priority in a cycle.
  • Transaction priorities can also be used to reduce the number of initiations of deadlock detection.
  • That is, detection is initiated only when a higher-priority transaction starts to wait for a transaction with lower priority.


Transaction Recovery

Main tasks of a recovery manager:

  • To save objects in permanent storage (i.e., a recovery file) for committed transactions
  • To restore the server’s objects after a crash
  • To reorganize the recovery file to improve the performance of recovery

Intentions list of a particular transaction

  • A list of the references and the values of all objects that are updated by this transaction.

Logging

  • The recovery file represents a log containing the history of all the transactions performed by a server.
  • The history consists of values of objects, transaction status entries, and intentions lists of transactions.
  • In practice, the recovery file contains a recent snapshot of the values of all the objects in the server, followed by a history of transactions after that snapshot.


Logging

During normal operation of a server, its recovery manager is called:

When a transaction prepares to commit,

  • it appends all the objects in the transaction’s intentions list to the recovery file, followed by the current status of that transaction and its intentions list.

When a transaction commits or aborts,

  • it appends the corresponding status of the transaction.

Each transaction status entry contains a pointer to the previous transaction status entry; the first transaction status entry points to the snapshot.

Server failure

  • Only the last write is affected.
  • Any transaction without a committed status in the log is aborted.


Logging: recovery of objects

After a crash, a new server process first sets default initial values for its objects, then calls its recovery manager.

First approach: start from the beginning of the log

  • Restore the values of all the objects from the most recent checkpoint (snapshot).
  • For committed transactions, replace the values of objects.
  • Problem: there may be a large number of update operations.

Second approach: read the recovery file backwards

  • Use the pointers in the transaction status entries.
  • For committed transactions, restore the values of objects if their values haven’t already been restored.
  • Advantage: each object is updated only once.
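
The second approach can be sketched with a simplified log. The layout is an assumption: each entry is (TID, status, intentions), where intentions maps object names to the values the transaction wrote, and the snapshot holds the checkpointed values.

```python
def recover_backwards(log, snapshot):
    """Rebuild object values by reading the log from newest to oldest entry."""
    restored = {}
    for tid, status, intentions in reversed(log):
        if status != "committed":
            continue                  # no committed status: treated as aborted
        for obj, value in intentions.items():
            # Only the first value met (the newest committed one) is kept,
            # so each object is updated once.
            restored.setdefault(obj, value)
    # Objects untouched since the checkpoint keep their snapshot values.
    return {**snapshot, **restored}

log = [("T1", "committed", {"A": 80}),
       ("T2", "prepared", {"B": 0}),
       ("T3", "committed", {"A": 60, "B": 220})]
print(recover_backwards(log, {"A": 100, "B": 200, "C": 300}))
# {'A': 60, 'B': 220, 'C': 300}
```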


Logging: reorganizing the recovery file

Goal: to make the process of recovery faster and to reduce space.

Checkpointing: the process of writing the current committed values of a server’s objects to a new recovery file, together with the transaction status entries and intentions lists of transactions that have not yet been committed.

Checkpointing needs to be done from time to time, since recovery may not happen very often.

Its steps:

  • Add a mark to the current recovery file.
  • Write the values of objects to a new log file.
  • Copy the entries before that mark that relate to uncommitted transactions.
  • Copy all entries after the mark.

The current log file stays in use until the new one is complete.