Transaction Support in Windows NTFS
Surendra Verma
Development Manager, Windows File Systems, Microsoft
Transactional NTFS (TxF)
- Adds transaction support for all NTFS file operations:
  - Full Atomicity, Consistency, Isolation, Durability
  - Allows an arbitrary number of file system operations to be treated as an atomic unit
  - Reads, writes, file creations, deletions, renames, security, object-id, named-streams, quota etc.
Semantics - isolation
- Transaction 1:
  - File 1
  - File 2 <- New File
  - File 3 <- Deleted File
- Transaction 2:
  - File 1
  - File 3
- Transactions don't see changes made by other transactions
- Same for non-transactions
- Contemplating Dirty Reader as an isolation level
- Committed Read without blocking Reader for Writers
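The isolation rule on this slide can be modeled as each transaction seeing the committed namespace plus its own uncommitted changes. The sketch below is an in-memory model only: `Volume`, `Txn`, and their methods are hypothetical names chosen for illustration and are not TxF APIs.

```python
class Volume:
    """Committed namespace: file name -> contents (hypothetical model)."""
    def __init__(self, files):
        self.committed = dict(files)

    def listing(self):
        # View for non-transacted readers: committed state only.
        return sorted(self.committed)


class Txn:
    """A transaction's private view layered over the committed namespace."""
    def __init__(self, volume):
        self.volume = volume
        self.created = {}     # files created or rewritten in this transaction
        self.deleted = set()  # committed files deleted in this transaction

    def create(self, name, data=b""):
        self.deleted.discard(name)
        self.created[name] = data

    def delete(self, name):
        self.created.pop(name, None)
        if name in self.volume.committed:
            self.deleted.add(name)

    def listing(self):
        names = (set(self.volume.committed) - self.deleted) | set(self.created)
        return sorted(names)

    def commit(self):
        # Publish this transaction's changes to the committed namespace.
        for name in self.deleted:
            self.volume.committed.pop(name, None)
        self.volume.committed.update(self.created)
        self.created, self.deleted = {}, set()


vol = Volume({"File 1": b"a", "File 3": b"c"})
t1, t2 = Txn(vol), Txn(vol)
t1.create("File 2")
t1.delete("File 3")
print(t1.listing())   # ['File 1', 'File 2']
print(t2.listing())   # ['File 1', 'File 3']
print(vol.listing())  # ['File 1', 'File 3']
```

Only after `t1.commit()` do the new and deleted files become visible outside Transaction 1 in this model.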
Semantics - locking
- File is the unit of locking
- File locked for update for the duration of the transaction
- Other handles in the same transaction allowed to update
- Can be read concurrently (& consistently) by the others
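A minimal sketch of these locking rules, assuming a simple table that maps each file to the transaction holding its update lock. `LockTable` and `FileLockedError` are hypothetical names for illustration, not part of any Windows API.

```python
class FileLockedError(Exception):
    """Raised when another transaction already holds the update lock."""


class LockTable:
    """Tracks which transaction holds the update lock on each file."""
    def __init__(self):
        self.owner = {}  # file name -> id of the transaction holding the lock

    def open_for_update(self, txn_id, name):
        holder = self.owner.get(name)
        if holder is not None and holder != txn_id:
            raise FileLockedError(f"{name} is locked by transaction {holder}")
        self.owner[name] = txn_id  # held until the transaction ends

    def open_for_read(self, txn_id, name):
        # Readers are never blocked in this model; they read committed data.
        return True

    def end_transaction(self, txn_id):
        # Commit or abort releases every lock the transaction holds.
        self.owner = {n: t for n, t in self.owner.items() if t != txn_id}


locks = LockTable()
locks.open_for_update("T1", "report.txt")
locks.open_for_update("T1", "report.txt")    # second handle, same transaction
can_read = locks.open_for_read("T2", "report.txt")
try:
    locks.open_for_update("T2", "report.txt")
    blocked = False
except FileLockedError:
    blocked = True
print(can_read, blocked)  # True True
```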
Demo
Volumes and RMs
- Each volume comes with a Transactional Resource Manager (RM) by default
- Its log is resident on the volume. Recovery is automatic and admin-free.
- Many secondary RMs may be created in various places within the volume
- Their logs can be anywhere on the machine
- Their admin/recovery is user-controlled via APIs
- Designed to be embedded in other stores/applications
Logging Modes
- Undo-Only logging: Minimizes logging and supports ACID semantics
- Redo-Undo logging: "redo" is logged as well => log contains complete history of changes
- Allows playback from a backup to achieve consistency at a chosen point in time
- Logging Mode can be set for secondary RMs and toggled live
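The difference between the two modes can be sketched with a toy key-value store standing in for file contents. This is an illustration under stated assumptions, not TxF's log format: `LoggingRM`, `replay`, and the record fields are hypothetical, and the model does not track commit status, so `replay` assumes every record belongs to a committed transaction.

```python
class LoggingRM:
    """Toy resource manager with a switchable logging mode (hypothetical)."""
    def __init__(self, mode="undo-only"):
        self.mode = mode
        self.data = {}  # key -> value, stands in for file contents
        self.log = []   # list of log records (dicts)

    def set_mode(self, mode):
        self.mode = mode  # toggled live; affects records written from now on

    def write(self, txn, key, value):
        record = {"txn": txn, "key": key, "undo": self.data.get(key)}
        if self.mode == "redo-undo":
            record["redo"] = value  # new value kept so history can be replayed
        self.log.append(record)     # log first, then apply the change
        self.data[key] = value

    def rollback(self, txn):
        # Apply undo images newest-first; None means the key did not exist.
        for rec in reversed(self.log):
            if rec["txn"] == txn:
                if rec["undo"] is None:
                    self.data.pop(rec["key"], None)
                else:
                    self.data[rec["key"]] = rec["undo"]


def replay(backup, log, upto):
    """Roll a backup forward by applying the first `upto` redo records."""
    state = dict(backup)
    for rec in log[:upto]:
        state[rec["key"]] = rec["redo"]
    return state


rm = LoggingRM("redo-undo")
backup = dict(rm.data)          # backup taken before any writes
rm.write("T1", "a", 1)
rm.write("T2", "a", 2)
rm.write("T2", "b", 3)
print(replay(backup, rm.log, 1))  # {'a': 1}
print(replay(backup, rm.log, 3))  # {'a': 2, 'b': 3}

undo_only = LoggingRM()
undo_only.write("T1", "x", 10)
undo_only.write("T1", "x", 20)
undo_only.rollback("T1")
print(undo_only.data)             # {}
```

Undo-only records are smaller because they omit the new value, which is why that mode cannot roll a backup forward.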
Implementation Details
[Architecture diagram. Components and interfaces shown: DTC; other resources like SQL; WinFX APIs; OLE APIs; Win32; Win32 I/O APIs; Win32 Log APIs; Kernel Mode TM APIs; Kernel Mode Log APIs; Kernel Mode Transaction Manager; Common Log Manager; Transactional NTFS (IRP based).]
TxF Recovery
- Two types of content treated differently.
- Metadata: names, attributes, security etc. TxF builds upon NTFS recovery.
- Data: built from scratch.
- Both use the write-ahead logging technique.
Namespace Isolation
- Main-memory balanced binary trees used.
TxF Data Recovery
[Diagram. Labels: TOPS Stream; TxF Log; Undo Page Write; Log record; Undo Image of Page.]
TxF Data Recovery
- For undo-only logging mode: files flushed on commit.
- TOPS stream pages and log written independently of each other.
- A page changed multiple times in a transaction gets logged only once.
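The "logged only once" rule can be sketched as saving a page's undo image on its first change within a transaction and skipping the log on later changes. This is a simplified in-memory model: `PagedFile` is a hypothetical name, and the TOPS stream and on-disk log layout are not modeled.

```python
class PagedFile:
    """Toy file made of pages, with per-transaction undo images (hypothetical)."""
    def __init__(self, pages):
        self.pages = list(pages)  # current page contents
        self.undo_log = []        # (txn, page_no, undo_image) records
        self._logged = set()      # (txn, page_no) pairs already logged

    def write_page(self, txn, page_no, data):
        if (txn, page_no) not in self._logged:
            # First change to this page in this transaction: save the old image.
            self.undo_log.append((txn, page_no, self.pages[page_no]))
            self._logged.add((txn, page_no))
        self.pages[page_no] = data

    def abort(self, txn):
        # Restore the saved images, newest first, then drop the records.
        for t, page_no, image in reversed(self.undo_log):
            if t == txn:
                self.pages[page_no] = image
        self._forget(txn)

    def commit(self, txn):
        # The slide says files are flushed on commit in undo-only mode; here the
        # in-memory pages are simply treated as durable and undo images dropped.
        self._forget(txn)

    def _forget(self, txn):
        self.undo_log = [r for r in self.undo_log if r[0] != txn]
        self._logged = {k for k in self._logged if k[0] != txn}


f = PagedFile([b"A", b"B"])
f.write_page("T1", 0, b"A1")
f.write_page("T1", 0, b"A2")   # same page, same transaction: no new record
records_written = len(f.undo_log)
f.abort("T1")
print(records_written, f.pages)  # 1 [b'A', b'B']
```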
Questions
Transaction Management (KTM)
- Coordinates commit/abort processing between the various actors:
  - Resource Managers (RMs) (e.g., SQL, TxF)
  - Applications
- Persistently maintains outcome of transactions using the common-log
- Lives in the kernel with Win32 and kernel interfaces
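A minimal sketch of commit/abort coordination, assuming a two-phase prepare-then-commit protocol, which the slide does not spell out. `RM`, `TransactionManager`, and `outcome_log` are hypothetical names; the list stands in for the persistent common log.

```python
class RM:
    """A resource manager enlisted in a transaction (e.g. a database or TxF)."""
    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "active"

    def prepare(self):
        # Phase 1: vote on whether this RM can make its changes durable.
        self.state = "prepared" if self.can_prepare else "failed"
        return self.can_prepare

    def finish(self, outcome):
        # Phase 2: apply the outcome decided by the transaction manager.
        self.state = "committed" if outcome == "commit" else "aborted"


class TransactionManager:
    def __init__(self):
        self.outcome_log = []  # stands in for the persistent common log

    def commit(self, txn_id, rms):
        votes = [rm.prepare() for rm in rms]
        outcome = "commit" if all(votes) else "abort"
        # Record the outcome before telling any RM, so it survives a crash.
        self.outcome_log.append((txn_id, outcome))
        for rm in rms:
            rm.finish(outcome)
        return outcome


tm = TransactionManager()
sql, txf = RM("SQL"), RM("TxF")
first = tm.commit("T1", [sql, txf])
bad = RM("TxF", can_prepare=False)
second = tm.commit("T2", [RM("SQL"), bad])
print(first, second)  # commit abort
```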
Why Common Logging
- Group log writes from multiple clients into a single physical disk I/O
- Single logical view log for tightly coupled but distinct resources
- Ease of configuration, archival, media recovery, and administration
- Single paradigm for high-bandwidth logging on Windows
Common Logging
- Multiple clients sharing a single logical log stream
- Each client has exclusive use of a virtual log stream
- Common log manager multiplexes multiple client streams to single logical log stream
- Multiplexing separated from I/O
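The multiplexing idea can be sketched as several per-client virtual streams appending to one shared record list, with a separate flush step that writes a whole batch at once. `CommonLog` and `VirtualStream` are hypothetical names for illustration; they are not the Common Log Manager's interfaces.

```python
class CommonLog:
    """One shared log that several clients append to (hypothetical model)."""
    def __init__(self):
        self.records = []         # flushed records: (client, payload)
        self.pending = []         # appended but not yet flushed
        self.physical_writes = 0  # number of simulated disk I/Os

    def open_stream(self, client):
        return VirtualStream(self, client)

    def append(self, client, payload):
        # Multiplexing: tag the record with its client and queue it.
        self.pending.append((client, payload))

    def flush(self):
        # I/O is separate from multiplexing: one write covers the whole batch.
        if self.pending:
            self.records.extend(self.pending)
            self.pending = []
            self.physical_writes += 1


class VirtualStream:
    """A client's private view of the shared log."""
    def __init__(self, log, client):
        self.log = log
        self.client = client

    def append(self, payload):
        self.log.append(self.client, payload)

    def read(self):
        # Each client sees only its own flushed records, in order.
        return [p for c, p in self.log.records if c == self.client]


log = CommonLog()
txf = log.open_stream("TxF")
ktm = log.open_stream("KTM")
txf.append("undo page 7")
ktm.append("T1 committed")
txf.append("undo page 9")
log.flush()
print(log.physical_writes)  # 1
print(txf.read())           # ['undo page 7', 'undo page 9']
print(ktm.read())           # ['T1 committed']
```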
Example: two file updates
- Program writes to file1, then to file2
- System/application crashes
- Are the updates on disk?
- Both on disk? Some? None?
- Do others see updates as they occur?
- What if the system showed the previous (consistent) state until app ready to expose them?
- Same issues if only one file is involved!
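The questions above can be made concrete with a toy in-memory store. The sketch compares a plain two-step update with one that stages both writes and exposes them together. All names are hypothetical, a dictionary stands in for the disk, and `Crash` simulates a failure between the two writes.

```python
class Crash(Exception):
    """Simulated system or application crash."""


def update_both(store, crash_after_first=False):
    # Plain updates: each write is visible as soon as it happens.
    store["file1"] = "new"
    if crash_after_first:
        raise Crash()
    store["file2"] = "new"


def update_both_transacted(store, crash_after_first=False):
    # Staged updates: nothing is visible until both writes are ready.
    staged = {}
    staged["file1"] = "new"
    if crash_after_first:
        raise Crash()      # staged changes are simply discarded
    staged["file2"] = "new"
    store.update(staged)   # "commit": expose both changes together


plain = {"file1": "old", "file2": "old"}
try:
    update_both(plain, crash_after_first=True)
except Crash:
    pass
print(plain)       # {'file1': 'new', 'file2': 'old'}

transacted = {"file1": "old", "file2": "old"}
try:
    update_both_transacted(transacted, crash_after_first=True)
except Crash:
    pass
print(transacted)  # {'file1': 'old', 'file2': 'old'}
```

After the simulated crash, the plain store holds a mix of old and new contents, while the staged version still shows the previous consistent state.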
Scenario: Update a web-site
- Hide temporary inconsistencies
- System handles data recovery on app failure or system crash
- System guarantees that updates survive crash once committed
- On high-end, archive transaction logs for shipping or media recovery
Scenario: Remote file Copy
- Reliable copy of file over the network
- Cheap, low-level message transfer coordinated with other transaction work.
- Pass data between branch office and central office (financial institutions, retail)
- Frequently mentioned scenario by our customers
Document Management
- Files in the file-system, file-attributes in a relational database
- Transaction maintains consistency between the two
- Makes it possible to integrate administration