TrapperKeeper: The Case for Using Virtualization to Add Type Awareness to File Systems
Daniel Peek, Jason Flinn
Facebook, University of Michigan
University of Michigan 2
Trapper Keeper
- Need access to type-specific metadata
– Searching, Organization, Presentation
- Extracting metadata is hard
– Lots of file types out there
– Custom code required for each type
- A better way to get metadata
The Plug-in Solution
- Developers make plug-ins for each type
[Diagram: a metadata-using application issues a query (MP3s with Composer = Mozart) to a metadata engine with per-type plug-ins (JPEG plug-in, MP3 plug-in, …)]
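The plug-in arrangement on this slide can be sketched roughly as follows. This is a minimal illustration, not the design of any real metadata engine: the names `PLUGINS`, `register`, `extract`, and `query` are hypothetical, and the MP3 plug-in is a placeholder that inspects the file name instead of parsing real tags.

```python
import os
from typing import Callable, Dict, List

# Hypothetical registry: file extension -> function returning a metadata dict.
Plugin = Callable[[str], Dict[str, str]]
PLUGINS: Dict[str, Plugin] = {}


def register(extension: str) -> Callable[[Plugin], Plugin]:
    """Decorator that installs a plug-in for one file extension."""
    def decorator(func: Plugin) -> Plugin:
        PLUGINS[extension.lower()] = func
        return func
    return decorator


@register(".mp3")
def mp3_plugin(path: str) -> Dict[str, str]:
    # Placeholder logic (an assumption for illustration): a real plug-in
    # would parse the file's tags rather than look at its name.
    composer = "Mozart" if "mozart" in path.lower() else "Unknown"
    return {"Composer": composer}


def extract(path: str) -> Dict[str, str]:
    """Return metadata for a file, or an empty dict if no plug-in exists."""
    extension = os.path.splitext(path)[1].lower()
    plugin = PLUGINS.get(extension)
    return plugin(path) if plugin else {}


def query(paths: List[str], key: str, value: str) -> List[str]:
    """Return the paths whose extracted metadata has key == value."""
    return [p for p in paths if extract(p).get(key) == value]
```

A type with no registered plug-in yields no metadata at all, which is the gap the following slides describe: every new file type needs its own custom code.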
The Plug-in Solution
- Lots of work for developers
[Screenshots: Mac OS X Spotlight metadata and preview]
The Long Tail
- How big is this problem?
- Big (Agrawal et al.)
- Uneconomical to support all types
The TrapperKeeper Solution
- Already have apps that parse these files
– Apps expose information through GUI
Date Time Original = 2007:11:22 19:21:14
The TrapperKeeper Process
- Once Per Application
– Trap the application
- Once Per File
– Use trapped application to parse the file
– Capture displayed output
[Diagram labels: file, metadata, preview, …]
Trapping Applications
- Run app inside a VM
– Contains app effects
- Make app open a dummy file
- Snapshot at moment of open()
– About to execute file parsing behavior
Parsing with Trapped Apps
- Restart VM
- Switch files
[Diagram labels: dummy, file to parse]
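The trap-once, parse-many idea from the last two slides can be modeled with a toy example. This is only a sketch under loud assumptions: `ToyVM` is a hypothetical stand-in for a virtual machine (not a real hypervisor API), a snapshot is modeled as a deep copy of its state, and "switching files" is modeled by placing the real contents at the dummy path, which the slides do not spell out.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyVM:
    """Hypothetical stand-in for a VM running one application."""
    app: str
    files: Dict[str, str] = field(default_factory=dict)
    opened_path: Optional[str] = None  # file the app is about to parse
    displayed: Optional[str] = None    # what the app's GUI would show

    def open_file(self, path: str) -> None:
        # Moment of open(): the file is chosen but not yet parsed.
        self.opened_path = path

    def resume(self) -> str:
        # The app parses whatever now sits at the opened path.
        self.displayed = f"{self.app} shows: {self.files[self.opened_path]}"
        return self.displayed


def trap(app: str) -> ToyVM:
    """Once per application: open a dummy file and snapshot at open()."""
    vm = ToyVM(app=app, files={"/dummy": ""})
    vm.open_file("/dummy")
    return copy.deepcopy(vm)  # the snapshot


def parse_with_trapped(snapshot: ToyVM, contents: str) -> str:
    """Once per file: restart from the snapshot, switch files, capture output."""
    vm = copy.deepcopy(snapshot)   # restart VM; the snapshot stays pristine
    vm.files["/dummy"] = contents  # switch files (assumed mechanism)
    return vm.resume()
```

Because each parse starts from a fresh copy of the snapshot, one file's effects cannot leak into the next, which mirrors the "contains app effects" point on the trapping slide.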
Accessibility
Window Tab Pane … Text Label: “Date Time Original” Text: “2007:11:22 19:21:14” …
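One way to picture reading metadata out of such a widget tree is a depth-first walk that pairs each label with the text that follows it. The sketch below is illustrative only: the `Node` class, the role names, and the pairing rule are assumptions, not the interface of any actual accessibility toolkit or of TrapperKeeper itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical accessibility-tree node: a role, optional text, children."""
    role: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield nodes depth-first, in display order."""
    yield node
    for child in node.children:
        yield from walk(child)


def label_value_pairs(root: Node) -> Dict[str, str]:
    """Pair each label with the next text node encountered after it."""
    pairs: Dict[str, str] = {}
    pending: Optional[str] = None
    for node in walk(root):
        if node.role == "label":
            pending = node.text
        elif node.role == "text" and pending is not None:
            pairs[pending] = node.text
            pending = None
    return pairs


# Example tree using the values shown on the slide.
example = Node("window", children=[
    Node("tab", children=[
        Node("pane", children=[
            Node("label", "Date Time Original"),
            Node("text", "2007:11:22 19:21:14"),
        ]),
    ]),
])
```

Calling `label_value_pairs(example)` yields a single attribute, "Date Time Original", mapped to the timestamp string from the slide.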
Guided Extraction
Execute Features
- Snapshot window in VM
metadata Preview Metadata System
TrapperKeeper Results
- Makes it easy to extract metadata
– No development skill
– No source code
– Just be able to use the application
- Successful use
– All GUI apps in Ubuntu 7.10 in a day
– Parses over 100 file types
– Rate of 318 files/hour
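To put the reported rate in perspective, a quick back-of-the-envelope conversion helps. The only figure taken from the slide is 318 files/hour; the batch size of 1,000 files is a hypothetical example, and the calculation assumes the rate stays constant with files parsed one at a time.

```python
FILES_PER_HOUR = 318  # rate reported on the slide


def seconds_per_file(rate: float = FILES_PER_HOUR) -> float:
    """Average time spent on one file at the given hourly rate."""
    return 3600 / rate


def hours_for(n_files: int, rate: float = FILES_PER_HOUR) -> float:
    """Time to work through a batch, assuming a constant serial rate."""
    return n_files / rate


print(f"{seconds_per_file():.1f} s per file")         # about 11.3 s
print(f"{hours_for(1000):.2f} h for 1000 files")      # about 3.14 h
```

At roughly 11 seconds per file, a batch of a thousand files would take a little over three hours under these assumptions.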
Tricky Situations
- Application has no accessibility support
- Application does not expose metadata
- Application needs external info to parse
– Configuration files
– License servers
– Internet connections
Tricky Situations
- Performance: a sudden influx of files
– Fresh installation
– Download from digital camera
- Which metadata is the right metadata?