Cross-checking Semantic Correctness: The Case of Finding File System Bugs
Changwoo Min, Sanidhya Kashyap, Byoungyoung Lee, Chengyu Song, Taesoo Kim
Georgia Institute of Technology, School of Computer Science

Two promising approaches to make bug-free software
● Formal proof → requires a “proof”
  – Guarantees high-level invariants (e.g., functional correctness)
● Model checking → requires a “model”
  – Checks whether code fits a domain model (e.g., locking rules)
In practice, much software is (already) built without such theories

There exist many similar implementations of a program
● File systems: >50 implementations in Linux
● JavaScript: ECMAScript, V8, SpiderMonkey, etc.
● POSIX C Library: GNU Libc, FreeBSD, eLibc, etc.
Without a proof or model, can we leverage these existing implementations?

File system bugs are critical
[Images dated 2013-01-07, 2014-10-17, and 2015-03-19]

A majority of bugs in file systems are hard to detect
● Memory bugs: 12.6%
  – NULL dereference
  – Use-after-free
  – ...
● Semantic bugs: 87.4%
  – Incorrect condition check
  – Incorrect state update
  – Incorrect argument
  – Incorrect error code
  – ...

Example of semantic bug: Missing capability check in OCFS2
    ocfs2: trusted xattr missing CAP_SYS_ADMIN check
    Signed-off-by: Sanidhya Kashyap <sanidhya@gatech.edu>
    ...
    @@ static size_t ocfs2_xattr_trusted_list
    +       if (!capable(CAP_SYS_ADMIN))
    +               return 0;
Can we find this bug by leveraging other implementations?

A majority of file systems already implement the capability check
● ext2
    static size_t ext2_xattr_trusted_list()
        if (!capable(CAP_SYS_ADMIN))
            return 0;
● ext4
    static size_t ext4_xattr_trusted_list()
        if (!capable(CAP_SYS_ADMIN))
            return 0;
● XFS
    static size_t xfs_xattr_put_listent()
        if ((flags & XFS_ATTR_ROOT) && !capable(CAP_SYS_ADMIN))
            return 0;
● ...
Deviant implementation → potential bugs?
    ocfs2: trusted xattr missing CAP_SYS_ADMIN check
    Signed-off-by: Sanidhya Kashyap <sanidhya@gatech.edu>
    ...
    @@ static size_t ocfs2_xattr_trusted_list
    +       if (!capable(CAP_SYS_ADMIN))
    +               return 0;
A new bug we found. It has been hidden for 6 years.
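
The effect of the missing check can be illustrated in user space. The sketch below is a mock, not kernel code: `set_mock_admin()`, `mock_capable_sys_admin()`, and both list functions are hypothetical stand-ins, and the length computation is simplified. It only shows that the unchecked variant reports a `trusted.` entry to an unprivileged caller, while the checked variant returns 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's capable(CAP_SYS_ADMIN). */
static bool mock_admin = false;

void set_mock_admin(bool is_admin) { mock_admin = is_admin; }
bool mock_capable_sys_admin(void) { return mock_admin; }

/* Length of "trusted.<name>" plus the terminating NUL (simplified). */
static size_t trusted_entry_len(const char *name)
{
    return strlen("trusted.") + strlen(name) + 1;
}

/* Majority pattern (ext2/ext4 on the slide): hide the entry
 * when the caller lacks the capability. */
size_t checked_trusted_list(const char *name)
{
    if (!mock_capable_sys_admin())
        return 0;
    return trusted_entry_len(name);
}

/* Deviant pattern (OCFS2 before the patch): no capability check. */
size_t unchecked_trusted_list(const char *name)
{
    return trusted_entry_len(name);
}
```

With `set_mock_admin(false)`, `checked_trusted_list("key")` returns 0, whereas `unchecked_trusted_list("key")` still returns the entry length.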

Case study: Write a page
● Each file system defines how to write a page
● Semantics of writepage()
  – Success → return locked page
  – Failure → return unlocked page
● Documentation/filesystems/vfs.txt specifies such a rule
  – Hard to detect without domain knowledge
What if 99% of file systems follow the above pattern, but one file system does not? → bug?

Our approach can reveal such bugs without domain-specific knowledge
● 52 file systems follow the locking rules
● But 2 file systems don't (Ceph and AFFS)
    --- fs/ceph/addr.c ---
    index fd5599d..e723482 100644
    @@ static int ceph_write_begin
    +       if (r < 0)
    +               page_cache_release(page);
    +       else
    +               *pagep = page;
We found 3 bugs in 2 file systems. Hidden for over 5 years.
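
The page rule can be read as a postcondition on each implementation. The following user-space sketch makes that concrete. Everything in it is hypothetical (`struct mock_page`, `write_begin_ok`, `write_begin_deviant`, `obeys_page_rule`), and the parameter `r` simply stands for the result of an inner step, as in the patch. It checks the rule dynamically, which is only an illustration: the approach on these slides compares implementations statically.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page-cache page. */
struct mock_page {
    int  refcount;  /* references currently held */
    bool locked;    /* whether the page lock is held */
};

typedef int (*write_begin_fn)(struct mock_page *, int, struct mock_page **);

/* Follows the rule: hand back a locked page on success,
 * unlock and release it on failure. */
int write_begin_ok(struct mock_page *page, int r, struct mock_page **pagep)
{
    page->refcount++;
    page->locked = true;
    if (r < 0) {
        page->locked = false;
        page->refcount--;
    } else {
        *pagep = page;
    }
    return r;
}

/* Deviant: ignores the failure case, so the page stays locked and referenced. */
int write_begin_deviant(struct mock_page *page, int r, struct mock_page **pagep)
{
    page->refcount++;
    page->locked = true;
    *pagep = page;
    return r;
}

/* True if fn leaves the page in the state the rule requires for result r. */
bool obeys_page_rule(write_begin_fn fn, int r)
{
    struct mock_page page = { 0, false };
    struct mock_page *out = NULL;
    int ret = fn(&page, r, &out);

    if (ret < 0)
        return !page.locked && page.refcount == 0;
    return page.locked && out == &page;
}
```

Both variants pass on the success path; only the failure path separates them, which is part of why such bugs stay hidden.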

Our approach in finding bugs
Intuition: Bugs are rare → the majority of implementations are correct
Idea: Find deviant ones as potential bugs
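
As a rough sketch of this idea, the code below applies a plain majority vote to one yes/no observation per file system (for example, "checks CAP_SYS_ADMIN on this path"). The vote is a deliberate simplification and should not be read as the statistical comparison Juxta performs. `struct observation`, `majority_behavior`, and `find_deviants` are names invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* One observation per implementation: does it show the behavior? */
struct observation {
    const char *fs_name;
    bool        has_behavior;
};

/* Returns the behavior shared by the majority (a tie counts as "true"). */
bool majority_behavior(const struct observation *obs, size_t n)
{
    size_t yes = 0;
    for (size_t i = 0; i < n; i++)
        if (obs[i].has_behavior)
            yes++;
    return 2 * yes >= n;
}

/* Writes the indices of deviant implementations to out (capacity n)
 * and returns how many were found. */
size_t find_deviants(const struct observation *obs, size_t n, size_t *out)
{
    bool common = majority_behavior(obs, n);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (obs[i].has_behavior != common)
            out[count++] = i;
    return count;
}
```

Given observations `{ext2: yes, ext4: yes, xfs: yes, ocfs2: no}`, `find_deviants` reports only the index of `ocfs2`. A flagged entry is a candidate for review, not a confirmed bug.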

Our approach is promising in finding semantic bugs (Example: file systems)
● New semantic bugs
  – 118 new bugs in 54 file systems
● Critical bugs
  – System crash, data corruption, deadlock, etc.
● Bugs difficult to find
  – Bugs were hidden for 6.2 years on average
● Various kinds of bugs
  – Condition check, argument use, return value, locking, etc.

Technical challenges
● All software is different in one way or another
  – e.g., disk layout in file systems
● How to compare different implementations?
  – Q1: Where to start?
  – Q2: What to compare?
  – Q3: How to compare?

Juxta: the case of file systems
● All file systems should follow the VFS API in Linux
  – e.g., vfs_rename() in each file system
● How to compare different file systems?
  – Q1: Where to start? → VFS entries in file systems
  – Q2: What to compare? → symbolic environment
  – Q3: How to compare? → statistical comparison

Juxta overview
[Figure: file system source code → symbolic execution → per-filesystem path database → statistical path comparison → 7 checkers (path condition checker, argument checker, ...)]

Comparing multiple file systems
● Q1: Where to start?
  – Identifying semantically similar entry points
● Q2: What to compare?
  – Building per-path symbolic environment
● Q3: How to compare?
  – Statistically comparing each path

Identifying semantically similar entry points
● Linux Virtual File System (VFS)
  – Uses common data structures and behavior (e.g., inode and page cache)
  – Defines filesystem-specific interfaces (e.g., open, rename)

Example: inode_operations → rename()
    struct inode_operations {
        int (*rename) (struct inode *, ...);
        int (*create) (struct inode *, ...);
        int (*unlink) (struct inode *, ...);
        int (*mkdir) (struct inode *, ...);
    };
● Implementations of rename(): btrfs_rename(...), ext4_rename(...), xfs_vn_rename(...), ...
Compare *_rename() to find deviant rename() implementations.
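
The reason an operations table gives a natural starting point can be sketched with a small mock. Functions stored in the same slot of the table serve the same purpose, so they can be collected and compared. The types and file systems below (`struct mock_inode_ops`, `foo`, `bar`, `count_entry_points`) are invented for illustration and are far simpler than the kernel's `struct inode_operations`.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct inode_operations. */
struct mock_inode_ops {
    int (*rename)(int flags);
    int (*create)(int flags);
};

struct mock_fs {
    const char *name;
    const struct mock_inode_ops *ops;
};

static int foo_rename(int flags) { return flags ? -1 : 0; }
static int bar_rename(int flags) { (void)flags; return 0; }
static int bar_create(int flags) { (void)flags; return 0; }

/* foo leaves .create unset, so that slot is NULL. */
static const struct mock_inode_ops foo_ops = { .rename = foo_rename };
static const struct mock_inode_ops bar_ops = {
    .rename = bar_rename,
    .create = bar_create,
};

const struct mock_fs mock_filesystems[] = {
    { "foo", &foo_ops },
    { "bar", &bar_ops },
};
const size_t mock_fs_count =
    sizeof(mock_filesystems) / sizeof(mock_filesystems[0]);

/* Counts the file systems that fill in the named slot of the table. */
size_t count_entry_points(const char *op)
{
    size_t n = 0;
    for (size_t i = 0; i < mock_fs_count; i++) {
        const struct mock_inode_ops *ops = mock_filesystems[i].ops;
        if (strcmp(op, "rename") == 0 && ops->rename)
            n++;
        else if (strcmp(op, "create") == 0 && ops->create)
            n++;
    }
    return n;
}
```

Every function reachable through the `rename` slot is a candidate for the same comparison group, regardless of its file-system-specific name.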

Building per-path symbolic environment
● Context/flow-sensitive symbolic execution
  – C language level
  – Build symbolic environment per path (e.g., path conditions, return values, side effects, function calls)
● Key idea: return-oriented comparison
  – Error codes represent per-path semantics (e.g., comparing all paths returning EACCES in rename() implementations)

Example: Per-path symbolic environment
Execution Path Information
    int foo_rename(int flag) {
        if (flag == RO)
            return -EACCES;
        inode->flag = flag;
        kmalloc(…, GFP_NOFS);
        return SUCCESS;
    }
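
One way to picture the per-path information for `foo_rename()` is as one record per execution path, which can then be selected by return value. The record layout below is an assumption made for this sketch (`struct path_info` and `paths_returning` are hypothetical names, and `SUCCESS` is taken to be 0). It is not the format Juxta stores in its path database.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical record of what one execution path does. */
struct path_info {
    const char *func;        /* entry point the path belongs to */
    int         retval;      /* value returned on this path */
    const char *condition;   /* path condition, kept as text */
    const char *side_effect; /* visible assignment, or NULL */
    const char *call;        /* callee with notable arguments, or NULL */
};

/* The two paths through foo_rename(), with SUCCESS assumed to be 0. */
const struct path_info foo_rename_paths[] = {
    { "foo_rename", -EACCES, "flag == RO", NULL, NULL },
    { "foo_rename", 0, "flag != RO",
      "inode->flag = flag", "kmalloc(..., GFP_NOFS)" },
};

/* Return-oriented selection: stores pointers to the paths that return
 * retval in out (capacity n) and returns how many matched. */
size_t paths_returning(const struct path_info *paths, size_t n, int retval,
                       const struct path_info **out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (paths[i].retval == retval)
            out[count++] = &paths[i];
    return count;
}
```

Selecting the `-EACCES` paths of every `*_rename()` implementation in this way yields groups whose conditions, side effects, and calls can be compared against each other.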
