

SLIDE 1

Intro Software streams Case studies Conclusions

Software Streams Big Data Challenges in Dynamic Program Analysis

Irene Finocchi

  • Dept. Computer Science – Sapienza U. Rome

1 / 41 Irene Finocchi CiE 2013 special session on data streams and compression


SLIDE 4

Theory versus practice

Theory is when you know something, but it doesn’t work. Practice is when something works, but you don’t know why. Programmers combine theory and practice: nothing works, and they don’t know why.

(Anonymous)


SLIDE 6

Topic of the talk

Algorithm engineering talk: boosting practice with theory
  • Theory: data stream algorithmics
  • Application area: dynamic program analysis


SLIDE 8

Program analysis

Development of techniques and tools for analyzing the structure and the behavior of a software system

Goals:
  • conclude properties about the program: e.g., correctness, resource consumption
  • seek opportunities for optimization
  • error detection and correction: e.g., type checking, memory safety, data structure repair, protection against security attacks
  • study how the program or its parts are used: e.g., usage patterns, intrusion detection
  • program understanding


SLIDE 10

Static vs. dynamic analysis

Static analysis: based on knowledge of code (source, object, ...)
  • Examples: compilers, formal verification systems, theoretical analysis of algorithms

Dynamic analysis: exploits information gathered at runtime
  • Examples: debuggers, memory checkers, performance profilers, platforms for the experimental evaluation of algorithms

SLIDE 11

Program analysis in algorithm engineering

SLIDE 12

Soundness vs. accuracy

Static analysis has been hugely successful in software design, but the dynamic nature of modern computing scenarios makes it increasingly inaccurate.

SLIDE 13

Program analysis community

Many disciplines involved: programming languages, SE, architectures, algorithms, statistics...

SLIDE 14

This talk: algorithmics for dynamic program analysis

Events of interest:
  • routine calls
  • memory accesses
  • low-level instructions
  • system calls
  • cache misses
  • interrupts
  • ...


SLIDE 18

What’s difficult?

  • Capturing events: hardware support (counters, watchpoints), programmable interrupts/signals, program instrumentation (source code or binary code)
  • Intrusiveness: Heisenberg effects (the act of observing a system causes the system to change)
  • Performance: analysis is inlined with program execution and slows down analyzed programs; real-time performance means billions of events per second
  • Massive data: dynamic analysis tools process huge amounts of data and cannot store all of it

SLIDE 19

Efficient algorithms can make a difference

Automated dynamic analysis is less explored than static analysis from an algorithmic perspective...

SLIDE 20

Software streams

SLIDE 21

An example: performance profiling

Form of dynamic program analysis that typically measures:
  • execution time of instructions, basic blocks, routines
  • frequency of portions of code

Our goal: identify routines that contribute most to the running time (hot routines). Mainly useful for performance optimization.


SLIDE 24

Profiler characteristics

Granularity
  • Basic blocks
  • Routines

Metrics
  • Time
  • Number of routine calls
  • Cache misses, I/Os, ...

Data aggregation level
  • Vertex: how many times is routine f called?
  • Edge: how many times is f called from g?
  • Calling context: how many times is f called along path main → g → h → f?


SLIDE 26

Vertex vs. calling context profiling

Vertex profiling:
  • Stream Σ = ⟨main, g, h, f, ...⟩ = ⟨f1, f2, ..., fn⟩
  • Item universe: fi ∈ {routines}
  • Query: find most frequently called routines

Calling context profiling:
  • Stream Σ = ⟨main, main → h, main → g → h, ...⟩ = ⟨π1, π2, ..., πn⟩
  • Item universe: πi ∈ ∪_{j≥1} {routines}^j
  • Query: find most frequent calling contexts

15 / 41 Irene Finocchi CiE 2013 special session on data streams and compression
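The two streams above can be derived from a single trace of call/return events by maintaining the current call stack. A minimal sketch (the trace format and function name are illustrative, not from the talk):

```python
# Derive the vertex stream and the calling-context stream from one
# trace of ("call", routine) / ("ret", None) events.

def software_streams(trace):
    stack = []
    vertex_stream, context_stream = [], []
    for event, routine in trace:
        if event == "call":
            stack.append(routine)
            vertex_stream.append(routine)        # item f_i in {routines}
            context_stream.append(tuple(stack))  # item pi_i: path main -> ... -> f_i
        else:  # "ret"
            stack.pop()
    return vertex_stream, context_stream

trace = [("call", "main"), ("call", "g"), ("call", "h"),
         ("ret", None), ("ret", None), ("call", "h"), ("ret", None)]
vs, cs = software_streams(trace)
# vs == ["main", "g", "h", "h"]
# cs == [("main",), ("main", "g"), ("main", "g", "h"), ("main", "h")]
```

Note that the two calls to h produce the same vertex item but two distinct calling-context items, which is exactly why the context universe is unbounded.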


SLIDE 28

Conventional approaches

Keep complete profiling info about vertices or paths.

Vertex profiling: hash table
  • Space required = Θ(number of distinct routines in Σ)

Calling context profiling: calling context tree (CCT)
  • Space required = Θ(number of CCT nodes) = Θ(number of distinct call paths in Σ)

16 / 41 Irene Finocchi CiE 2013 special session on data streams and compression
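A calling context tree is essentially a trie over call paths, built online. A minimal sketch (node layout and trace format are illustrative):

```python
# A minimal calling context tree (CCT): one node per distinct call path,
# with a counter for how many times that exact context occurred.

class CCTNode:
    __slots__ = ("routine", "count", "children")
    def __init__(self, routine):
        self.routine = routine
        self.count = 0
        self.children = {}          # routine name -> CCTNode

def build_cct(trace):
    root = CCTNode("<root>")
    cursor = [root]                 # path from the root to the current context
    for event, routine in trace:
        if event == "call":
            node = cursor[-1].children.setdefault(routine, CCTNode(routine))
            node.count += 1
            cursor.append(node)
        else:                       # "ret"
            cursor.pop()
    return root

trace = [("call", "main"), ("call", "h"), ("ret", None),
         ("call", "g"), ("call", "h"), ("ret", None), ("ret", None), ("ret", None)]
cct = build_cct(trace)
# main -> h and main -> g -> h are distinct CCT nodes, each with count 1
```

The space cost stated above is visible here: the tree grows with the number of distinct call paths, not with the number of distinct routines.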

SLIDE 29

How much space?

Application   |Call graph|   |Call sites|        |CCT|            |Σ|
amarok             13 754       113 362      13 794 470     991 112 563
audacity            6 895        79 656      13 131 115     924 534 168
bluefish            5 211        64 239       7 274 132     248 162 281
dolphin            10 744        84 152      11 667 974     390 134 028
firefox             6 756       145 883      30 294 063     625 133 218
gedit               5 063        57 774       4 183 946     407 906 721
gimp                5 146        93 372      26 107 261     805 947 134
sudoku              5 340        49 885       2 794 177     325 944 813
inkscape            6 454        89 590      13 896 175     675 915 815
oocalc             30 807       394 913      48 310 585     551 472 065
pidgin              7 195        80 028      10 743 073     404 787 763
quanta             13 263       113 850      27 426 654     602 409 403

Runs of a few minutes of real applications produce Gigabytes of information. Storing the CCT requires hundreds of Megabytes.

SLIDE 30

Skewness

SLIDE 31

Pareto principle (80-20 rule)

[Figure: cumulative frequency distributions — cumulative frequency relative to the total number of calls vs. % of hot contexts sorted by rank, for audacity, audacity (startup only), bzip2, gimp, gnome-dictionary, inkscape]

SLIDE 32

Patterns

Execution traces typically contain:
  • several event repetitions, either contiguous or not
  • a very large number of patterns
  • patterns that can each have thousands of occurrences

Data mining, pattern detection, and compression techniques are very useful to understand the characteristics of execution traces.

SLIDE 33

Case studies

SLIDE 34

Mining hot calling contexts space-efficiently

Keep information about hot contexts only: discard on the fly info about contexts with low frequency.

[D’Elia, Demetrescu & F., PLDI 2011]

SLIDE 35

Hot calling context tree

The CCT unfolds during program execution. How do we prune it on-line (to get the HCCT)?

SLIDE 36

The Britney Spears problem...

SLIDE 37

... tracking who’s hot and who’s not

“... can’t just pay attention to a few popular subjects, because you can’t know in advance which ones are going to rank near the top. To be certain of catching every new trend as it unfolds, you have to monitor all the incoming queries – and their variety is unbounded.”


SLIDE 42

Heavy hitters

Given a stream of n items, find those that appear “most frequently” (e.g., items occurring more than 1% of the time). Formally “hard” in small space, so allow approximation:
  • No false negatives: return all items with count ≥ ϕn
  • “Good” false positives: no item with count < (ϕ − ε)n is returned (error ε ∈ (0, 1), ε ≪ ϕ)

Related problem: estimate each frequency with error ±εn


SLIDE 45

A well-studied problem

Core streaming problem: connections with entropy estimation, itemset mining, compressed sensing. Extensive research: scores of streaming papers on frequent items and its variations.

Two approaches:

1. Sketch-based
  • Maintain a sketch of the whole data set
  • Estimate frequency of both frequent and non-frequent items

2. Counter-based
  • Maintain estimated counters of frequent items only
  • Work very well on skewed input distributions

SLIDE 46

(Some) counter-based algorithms

1. Sticky sampling [Gibbons & Matias, SIGMOD 1998 – Manku & Motwani, VLDB 2002]
  • probabilistic, sampling-based approach
  • correct with probability ≥ 1 − δ, with δ ∈ (0, 1) a user-specified probability of failure
  • space O((1/ε) · log(1/(ϕδ)))

2. Lossy counting [Manku & Motwani, VLDB 2002]
  • deterministic
  • space O((1/ε) · log(εn))

3. Space saving [Metwally, Agrawal & El Abbadi, ACM TODS 2006]
  • deterministic
  • space O(1/ε) (provably optimal)

28 / 41 Irene Finocchi CiE 2013 special session on data streams and compression
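Space Saving fits in a few lines. A sketch with k = ⌈1/ε⌉ counters (variable names are illustrative): each monitored item’s counter overestimates its true frequency by at most n/k ≤ εn, which gives exactly the no-false-negative / good-false-positive guarantee above.

```python
import math

def space_saving(stream, eps):
    """Space Saving: maintain k = ceil(1/eps) counters over the stream."""
    k = math.ceil(1 / eps)
    counters = {}                       # item -> estimated count
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k:
            counters[x] = 1
        else:
            # Evict the item with the minimum counter; the newcomer
            # inherits (and increments) that count.
            victim = min(counters, key=counters.get)
            counters[x] = counters.pop(victim) + 1
    return counters

stream = ["f"] * 60 + ["g"] * 25 + ["h"] * 10 + ["a", "b", "c", "d", "e"]
est = space_saving(stream, eps=0.1)    # k = 10 counters, n = 100
# Every item with true count >= eps*n = 10 ("f", "g", "h") is monitored,
# and its estimate is within eps*n of the true count.
```

The min-search here is linear for clarity; the original paper’s “stream summary” structure makes eviction O(1), which matters at the event rates quoted earlier.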


SLIDE 49

Back to hot calling contexts

Maintain a set M of monitored calling contexts. Upon query, return a subset A ⊆ M: A = {(ϕ, ε)-heavy hitters}
  • All true hot contexts are returned: H ⊆ A (no false negatives)
  • False positives are “good”

[Figure: nested sets — CCT (all contexts) ⊇ monitored M ⊇ (ε,ϕ)-heavy hitters A ⊇ true hot H, with A \ H the false positives; the HCCT and the (ε,ϕ)-HCCT are the corresponding trees]

The CCT induces a tree structure over sets H, A, M.


SLIDE 51

(ϕ, ε)-hot calling context tree

[Figure: example trees (a)–(c) with per-node counts, false positives marked]

(a) CCT: entire calling context tree
(b) HCCT: hot calling context tree — hot nodes plus cold internal nodes
(c) (ϕ, ε)-HCCT: (ϕ, ε)-hot calling context tree — hot nodes, cold internal nodes, and “almost hot” leaves (false positives)

33 / 41 Irene Finocchi CiE 2013 special session on data streams and compression
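The HCCT in (b) keeps every hot node plus any cold ancestor needed to connect a hot descendant to the root. An offline pruning sketch (the node layout is illustrative, not the paper’s data structure):

```python
# Offline HCCT pruning: keep nodes whose count meets the threshold,
# plus cold internal nodes that have a hot descendant.

class Node:
    def __init__(self, routine, count=0):
        self.routine, self.count, self.children = routine, count, {}

def prune_hcct(node, threshold):
    """Return a pruned copy of the subtree, or None if it contains no hot node."""
    kept = {}
    for name, child in node.children.items():
        p = prune_hcct(child, threshold)
        if p is not None:
            kept[name] = p
    if node.count >= threshold or kept:            # hot, or cold internal node
        copy = Node(node.routine, node.count)
        copy.children = kept
        return copy
    return None

# Example: main(1) -> g(100); main -> h(2) -> f(50); threshold 40
root = Node("main", 1)
root.children["g"] = Node("g", 100)
root.children["h"] = Node("h", 2)
root.children["h"].children["f"] = Node("f", 50)
hcct = prune_hcct(root, 40)
# g and f are kept as hot nodes; h survives only as a cold internal node
```

The on-line algorithm of the PLDI 2011 paper cannot afford a post-hoc pass like this; it combines the tree with a counter-based heavy-hitter algorithm, which is where the (ϕ, ε)-HCCT and its “almost hot” leaves come from.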

SLIDE 52

How many false positives?

Lossy Counting (left bars) versus Space Saving (right bars)

[Figure: classification of (φ,ε)-HCCT nodes — cold nodes / hot nodes / false positives (%) across benchmarks (amarok, ark, audacity, bluefish, dolphin, firefox, gedit, ghex2, gimp, sudoku, gwenview, inkscape, oocalc, ooimpress, oowriter, pidgin, quanta, vlc)]


SLIDE 56

Parameter tuning

Rule of thumb: ε = ϕ/10 [Cormode and Hadjieleftheriou, PVLDB 2008]

ε = ϕ/5 works great here due to distribution skewness. What about ϕ?

Benchmark   HCCT nodes (φ = 10⁻³)   HCCT nodes (φ = 10⁻⁵)   HCCT nodes (φ = 10⁻⁷)
audacity             112                  9 181                  233 362
dolphin               97                 14 563                  978 544
gimp                  96                 15 330                  963 708
inkscape              80                 16 713                  830 191
oocalc               136                 13 414                1 339 752
quanta                94                 13 881                  812 098

SLIDE 57

Space analysis

Space Saving (LSS) vs. Lossy Counting (LC): ϕ = 10⁻⁴, ε = ϕ/5

SLIDE 58

Counter accuracy

Space Saving (LSS) vs. Lossy Counting (LC): ϕ = 10⁻⁴, ε = ϕ/5

[Figure: avg/max counter error among hot elements (% of the true frequency) across the benchmarks — LSS avg error, LC avg error, LSS max error, LC max error]

SLIDE 59

Other applications


SLIDE 62

Range adaptive profiling

Assume, e.g., that we want to profile lines of code: if 90% of the time is spent on the top half of the code, fine-grained profile data on the bottom half would not be very useful.

Output profile data in a hierarchical fashion, grouping data into ranges [Mysore et al., CGO 2006]:
  • most frequent ranges broken down into subranges
  • least frequent events kept as larger ranges

Adaptive spatial partitioning (ranges and their counters stored in a tree) [Hershberger et al., Algorithmica 2006]:
  • when a range gets sufficiently hot, the corresponding tree node is split into subranges
  • ranges that get colder are merged together, pruning the tree

39 / 41 Irene Finocchi CiE 2013 special session on data streams and compression
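A toy sketch of the split-on-hot idea (not the exact algorithms of the cited papers; the class, thresholds, and line numbers are all illustrative): counts over line-number ranges live in a binary tree, and a range that gets hot is subdivided so it is profiled at finer granularity.

```python
# Counts over half-open line-number ranges [lo, hi); a hot leaf splits.

class RangeNode:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.count = 0
        self.left = self.right = None

    def record(self, line, split_threshold):
        self.count += 1
        if self.left is not None:                  # already split: descend
            child = self.left if line < self.left.hi else self.right
            child.record(line, split_threshold)
        elif self.count >= split_threshold and self.hi - self.lo > 1:
            mid = (self.lo + self.hi) // 2         # hot leaf: subdivide it
            self.left, self.right = RangeNode(self.lo, mid), RangeNode(mid, self.hi)

root = RangeNode(0, 1024)
for line in [10] * 100 + [900] * 3:    # a hot spot near line 10, little elsewhere
    root.record(line, split_threshold=50)
# The subtree around line 10 keeps splitting, while the range
# containing line 900 stays coarse.
```

This captures the first bullet of each cited scheme; merging cooling ranges back together (the pruning step of adaptive spatial partitioning) is omitted for brevity.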


SLIDE 66

Uniform sampling over software streams

Sampling is very popular to reduce the size of execution traces and the runtime overhead of dynamic analysis tools.

Main approach in state-of-the-art profilers: fixed rate sampling (e.g., take one item every 10 ms)
  • Pros: easy to implement
  • Cons: produces biased samples when the original trace exhibits regular patterns

To avoid bias and get representative samples, we need a uniform sampling probability [Mytkowicz et al., PLDI 2010]. Randomized approaches (e.g., reservoir sampling [Vitter, ACM TMS 1985]) address these issues [Coppa et al., 2013].

40 / 41 Irene Finocchi CiE 2013 special session on data streams and compression
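Reservoir sampling (Vitter’s Algorithm R) keeps a uniform random sample of k items from a stream of unknown length in one pass and O(k) space, which is exactly what a software stream requires:

```python
import random

def reservoir_sample(stream, k, rng=random):
    """Algorithm R: after processing n items, each item is in the
    reservoir with probability k/n."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Item i enters with probability k/(i+1), evicting a
            # uniformly chosen reservoir slot.
            j = rng.randrange(i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir

sample = reservoir_sample(range(1_000_000), k=10)
```

Unlike fixed-rate sampling, the inclusion probability does not depend on an item’s position, so periodic patterns in the trace cannot bias the sample.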

SLIDE 67

Conclusions

Dynamic program analysis:
  • its data-intensive nature makes it a great source of algorithmic problems
  • a lot of fun with algorithms, systems, and architectures
  • automated analysis provides valuable tools in algorithm engineering

Challenges: analysis of programs on multi-core platforms, big data applications, and resource-constrained systems.

41 / 41 Irene Finocchi CiE 2013 special session on data streams and compression