Analysis and Optimization for Processing Grid-Scale XML Datasets

Michael R. Head
Ph.D. Candidate
Grid Computing Research Laboratory
Department of Computer Science
Binghamton University
mike@cs.binghamton.edu
Tuesday, May 12, 2009

Outline
  1 Introduction and Motivation
      XML and SOAP
      Ubiquity of Multi-processing Capabilities
      Contributions
  2 SOAP and XML Benchmarks
      SOAPBench
      XMLBench
  3 Parallel XML
      Investigating System Cache Effects
      Piximal: Parallel Approach for Processing XML
  4 Related Work
  5 Conclusions and Future Work

XML Defined
  Text based (usually UTF-8 encoded)
  Tree structured
  Language independent
  Generalized data format

  Example document:

    <?xml version="1.0" encoding="UTF-8"?>
    <ns1:MoleculeType xsd:type="ns1:MoleculeType"
        xmlns:ns1="http://nbcr.sdsc.edu/chemistry/types"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <moleculeName xsi:type="xsd:string"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        1kzk
      </moleculeName>
      <moleculeRadius xsi:type="xsd:double" xsi:nil="true"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
      <atom xsi:type="ns1:AtomType"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <fieldName xsi:type="ns1:FieldNameType">ATOM</fieldName>
        ...
      </atom>
      <atom xsi:type="ns1:AtomType" ...
      </atom>
      ...
    </ns1:MoleculeType>
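The "tree structured" property of the MoleculeType example can be seen by parsing it and walking the result. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The document in DOC is a shortened, completed stand-in for the slide's example (the elided parts of the original are not reconstructed), and the helper name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Shortened stand-in for the slide's example; the elided "..." parts of the
# original are NOT reconstructed here, so this is an assumption-laden sample.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<ns1:MoleculeType xmlns:ns1="http://nbcr.sdsc.edu/chemistry/types"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <moleculeName xsi:type="xsd:string">1kzk</moleculeName>
  <moleculeRadius xsi:type="xsd:double" xsi:nil="true"/>
  <atom xsi:type="ns1:AtomType">
    <fieldName xsi:type="ns1:FieldNameType">ATOM</fieldName>
  </atom>
</ns1:MoleculeType>
"""


def outline(elem, depth=0):
    """Return (depth, tag) pairs for elem and its descendants in document order."""
    rows = [(depth, elem.tag)]
    for child in elem:
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(DOC)
rows = outline(root)

# Print the element tree, one element per line, indented by depth.
for depth, tag in rows:
    print("  " * depth + tag)

# Namespaced attributes are addressed as "{namespace-uri}local-name".
radius_is_nil = root.find("moleculeRadius").get(f"{{{XSI}}}nil") == "true"
print("moleculeRadius is nil:", radius_is_nil)
```

ElementTree expands the ns1 prefix into the full namespace URI, so the root tag is reported as {http://nbcr.sdsc.edu/chemistry/types}MoleculeType, while the unprefixed children keep their plain names.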

Motivation from SOAP
  Generalized RPC mechanism (supports other models, too)
  Broad industrial support
  Web Services on the Grid
    OGSA: Open Grid Services Architecture
    WSRF: Web Services Resource Framework
  At bottom, SOAP depends on XML

Importance of High Performance XML Processors
  Becoming standard for many scientific datasets
    HapMap - mapping genes
    Protein Sequencing
    NASA astronomical data
    Many more instances

Explosion of Data
  Enormous increase in data from sensors, satellites, experiments, and simulations
  Use of XML to store these data is also on the rise
  XML is in use in ways it was never really intended for (gigabyte-sized and larger files)

Benchmark Motivation
  Scientific applications place a wide range of requirements on the communication substrate and data formats.
  Simple and straightforward implementations can have a severe performance impact.

Prevalence of Parallel Machines
  All new high end and mid range CPUs for desktop- and laptop-class computers have at least two cores
  The future of AMD and Intel performance lies in increases in the number of cores
  Despite extant SMP machines, many classes of software applications remain single threaded
  Multi-threaded programming considered "hard"

XML and Multi-Core
  Most string parsing techniques rely on a serial scanning process
  Challenge: Existing (singly-threaded) XML parsers are already very efficient [Zhang et al 2006]

Contributions
  We present the design and implementation of a comprehensive benchmark suite for XML and SOAP implementations with standard mechanisms to quantify, compare, and evaluate the performance of each toolkit and study the strengths and weaknesses for a wide range of use case scenarios.
  We present an analysis of pre-fetching and piped implementation techniques that aim to offset disk I/O costs while processing large-scale XML datasets on multi-core CPU architectures.

Contributions Continued
  We propose techniques to modify the lexical analysis phase for processing large-scale XML datasets to leverage opportunities for parallelism. (Piximal)
  We present an analysis of the scalability that can be achieved with our proposed parallelization approach as the number of processing threads and size of XML-data is increased.
  We present an analysis on the usage of various states in the processing automaton to provide insights on why the performance varies for differently shaped input data files.
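To make the idea of parallelizing a serial lexical scan concrete, here is a minimal sketch of one generic technique: split the input into chunks, run a small automaton over every chunk from each possible start state, then stitch the per-chunk results together in order. This is an illustration under stated assumptions, not the Piximal implementation. The three-state automaton, the function names, and the choice to count tag openings are all hypothetical, and CPython threads are used only to show the decomposition (they will not give a real speedup for this CPU-bound loop).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical three-state automaton for XML-like text.
CONTENT, TAG, QUOTE = 0, 1, 2
STATES = (CONTENT, TAG, QUOTE)


def scan(chunk, state):
    """Run the automaton over chunk from `state`; return (end_state, tag_opens)."""
    opens = 0
    for ch in chunk:
        if state == CONTENT:
            if ch == "<":
                state = TAG
                opens += 1
        elif state == TAG:
            if ch == ">":
                state = CONTENT
            elif ch == '"':
                state = QUOTE
        else:  # QUOTE: inside a double-quoted attribute value
            if ch == '"':
                state = TAG
    return state, opens


def scan_all_states(chunk):
    """Speculatively scan chunk once per possible start state."""
    return {start: scan(chunk, start) for start in STATES}


def parallel_count(text, n_chunks):
    """Count tag openings by scanning chunks independently, then stitching."""
    size = max(1, -(-len(text) // n_chunks))  # ceiling division
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    with ThreadPoolExecutor() as pool:
        tables = list(pool.map(scan_all_states, chunks))
    # Sequential stitch: only the entry matching the true start state is used.
    state, total = CONTENT, 0
    for table in tables:
        state, opens = table[state]
        total += opens
    return total


sample = '<a x="1>2"><b>t</b></a>'
serial_total = scan(sample, CONTENT)[1]
print(serial_total, parallel_count(sample, 4))
```

The quoted ">" in the sample shows why a chunk cannot be scanned correctly without knowing its start state: the same characters mean different things in CONTENT, TAG, and QUOTE. Scanning from every start state costs extra work per chunk, which is the trade-off any scalability analysis of such an approach has to account for.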

Publications
  "A Benchmark Suite for SOAP-based Communication in Grid Web Services," in The Proceedings of Supercomputing 2005
  "Benchmarking XML Processors for Applications in Grid Web Services," in The Proceedings of Supercomputing 2006
  "Approaching a Parallelized XML Parser Optimized for Multi-Core Processors," in The Proceedings of SOCP 2007, workshop held in conjunction with HPDC 2007
  "Parallel Processing of Large-Scale XML-Based Application Documents on Multi-core Architectures with PiXiMaL," in The Proceedings e-Science 2008
  "Performance Enhancement with Speculative Execution Based Parallelism for Processing Large-scale XML-based Application Data," to appear in The Proceedings of HPDC 2009

Thesis Statement
  In this thesis we present a comprehensive benchmark suite that facilitates the study of the strengths and weaknesses of XML and SOAP toolkits for a wide range of use case scenarios.
  We propose a parallel processing model for some application-based large-scale XML datasets that can effectively leverage opportunities for parallelism in emerging multi-core CPU architectures.

SOAP Benchmark Suite
  Defines a set of operations to implement within a SOAP toolkit
  Tests both serialization and deserialization of a variety of data structures over a range of input sizes
    Simple types: integers, strings, and floats
    Base64 encoded data
    Complex types: event streams, mesh interface objects

XML Benchmark Suite
  1 A chosen set of XML documents
      Low level probes
      Application-based benchmarks
  2 A driver application for each XML processor
      Runs the parser on the input, but does not act on the data
      Eliminates application-level performance differences
      One for each interface style (SAX/DOM)
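The driver idea from the XML Benchmark Suite slide (run the parser on the input but do not act on the data, with one driver per interface style) can be sketched as follows. This is a hypothetical illustration built on Python's standard xml.sax and xml.dom.minidom modules, not the XMLBench code. The names run_sax, run_dom, time_driver, and make_document are assumptions, and the generated documents are synthetic stand-ins for a chosen document set.

```python
import time
import xml.sax
from xml.dom import minidom


def run_sax(data):
    """SAX-style driver: parse with a handler whose callbacks do nothing."""
    xml.sax.parseString(data, xml.sax.ContentHandler())


def run_dom(data):
    """DOM-style driver: build the tree, then discard it without reading it."""
    minidom.parseString(data)


def time_driver(driver, data, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        driver(data)
        best = min(best, time.perf_counter() - start)
    return best


def make_document(n_items):
    """Build a synthetic document with n_items small elements (assumed shape)."""
    body = "".join(f"<item id='{i}'>{i * 0.5}</item>" for i in range(n_items))
    return f"<items>{body}</items>".encode("utf-8")


# Time both interface styles over a small range of input sizes.
for n in (100, 1000, 10000):
    doc = make_document(n)
    print(n, len(doc), time_driver(run_sax, doc), time_driver(run_dom, doc))
```

Because neither driver touches the parsed data, any difference in the reported times comes from the parser and its interface style rather than from application logic, which is the point the slide makes about eliminating application-level performance differences.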
