Trip Report FINAL MEETING AND SUMMER SCHOOL OF DFG PRIORITY PROGRAM ALGORITHM ENGINEERING


  1. Trip Report FINAL MEETING AND SUMMER SCHOOL OF DFG PRIORITY PROGRAM ALGORITHM ENGINEERING

  2. DFG PP 1307: Algorithm Engineering
     DFG Priority Program: nationwide funding program over 6 years for up to 30 individual projects
     PP 1307: Algorithm Engineering
     • 28 research projects
     • 267 publications
     • 17 software projects, e.g.:
       • Multi-Core STL (MCSTL) – now gcc parallel mode
       • STL for Extra Large Datasets (STXXL)
     2014-10-27 TRIP REPORT: ALGORITHM ENGINEERING

  3. Recap: Algorithm Engineering
     1. realistic models: hardware and problem
     2. design: efficient, implementable algorithms
     3. analyze: beyond worst-case
     4. implement: with hardware peculiarities in mind
     5. experiment: repeatable, thorough interpretation
     “The distance between theory and practice is closer in theory than in practice” [Y. Matias (Google) in his invited talk at ESA ’12]

  4. Final Meeting (17.09.2014)
     9 talks, covering a wide range of topics
     ◦ route planning in road and public transport networks
     ◦ graph clustering and partitioning
     ◦ data compression
     ◦ linear and mixed integer optimization
     ◦ sequence analysis
     No Indico used, slides only partially available

  5. Summer School (18.-19.09.2014)
     Two days of lectures and hands-on sessions
     ◦ data compression (lecture only)
     ◦ linear and mixed integer optimization
     ◦ network analysis: graph clustering and partitioning
     ◦ shortest paths algorithms (lecture only)
     About 30 PhD students
     Lots of discussion among students and lecturers

  6. Selected Topics

  7. Network Analysis
     Networks are everywhere
     ◦ Computer networks
     ◦ Social networks
     ◦ …

  8. Network Analysis
     Network analysis is mainly concerned with complex networks
     ◦ Small diameter
     ◦ Varying degree distribution
     ◦ Lots of triangles

  9. Network Analysis
     GRAPH CLUSTERING
     ◦ Find (non-overlapping) internally dense, externally sparse subgraphs
     ◦ Unknown: number of subgraphs, their size
     ◦ Goals / Applications:
       o Uncover community structure (analysis, ...)
       o Prepartition network (distributed storage, ...)
     GRAPH PARTITIONING
     ◦ Partition vertex set into k (nearly) equally sized blocks
     ◦ Objective functions aim at small interfaces
     ◦ Applications:
       o Numerical simulations
       o Route planning
       o Distributed graph algorithms

  10. Network Analysis
      GRAPH CLUSTERING algorithms:
      ◦ Label propagation algorithm
      ◦ Louvain greedy method
      GRAPH PARTITIONING algorithms:
      ◦ Size-constrained label propagation
      ◦ Diffusion-based partitioning
      Many different metrics:
      ◦ Conductance
      ◦ Expansion
      ◦ Modularity
      ◦ …
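The label propagation algorithm and the modularity metric named on slide 10 can be illustrated with a minimal Python sketch. It is not taken from the talk or from NetworKit: the adjacency-dict graph format, the function names and the tie-breaking rule (keep the current label on ties, otherwise pick randomly) are assumptions made for this example, and `modularity` assumes an undirected graph with at least one edge.

```python
import random
from collections import Counter


def label_propagation(adj, max_rounds=100, seed=0):
    """Cluster an undirected graph given as {vertex: set_of_neighbours}.

    Every vertex starts with its own label and repeatedly adopts the label
    that is most frequent among its neighbours until nothing changes.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}
    vertices = list(adj)
    for _ in range(max_rounds):
        rng.shuffle(vertices)              # asynchronous updates in random order
        changed = False
        for v in vertices:
            if not adj[v]:
                continue                   # isolated vertex keeps its own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[v] in candidates:
                continue                   # assumption: keep current label on ties
            labels[v] = rng.choice(candidates)
            changed = True
        if not changed:
            break
    return labels


def modularity(adj, labels):
    """Modularity Q = sum over clusters c of e_c/m - (deg_c / 2m)^2."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2    # number of edges
    intra = Counter()                                  # e_c: edges inside cluster c
    degree = Counter()                                 # deg_c: total degree of cluster c
    for v, nbrs in adj.items():
        degree[labels[v]] += len(nbrs)
        for u in nbrs:
            if labels[u] == labels[v]:
                intra[labels[v]] += 0.5                # each edge is seen from both ends
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two disjoint triangles: the natural clustering is one cluster per triangle.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
    clusters = label_propagation(graph)
    print(clusters, modularity(graph, clusters))
```

On real complex networks the result depends on the visiting order, which is why the sketch takes a seed.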

  11. Network Analysis
      NetworKit:
      ◦ Toolkit developed during the project for network analysis – C++ with Python bindings
      ◦ Includes wide range of tools for graph analysis
      ◦ Excellent IPython notebook-based tutorial
      ◦ Includes algorithms proposed for evolving networks
      ◦ Analyze changing social networks – e.g. ITI email graph
      Interest for CERN:
      ◦ Community detection on the grid → planning of file transfers
      ◦ Track reconstruction → ongoing work

  12. Shortest Paths and Routing
      Problem: find shortest path between s and t in weighted graph G
      Algorithms:
      ◦ Dijkstra’s algorithm too slow for large graphs
      ◦ Manifold speedup techniques [survey]
      ◦ A*: search with Euclidean bounds (classic)
      ◦ ALT: A* search with landmarks, preprocessing computes distances to landmarks
      ◦ Contraction Hierarchies: introduce shortcuts between “important” vertices of the graph
      ◦ Hub Labeling: every vertex stores distance to several hubs, covering the graph
      ◦ Most techniques rely on (more or less) expensive pre-computations
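As a baseline for the speedup techniques listed on slide 12, the following minimal Python sketch shows Dijkstra's algorithm with an optional lower-bound function `h`. With `h` returning 0 everywhere it is plain Dijkstra; with a consistent lower bound on the remaining distance to t (Euclidean bounds, or landmark bounds as in ALT) it becomes an A* search. The graph format and all names are assumptions made for this example, and no preprocessing is shown.

```python
import heapq
from math import inf


def shortest_path(graph, s, t, h=lambda v: 0):
    """Return (distance, path) from s to t.

    graph: {u: [(v, weight), ...]} with non-negative weights.
    h:     lower bound on the distance from a vertex to t; must be
           consistent for the search to stay exact (assumption of this sketch).
    """
    dist = {s: 0}
    parent = {}
    heap = [(h(s), s)]                     # priority = distance so far + lower bound
    done = set()
    while heap:
        _, u = heapq.heappop(heap)
        if u in done:
            continue                       # stale heap entry
        done.add(u)
        if u == t:
            break                          # target settled, stop early
        for v, w in graph.get(u, []):
            nd = dist[u] + w
            if nd < dist.get(v, inf):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd + h(v), v))
    if t not in done:
        return inf, []                     # t is unreachable from s
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return dist[t], path[::-1]


if __name__ == "__main__":
    roads = {
        "s": [("a", 1), ("b", 4)],
        "a": [("b", 2), ("t", 6)],
        "b": [("t", 1)],
        "t": [],
    }
    print(shortest_path(roads, "s", "t"))
```

A tighter `h` makes the search settle fewer vertices before reaching t, which is the effect the landmark preprocessing of ALT aims for.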

  13. Shortest Paths and Routing
      Problem: user-defined cost functions render pre-computations futile
      Solution: three-stage processing [Delling et al. 2013]
      1. Metric-independent pre-processing (≈ hours)
         Recursively partition graph
         Generate arcs between entry and exit nodes to neighboring partitions
      2. Metric-dependent pre-processing (≈ seconds)
         Compute metric between all shortcut arcs
      3. Query (≈ μs)
         Find shortest path in contracted graph and unpack it in original one

  14. Shortest Paths and Routing
      Routing in public transport networks is a much harder problem
      ◦ Inherent time-dependence
      ◦ Solved using (potentially huge!) event-activity networks
      Interest for CERN:
      ◦ Grid tiers already define contraction hierarchy → examine actual data flows for missing/misplaced hubs

  15. Data Compression
      Problem: compress once, decompress many times
      Requirements:
      ◦ Compressed space and decompression time (trade-off)
      ◦ Compression time is not much of an issue
      Measurements on dataset MINGW (1 GB):
      Compressor   Compressed space (MB)   Decompression time (secs)
      Gzip         344                     5.5
      Lzma         188                     8.3
      Snappy       461                     0.9
      “Snappy is widely used inside Google, in everything from BigTable and MapReduce …”
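The space versus decompression-time trade-off on slide 15 can be measured with a small Python sketch like the one below. It uses only the standard library, so `zlib` stands in for Gzip (both use DEFLATE) and Snappy is left out; the sample data is a synthetic stand-in, not the MINGW dataset, so the numbers it prints are not comparable to those on the slide. The function names are assumptions made for this example.

```python
import lzma
import time
import zlib


def measure(name, compress, decompress, data, repeats=3):
    """Compress once, then time repeated decompression of the same blob."""
    blob = compress(data)
    start = time.perf_counter()
    for _ in range(repeats):
        restored = decompress(blob)
    seconds = (time.perf_counter() - start) / repeats
    if restored != data:
        raise ValueError(f"{name}: round trip failed")
    return {"name": name, "compressed_bytes": len(blob), "decompress_seconds": seconds}


def compare(data):
    """Run the same measurement for each compressor available in the stdlib."""
    return [
        measure("zlib", zlib.compress, zlib.decompress, data),
        measure("lzma", lzma.compress, lzma.decompress, data),
    ]


if __name__ == "__main__":
    sample = b"algorithm engineering " * 50_000   # synthetic data (assumption)
    for row in compare(sample):
        print(row)
```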

  16. Data Compression
      Reminder: Lempel-Ziv compression
      a a c a a c a b | c a a d a a a a c  →  <6,3> <0,d> <3,2> <11,3>
      (the part before | has already been compressed)
      Greedy approach only optimal if every pair takes constant space
      ◦ but variable number of bits required for distances → non-optimal
      Bit-optimal LZ parsing [Ferragina et al. 2013]
      ◦ Solve shortest path problem on DAG describing possible compression pairs
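The example on slide 16 can be checked in a few lines of Python. Reading each pair as <distance,length> and <0,c> as the literal character c, the four pairs reproduce the last nine characters once the first eight are known. The second function sketches the shortest-path view of parsing: node i stands for "the first i characters are encoded" and every possible pair is an edge weighted by its bit cost. The cost model (9 bits per literal, Elias-gamma codes for distance and length) is a hypothetical choice for illustration, and the naive edge enumeration is not the efficient algorithm of the cited paper.

```python
from math import inf


def lz_decode(pairs, prefix=""):
    """Decode LZ pairs: (0, c) is a literal, (distance, length) a back-reference."""
    out = list(prefix)
    for first, second in pairs:
        if first == 0:
            out.append(second)
        else:
            for _ in range(second):        # char by char, so overlapping copies work
                out.append(out[-first])
    return "".join(out)


def gamma_bits(x):
    """Length in bits of the Elias-gamma code of an integer x >= 1."""
    return 2 * x.bit_length() - 1


def pair_bits(first, second):
    """Hypothetical cost model: 9 bits per literal, gamma-coded copy pairs."""
    if first == 0:
        return 9
    return gamma_bits(first) + gamma_bits(second)


def bit_optimal_parse(text, cost=pair_bits):
    """Cheapest parsing of text as a shortest path from node 0 to node len(text)."""
    n = len(text)
    best = [0] + [inf] * n                 # best[j]: cheapest encoding of text[:j]
    back = [None] * (n + 1)                # back[j]: (previous node, pair used)
    for i in range(n):                     # nodes are already in topological order
        edges = [(i + 1, (0, text[i]))]    # literal edge
        for d in range(1, i + 1):          # copy edges for every distance d
            length = 0
            while i + length < n and text[i + length - d] == text[i + length]:
                length += 1
                edges.append((i + length, (d, length)))
        for j, pair in edges:
            c = best[i] + cost(*pair)
            if c < best[j]:
                best[j], back[j] = c, (i, pair)
    pairs, j = [], n
    while j > 0:                           # walk the back pointers to recover the path
        i, pair = back[j]
        pairs.append(pair)
        j = i
    return pairs[::-1], best[n]


if __name__ == "__main__":
    text = "aacaacabcaadaaaac"
    print(lz_decode([(6, 3), (0, "d"), (3, 2), (11, 3)], prefix="aacaacab") == text)
    print(bit_optimal_parse(text))
```

With a constant cost per pair the greedy longest-match parsing is already optimal; the DAG formulation only matters once distances and lengths take a variable number of bits.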

  17. Data Compression
      Bi-criteria Compression [Farruggia et al. 2014]:
      ◦ Space and decompression time as edge weights in the DAG
      ◦ Fix space constraint, search for lowest decompression time and vice versa

  18. Data Compression
      Different approach to compression: Burrows-Wheeler Transform [introduction]

  19. Data Compression
      Different approach to compression: Burrows-Wheeler Transform
      ◦ Yields smaller compressed size but longer decompression time
      ◦ Construction of BWT closely related to suffix-array construction
      ◦ Allows decompression of any substring
      FM index [Ferragina and Manzini 2000]
      ◦ Uses BWT and auxiliary data structures to answer count and locate queries on compressed text
      Interest for CERN:
      ◦ Compression of ROOT files + access of individual entries
      ◦ Compression of and search in dictionaries
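The Burrows-Wheeler Transform and the count query of an FM index from slide 19 can be sketched in Python as follows. This is a teaching version under stated assumptions: the transform sorts all rotations instead of building a suffix array, the `$` sentinel is assumed not to occur in the text, and `fm_count` recomputes occurrence counts with `str.count` where a real FM index would use precomputed auxiliary structures. All function names are chosen for this example.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def first_column_starts(last):
    """For each character, the row where its run starts in the sorted first column."""
    starts, total = {}, 0
    for c in sorted(set(last)):
        starts[c] = total
        total += last.count(c)
    return starts


def inverse_bwt(last, sentinel="$"):
    """Recover the text by following the LF-mapping backwards from the sentinel row."""
    starts = first_column_starts(last)
    seen, ranks = {}, []
    for c in last:                         # rank of each character among equal ones
        ranks.append(seen.get(c, 0))
        seen[c] = seen.get(c, 0) + 1
    row = last.index(sentinel)             # this row is the original text + sentinel
    out = []
    for _ in range(len(last) - 1):
        row = starts[last[row]] + ranks[row]
        out.append(last[row])              # characters come out right to left
    return "".join(reversed(out))


def fm_count(last, pattern):
    """Count occurrences of pattern using backward search on the BWT."""
    starts = first_column_starts(last)
    lo, hi = 0, len(last)                  # current range of matching rows
    for c in reversed(pattern):
        if c not in starts:
            return 0
        lo = starts[c] + last[:lo].count(c)
        hi = starts[c] + last[:hi].count(c)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed, inverse_bwt(transformed), fm_count(transformed, "ana"))
```

The transform groups equal characters into runs, which is what makes the output easier to compress; the backward search shows how count queries can be answered on the transformed text alone.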

  20. Miscellaneous
      Linear programming
      ◦ Disproof of the Hirsch conjecture poses a threat to the simplex method → still works well in practice
      ◦ Anecdote: interior point method patented by AT&T → patent circumvented by polar transformation of the problem and use of a barrier method
      SeqAn
      ◦ Package for analysis of (genome) sequences
      ◦ Developers face similar problems as HEP: bridging the gap between computer science and real-world problems
      External memory algorithms
      ◦ Flow computations for massive LiDAR terrain data sets
      ◦ General trick of time-forward processing to reduce I/O

  21. Conclusions
      ◦ Final meeting gave a good overview of the broad activity in DFG PP 1307 “Algorithm Engineering”
      ◦ Summer school expanded on four focus topics of the PP
      ◦ Similar research continues in DFG PP 1736 “Algorithms for Big Data”
        ◦ Funding period 2013-2019
        ◦ Currently 16 projects covering graph analysis, energy efficient scheduling, search and text indexing, genome assembly, …
      ◦ Most projects concerned with computer science problems
      ◦ Computational biology problems present in both PPs
      HEP community needs to explore how to exploit this resource of expertise and funding
