SLIDE 1

High Performance Linear System Solvers with Focus on Graph Laplacians

Richard Peng Georgia Tech Co-PIs: John Gilbert (UCSB), Gary Miller (CMU)

SLIDE 2

OUTLINE

  • Problem of Lx = b
  • Benchmarks and Evaluations
  • Tree Based Solvers
SLIDE 3

GRAPH LAPLACIANS

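Example: 3-vertex graph with two unit-weight edges

$$L = \begin{pmatrix} 2 & -1 & -1 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}$$

Matrices that correspond to undirected graphs

  • Variables ↔ vertices
  • Non-zeros ↔ edges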
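The vertex/edge correspondence can be made concrete with a small sketch that assembles a Laplacian from an edge list. The function name `laplacian` and the 0-based vertex numbering are assumptions made for illustration; the example graph is the 3-vertex one from this slide.

```python
def laplacian(n, edges):
    """Dense Laplacian of an undirected weighted graph, as a list of rows."""
    L = [[0] * n for _ in range(n)]
    for u, v, w in edges:
        # Each edge adds its weight to both diagonal entries (the degrees)
        # and subtracts it from the two symmetric off-diagonal entries.
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L

# Vertex 0 joined to vertices 1 and 2 by unit-weight edges.
L = laplacian(3, [(0, 1, 1), (0, 2, 1)])
for row in L:
    print(row)
```

Every row sums to zero, so L is singular and Lx = b is only solvable when the entries of b sum to zero on each connected component.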
SLIDE 4

SOLVING Lx = b

  • [ST`04]: $O(m \log^c n \log(1/\varepsilon))$ time
  • 2004 – 2014: c halved every 2 years
  • Multigrid methods widely used in scientific computing
  • Good runtimes for systems with as many as $10^9$ nonzeros
  • MATLAB: pcg(L, ichol(L), b, ε) ‘works’ for $10^6$ nonzeros

year:  2004  2006  2008  2009  2010  2011  2014
c:       70    32    15     6     2     1   1/2

SLIDE 5

THE LAPLACIAN PARADIGM

  • Directly related: elliptic systems
  • Few iterations: eigenvectors, heat kernels
  • Many iterations / modify algorithm: graph problems, image processing

SLIDE 6

NEW WAYS OF USING SOLVERS

Problem → Lx = b, as a sequence of (adaptively) generated linear systems:

  • Power iteration
  • Interior point method
  • Iterative least squares

What makes such L and b hard:

  • Widely varying weights
  • Multiscale behavior
  • Difficulties of the graph problems
SLIDE 7

OUTLINE

  • Problem of Lx = b
  • Benchmarks and Evaluations
  • Tree Based Solvers
SLIDE 8

[KRS`15]: ISOTONIC REGRESSION

README file

we suggest rerunning the program a few times and / or using a different solver. An alternate solver based on incomplete Cholesky is provided with the code.

https://github.com/sachdevasushant/Isotonic

SLIDE 9

GOAL: BENCHMARKS

Structured graphs

  • Grids / cubes
  • Cayley graphs
  • Graph products

Hard graph problems

  • Maxflow problems from DIMACS implementation challenges
  • Linear systems arising from second-order optimization (IPM)
SLIDE 10

NUMERICS + COMBINATORICS:

Spanning trees:

  • finite approximation
  • linear time solve

Numerical methods (e.g. CG) rely on preconditioners that are:

  • Good approximations to L
  • Easy (or at least easier) to solve
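The "linear time solve" property of spanning trees can be sketched with two passes over a rooted tree: an upward pass that accumulates demands into edge flows, and a downward pass that turns flows into potentials. The function name `solve_tree_laplacian`, the choice of vertex 0 as root, and the normalization x[0] = 0 are assumptions made for illustration.

```python
from collections import deque

def solve_tree_laplacian(n, tree_edges, b):
    """Solve L_T x = b for a weighted tree, with x[0] = 0; needs sum(b) == 0."""
    adj = [[] for _ in range(n)]
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    # Root the tree at vertex 0 and record a BFS order.
    parent, weight = [-1] * n, [0.0] * n
    seen, order, queue = [False] * n, [], deque([0])
    seen[0] = True
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, w in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v], weight[v] = u, w
                queue.append(v)

    # Upward pass: the flow from v to its parent is the demand in v's subtree.
    flow = [float(d) for d in b]
    for v in reversed(order[1:]):
        flow[parent[v]] += flow[v]

    # Downward pass: x[v] - x[parent[v]] = flow / weight on that edge.
    x = [0.0] * n
    for v in order[1:]:
        x[v] = x[parent[v]] + flow[v] / weight[v]
    return x

# Path 0-1-2 with edge weights 1 and 2.
x = solve_tree_laplacian(3, [(0, 1, 1.0), (1, 2, 2.0)], [1.0, 0.0, -1.0])
print(x)
```

Each vertex and edge is touched a constant number of times, which is what makes a tree cheap to apply as a preconditioner inside CG.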
SLIDE 11

NUMERICS + COMBINATORICS

Better convergence using 1024 bit MPFR floats compared to 53 bit C++ double

Conjugate gradient (CG) with tree preconditioner:

  • [textbook]: $m^{1/2}$ iters, even with round-off errors
  • [SW`09]: with exact arithmetic, takes $m^{1/3}$ iters

https://github.com/serbanstan/TreePCG https://github.com/danspielman/Laplacians.jl
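The gap between exact arithmetic and round-off can be seen on a toy scale. The sketch below runs the same unpreconditioned CG loop twice, once with Python's exact rationals (`fractions.Fraction`, standing in for high-precision MPFR floats) and once with ordinary doubles. The 6×6 tridiagonal test matrix and the function name `cg` are assumptions chosen for illustration; this is not the tree-preconditioned setup from the linked repositories.

```python
from fractions import Fraction

def cg(A, b, max_iters):
    """Plain CG on a symmetric positive definite A.

    Returns (x, iterations used, squared norm of the recurrence residual).
    """
    n = len(b)
    x = [b[0] * 0] * n          # zero vector of the same numeric type as b
    r = list(b)
    p = list(r)
    rr = sum(ri * ri for ri in r)
    iters = 0
    while rr != 0 and iters < max_iters:
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rr / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rr_new = sum(ri * ri for ri in r)
        p = [r[i] + (rr_new / rr) * p[i] for i in range(n)]
        rr = rr_new
        iters += 1
    return x, iters, rr

# Path Laplacian with one vertex grounded: tridiagonal with 2 and -1.
n = 6
A = [[2 if i == j else -1 if abs(i - j) == 1 else 0 for j in range(n)]
     for i in range(n)]

x_exact, iters_exact, rr_exact = cg(A, [Fraction(1)] + [Fraction(0)] * (n - 1), n)
x_float, iters_float, rr_float = cg(A, [1.0] + [0.0] * (n - 1), n)

print(iters_exact, rr_exact == 0)   # exact run stops with a zero residual
print(iters_float, rr_float)        # double run is left with round-off
```

With rationals the residual reaches exactly zero within n iterations, which is the finite-termination property that analyses in exact arithmetic rely on; in doubles the loop typically ends with a small nonzero residual instead.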

SLIDE 12

QUESTION: NUMERICAL PRECISION

  • Can numerical precision be analyzed through the graph theoretic components?
  • Primal-dual view of precision? CG?

https://github.com/serbanstan/TreePCG https://github.com/danspielman/Laplacians.jl

SLIDE 13

OUTLINE

  • Problem of Lx = b
  • Benchmarks and Evaluations
  • Tree Based Solvers
SLIDE 14

GOAL: FAST TREE-BASED SOLVERS

Gradually transform a tree-based solution to a solution on the entire graph

Method        Cycle Toggle                            Ultrasparsifier
Cost / Iter   $\log n$                                $m + (m/k)^2$
# Iters       $m \log^{1/2} n \log(1/\varepsilon)$    $k^{1/2} \log(1/\varepsilon)$
Related to    SGD                                     Grad. descent
Step uses     Data structures                         Mat-Vec multiply

Claim: these ideas lead to code that can solve any Lx = b with $10^9$ edges in ≤ 10 seconds on ≤ 64 cores

SLIDE 15

CYCLE TOGGLING

  • Pick one off-tree edge e at a time, make progress using T + e as preconditioner
  • Speed up calculations using data structures
  • [KOSZ `13]: akin to toggling dual flow along cycle, $m \log n$ toggles, each costing $O(\log n)$
  • [LS `13]: CG-like acceleration to $O(m \log^{1/2} n)$ toggles
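A minimal sketch of the toggling step, under simplifying assumptions: it visits off-tree edges in round-robin order rather than by random sampling, and it walks the tree path explicitly at O(n) cost per toggle instead of using the data structures that give O(log n). The function name `cycle_toggle_solve`, the `(u, v, weight)` edge format, and the K4 test graph are hypothetical choices for illustration.

```python
from collections import deque

def cycle_toggle_solve(n, tree_edges, off_edges, b, sweeps=100):
    """Approximately solve Lx = b (sum(b) == 0) by toggling cycles; x[0] = 0.

    Edges are (u, v, weight) triples; an edge's resistance is 1 / weight.
    """
    # Root the spanning tree at vertex 0 with a BFS.
    adj = [[] for _ in range(n)]
    for u, v, w in tree_edges:
        adj[u].append((v, 1.0 / w))
        adj[v].append((u, 1.0 / w))
    parent, res, depth = [-1] * n, [0.0] * n, [0] * n
    seen, order, queue = [False] * n, [], deque([0])
    seen[0] = True
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, r in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v], res[v], depth[v] = u, r, depth[u] + 1
                queue.append(v)

    # Initial flow uses tree edges only: up[v] is the flow from v to
    # parent[v], equal to the total demand inside v's subtree.
    up = [float(d) for d in b]
    for v in reversed(order[1:]):
        up[parent[v]] += up[v]
    off_flow = [0.0] * len(off_edges)  # flow from u to v on each off-tree edge

    for _ in range(sweeps):
        for i, (u, v, w) in enumerate(off_edges):
            # Collect the tree path between u and v by climbing to their LCA.
            side_u, side_v, a, c = [], [], u, v
            while a != c:
                if depth[a] >= depth[c]:
                    side_u.append(a)
                    a = parent[a]
                else:
                    side_v.append(c)
                    c = parent[c]
            # Potential drop around the cycle u -> v (off-tree) -> LCA -> u.
            drop = off_flow[i] / w
            drop += sum(res[t] * up[t] for t in side_v)
            drop -= sum(res[t] * up[t] for t in side_u)
            cycle_res = 1.0 / w + sum(res[t] for t in side_u + side_v)
            # Toggle: push a circulation that zeroes the drop on this cycle.
            alpha = drop / cycle_res
            off_flow[i] -= alpha
            for t in side_v:
                up[t] -= alpha
            for t in side_u:
                up[t] += alpha

    # Read potentials off the tree: x[v] - x[parent[v]] = res[v] * up[v].
    x = [0.0] * n
    for v in order[1:]:
        x[v] = x[parent[v]] + res[v] * up[v]
    return x

# K4 with unit weights; the spanning tree is the path 0-1-2-3.
tree = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
off = [(0, 2, 1.0), (1, 3, 1.0), (0, 3, 1.0)]
b = [1.0, 0.0, 0.0, -1.0]
x = cycle_toggle_solve(4, tree, off, b)

# Residual check: (Lx)_u = sum over edges (u, v) of weight * (x_u - x_v).
Lx = [0.0] * 4
for u, v, w in tree + off:
    Lx[u] += w * (x[u] - x[v])
    Lx[v] += w * (x[v] - x[u])
residual = max(abs(Lx[i] - b[i]) for i in range(4))
print(residual)
```

The flow always routes the demands b; each toggle only removes the potential mismatch around one cycle, so the iterate stays feasible while its energy decreases.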

SLIDE 16

AUGMENTED TREES

  • Add some edges to a tree to form a ‘batched’ preconditioner
  • Use exact methods on preconditioner
  • [Vaidya `91]: MST + edges
  • [KMP`10]: $O(m \log^2 n / k)$ edges → $k^{1/2}$ iters
  • Optimize: $m^{5/4} \log^{1/2} n$

Recursive versions exist, but their gains only kick in at around $10^9$ edges

SLIDE 17

MOVING PIECES

  • Trees: MST / bottom-up / top-down / adaptive
  • Data structures: offline / static / dynamic
  • Numerics: batched / local, accelerated / CG
  • Initialization: tree solution / recursive

Method        Cycle Toggle                     vs     Ultrasparsifier
Cost / Iter   $\log n$                                $m + (m/k)^2$
# Iters       $m \log^{1/2} n \log(1/\varepsilon)$    $k^{1/2} \log(1/\varepsilon)$
Related to    SGD                                     Grad. descent
Step uses     Data structures                         Mat-Vec multiply

SLIDE 18

BENCHMARK FOR TREE BASED ALGOS: HEAVY PATH GRAPHS

  • Bad case for PCG, ‘easy’ for tree data structures

Pick a Hamiltonian path, weight all other edges so each has stretch 1
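A hedged sketch of such a generator, assuming unit weights on the path edges and uniformly random chords (neither choice is stated on the slide). Stretch is taken in the usual sense: the weight of an edge times the total resistance of the tree path between its endpoints. The function names `heavy_path_graph` and `stretch` are hypothetical.

```python
import random

def heavy_path_graph(n, extra_edges, seed=0):
    """Path 0-1-...-(n-1) with unit weights, plus random chords of stretch 1."""
    max_chords = (n - 1) * (n - 2) // 2
    if extra_edges > max_chords:
        raise ValueError("not enough distinct chords available")
    rng = random.Random(seed)
    edges = [(i, i + 1, 1.0) for i in range(n - 1)]
    chords = set()
    while len(chords) < extra_edges:
        i, j = sorted(rng.sample(range(n), 2))
        if j - i >= 2:              # skip pairs that are already path edges
            chords.add((i, j))
    # The path between i and j has resistance j - i, so weight 1 / (j - i)
    # makes weight * path resistance equal to 1.
    edges += [(i, j, 1.0 / (j - i)) for i, j in sorted(chords)]
    return edges

def stretch(edge):
    """Stretch of (i, j, weight) with respect to the unit-weight path."""
    i, j, w = edge
    return w * (j - i)

edges = heavy_path_graph(10, 5)
for e in edges[9:]:
    print(e, stretch(e))
```

Long chords get small weights, so the path dominates the graph, which is where the name "heavy path" comes from.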
SLIDE 19

CYCLE TOGGLING VS. PCG

https://arxiv.org/abs/1609.02957 https://github.com/sxu/cycleToggling

SLIDE 20

VARIANTS OF CYCLE TOGGLING

https://arxiv.org/abs/1609.02957 https://github.com/sxu/cycleToggling

SLIDE 21

THANK YOU

  • Collaborators:
  • Hui Han Chin (CMU),
  • Kevin Deweese (UCSB),
  • John Gilbert (UCSB),
  • Gary Miller (CMU),
  • Saurabh Sawlani (GaTech),
  • Serban Stan (Yale),
  • Haoran Xu (MIT),
  • Shen Chen Xu (CMU)
  • Repos & Papers:
  • https://github.com/sxu/cycleToggling
  • https://github.com/serbanstan/TreePCG
  • https://github.com/danspielman/Laplacians.jl
  • https://arxiv.org/abs/1609.02957