Implement Distributed Alternating Least Squares Algorithm for Matrix Completion



SLIDE 1

Implement Distributed Alternating Least Squares Algorithm for Matrix Completion

Varun Gandhi (vg292)

Computer Laboratory

SLIDE 2

Netflix Problem

  • V: m × n matrix with missing entries
  • Goal: complete the matrix
  • W: m × r (row-factor matrix)
  • H: r × n (column-factor matrix)
  • WH ≈ V
  • Loss function: $\sum_{(i,j)\ \text{observed}} (V_{ij} - [WH]_{ij})^2$
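
The loss on this slide can be made concrete with a small sketch. It assumes the loss is summed over the observed entries only, which is the usual reading for matrix completion. The names `masked_squared_loss` and `observed` are illustrative and do not come from the slides.

```python
import numpy as np

def masked_squared_loss(V, observed, W, H):
    """Sum of squared errors (V_ij - [WH]_ij)^2 over observed entries only."""
    residual = (V - W @ H) * observed   # unobserved entries contribute zero
    return float(np.sum(residual ** 2))

# Tiny example: a 2 x 3 matrix, rank r = 1, with one missing entry.
V = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 6.0]])
observed = np.array([[1, 1, 0],
                     [1, 1, 1]])        # 0 marks the missing entry V[0, 2]
W = np.array([[1.0], [2.0]])            # m x r row-factor matrix
H = np.array([[1.0, 2.0, 3.0]])         # r x n column-factor matrix

loss = masked_squared_loss(V, observed, W, H)
```

Here W and H reproduce every observed entry exactly, so the placeholder value stored at the missing position does not affect the loss.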

SLIDE 3

Motivation

Large applications involve matrices with

  • millions of rows and columns
  • billions of entries

To achieve high performance

  • parallel and distributed factorisation
  • keep the loss to a minimum

SLIDE 4

Algorithm

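Sequential Computation

  • Start from an initial point W0 and H0
  • Alternate: with H fixed, solve a least-squares problem for every row of W; then, with W fixed, solve one for every column of H

Parallel Computation

  • Within each phase the row (or column) problems are independent, so the computation can be parallelised over rows and over columns respectively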
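
The sequential and parallel steps above can be sketched as follows. This is a minimal illustration, not the project's implementation. It assumes a boolean `observed` mask, a small ridge term `reg` to keep each per-row system well conditioned, and a thread pool standing in for real parallel workers. All function and parameter names are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def solve_row(v_row, obs_row, H, reg):
    """Least-squares update of one row of W with H held fixed."""
    Hk = H[:, obs_row]                          # columns of H at observed entries
    A = Hk @ Hk.T + reg * np.eye(H.shape[0])    # r x r normal equations (ridge, assumed)
    b = Hk @ v_row[obs_row]
    return np.linalg.solve(A, b)

def update_rows(V, observed, H, reg, workers=1):
    """Update every row; the rows are independent, so they may run in parallel."""
    args = [(V[i], observed[i], H, reg) for i in range(V.shape[0])]
    if workers > 1:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            rows = list(pool.map(lambda a: solve_row(*a), args))
    else:
        rows = [solve_row(*a) for a in args]
    return np.vstack(rows)

def als(V, observed, r, iters=20, reg=1e-3, workers=1, seed=0):
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.standard_normal((m, r))             # initial point W0
    H = rng.standard_normal((r, n))             # initial point H0
    for _ in range(iters):
        W = update_rows(V, observed, H, reg, workers)
        # Column update: the same problem on the transpose, with W held fixed.
        H = update_rows(V.T, observed.T, W.T, reg, workers).T
    return W, H

# Synthetic rank-2 matrix with roughly 70% of its entries observed.
rng = np.random.default_rng(1)
V_true = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))
observed = rng.random((8, 6)) < 0.7
W, H = als(V_true, observed, r=2)
```

Each half-step exactly minimises the regularised objective over one factor, so that objective never increases from one iteration to the next. Setting `workers` above 1 changes only how the independent row solves are scheduled, not their results.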

SLIDE 5

Algorithm

Distributed Computation

  • Partition (block) the matrix into blocks of size mb × nb
  • Every node updates one matrix block

Why Spark?

  • In-memory computation
  • Matrix versions are cached in memory
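
The blocking step under Distributed Computation might look like the sketch below. It shows only how a matrix can be cut into blocks of at most mb × nb. How nodes coordinate their updates is not specified in the slides and is not modelled here. The function name `partition_blocks` and the (block_row, block_col) keying are assumptions for illustration.

```python
import numpy as np

def partition_blocks(V, mb, nb):
    """Split V into blocks of at most mb x nb, keyed by (block_row, block_col)."""
    m, n = V.shape
    blocks = {}
    for bi, i in enumerate(range(0, m, mb)):
        for bj, j in enumerate(range(0, n, nb)):
            blocks[(bi, bj)] = V[i:i + mb, j:j + nb]   # edge blocks may be smaller
    return blocks

V = np.arange(20.0).reshape(4, 5)
blocks = partition_blocks(V, mb=2, nb=2)
```

Block (bi, bj) only involves the rows of W in block-row bi and the columns of H in block-column bj, which is what lets a node work on its block with just a slice of each factor.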

SLIDE 6

Progress

  • Revising all linear algebra concepts
  • Getting familiar with Scala and Spark
  • Trying examples in Python