SLIDE 1

João Saffran, Gabriel Garcia, Matheus A. Souza, Pedro H. Penna, Márcio Castro, Luís F. W. Góes and Henrique C. Freitas
matheus.alcantara@sga.pucminas.br

A low-cost energy-efficient Raspberry Pi cluster for data mining algorithms

UnConventional High Performance Computing 2016 @ Euro-Par 2016 August 23

SLIDE 2

Summary

  • Introduction
  • Related work
  • Platforms and algorithms
  • Experimental setup
  • Results
  • Final remarks

A low-cost energy-efficient Raspberry Pi cluster for data mining algorithms - UCHPC 2016 - August 23 2

SLIDE 3

Introduction

  • Computation surpassed the Petaflop barrier
  • Massively parallel architectures: GPUs and many-cores
  • Examples: Intel Xeon Phi (61 cores) and Nvidia Tesla K40 (2880 CUDA cores)

SLIDE 4

Introduction

  • Next goal: surpass the Exaflop barrier
  • Challenges: power consumption and financial cost

SLIDE 5

Introduction

The use of those architectures might be unworkable...
... but we cannot disregard the demand for performance!

SLIDE 6

Introduction

Big Data:

  • Complex datasets of very large size
  • The data come from sensors, devices, multimedia, social media, etc.
  • Must be processed in real time and at very large scale

Big Data must exploit High Performance Computing!

SLIDE 7

Our work

Given the cost and energy-efficiency constraints...
... we must consider low-cost and low-power architectures!

The goal is:

To evaluate the performance, power and energy consumption of an energy-efficient and low-cost Raspberry Pi Cluster running two data mining algorithms: Apriori and K-Means

SLIDE 8

Our work

Contributions:

  • An assessment of whether a low-cost, energy-efficient Cluster can be used as an alternative for HPC
  • A comparison of this Cluster with a high-performance many-core processor (the Intel Xeon Phi)
  • An evaluation that clarifies whether this platform can be used in the context of Big Data

SLIDE 9

Related work

Kruger, M.J. (2015): Building a Parallella board cluster

  • The cluster proved to be better than an Intel i5
  • Parallella lacks hardware support for complex operations
  • No rigorous power evaluation was conducted

d'Amore et al. (2015): A practical approach to big data in tourism: A low cost Raspberry Pi Cluster

  • Focused on how to retrieve and use the data
  • No power or quantitative performance evaluation

SLIDE 10

Other initiatives

The use of ARM processors:

  • Mont Blanc and Mont Blanc 2 projects: HPC based on energy-efficient platforms
  • Glasgow Raspberry Pi Cloud (PiCloud): cluster of Raspberry Pi boards for cloud computing

SLIDE 11

Our work

Platforms:

  • Intel Xeon Phi coprocessor
  • Raspberry Pi Cluster (8 boards interconnected)

Application kernels:

  • Apriori
  • K-Means

SLIDE 12

Intel Xeon Phi

  • 61 cores (244 threads) interconnected by a bidirectional ring
  • 32 kB instruction and 32 kB data L1 caches per core
  • 256 kB of L2 cache per core
  • 16 GB of main memory

SLIDE 13

Raspberry Pi Cluster

  • 8 quad-core nodes (32 threads) interconnected by a network switch
  • 64 kB instruction and 64 kB data L1 caches per core
  • 512 kB of L2 cache shared by the 4 cores
  • 1 GB of main memory (each board)

SLIDE 14

Apriori kernel

  • Association-rule machine-learning algorithm
  • Given a list of itemsets, identify association rules between those items based on their frequency
  • Highlights general trends in the database

SLIDE 15

Apriori kernel

  1. MapReduce parallel pattern
  2. Identify “starter sets” and distribute items among threads or processes
  3. Each thread or node calculates the frequency
  4. The subsets are regrouped
  5. Check the minimum support
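The steps above can be sketched roughly as follows. This is a minimal, illustrative Python version, not the authors' kernel: the names `count_candidates` and `frequent_itemsets`, the `workers` parameter, and the sample transactions are all assumptions, the workers are simulated sequentially, and the minimum support is taken to be a fraction of the number of transactions.

```python
from itertools import combinations


def count_candidates(candidates, transactions):
    """Map step: one worker counts how often each of its candidates occurs."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def frequent_itemsets(transactions, size, min_support, workers=4):
    """Return the itemsets of `size` items whose frequency meets min_support.

    min_support is assumed to be a fraction of the number of transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    # Candidate itemsets, distributed round-robin among the simulated workers.
    candidates = [frozenset(c) for c in combinations(items, size)]
    shares = [candidates[i::workers] for i in range(workers)]
    # Each worker calculates the frequency of its own candidates.
    partials = [count_candidates(share, transactions) for share in shares]
    # Reduce step: regroup the partial results into a single table.
    counts = {}
    for partial in partials:
        counts.update(partial)
    # Keep only the itemsets that reach the minimum support.
    threshold = min_support * len(transactions)
    return {c: n for c, n in counts.items() if n >= threshold}


# Hypothetical sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "butter", "milk"],
]
pairs = frequent_itemsets(transactions, size=2, min_support=0.5)
triples = frequent_itemsets(transactions, size=3, min_support=0.5)
```

In this toy input every pair of items appears in three of the five transactions and so passes the 0.5 threshold, while the only triple appears in two and is discarded.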

SLIDE 16

K-Means kernel

  • Widely used clustering approach
  • Given a set of n points, partition these points into k partitions, to minimize the mean squared distance from each point to the center of its partition

[Figure: five-partition K-Means example]
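The objective stated above can be written compactly. The symbols are assumptions introduced here for illustration: x_i are the n points, C_1, ..., C_k are the partitions, and mu_j is the center (centroid) of partition C_j.

```latex
\min_{C_1,\dots,C_k} \; \frac{1}{n} \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i
```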

SLIDE 17

K-Means kernel

  1. Assign each thread or process a unique set of points and partitions
  2. Each thread:
       i. Re-clusters its own points into the k partitions
       ii. Recalculates centroids
  3. Synchronize the partitions between phases i and ii
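A rough sketch of this scheme is shown below. It is written in Python for readability and is not the authors' code: the function names (`closest`, `partial_sums`, `kmeans`), the `workers` and `max_iters` parameters, and the sample points are assumptions. The workers run one after another here, and the synchronization is modeled as a plain sum of each worker's partial results before the centroids are recalculated.

```python
def closest(point, centroids):
    """Index of the centroid nearest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def partial_sums(chunk, centroids):
    """Phase i on one worker: re-cluster its points and accumulate partial sums."""
    dims = len(centroids[0])
    sums = [[0.0] * dims for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        j = closest(point, centroids)
        counts[j] += 1
        for d in range(dims):
            sums[j][d] += point[d]
    return sums, counts


def kmeans(points, centroids, workers=4, max_iters=100):
    """Iterate until the centroids stop moving or max_iters is reached."""
    centroids = [tuple(c) for c in centroids]
    # Each simulated worker owns a fixed, unique share of the points.
    chunks = [points[i::workers] for i in range(workers)]
    for _ in range(max_iters):
        results = [partial_sums(chunk, centroids) for chunk in chunks]
        # Synchronization: combine the partial sums of all workers,
        # then phase ii recalculates each centroid from the combined totals.
        updated = []
        for j, old in enumerate(centroids):
            count = sum(counts[j] for _, counts in results)
            if count == 0:
                updated.append(old)  # an empty partition keeps its centroid
                continue
            updated.append(tuple(
                sum(sums[j][d] for sums, _ in results) / count
                for d in range(len(old))
            ))
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Hypothetical sample data, for illustration only.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
result = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

Because only per-partition sums and counts are exchanged, the amount of data synchronized per iteration depends on k and the dimensionality, not on the number of points held by each worker.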

SLIDE 18

Experimental Setup

Intel Xeon Phi

  • OpenMP library
  • Intel C Compiler (ICC)
  • MICSMC tool, to monitor power consumption

Raspberry Pi Cluster

  • OpenMPI and OpenMP libraries
  • GNU C Compiler (GCC)
  • Watt-meter installed before the power supply, to measure energy consumption (excluding the switch)

SLIDE 19

Experimental Setup

Workload sizes

Apriori (minimum support)

  • Standard: 70
  • Large: 60
  • Huge: 50

K-Means (# of data points)

  • Standard: 2^14
  • Large: 2^15
  • Huge: 2^16

SLIDE 20

Experimental Setup

The number of available resources varied proportionally

  • 30, 60, 120 and 240 threads in the Intel Xeon Phi
  • 4, 8, 16 and 32 threads in the Raspberry Pi Cluster

A total of 10 runs were conducted

  • At most 11.04% standard deviation for Raspberry Pi Cluster
  • At most 7.07% standard deviation for Intel Xeon Phi

Metrics: execution time, power consumption and energy consumption

Energy consumption = Execution time × Power consumption, so lower energy consumption values mean better energy efficiency
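As a small worked example of these metrics, the sketch below averages a set of run times, derives the energy from the formula above, and expresses the run-to-run spread as a percentage of the mean. All numbers are invented for illustration, the helper names are hypothetical, and the use of the sample standard deviation relative to the mean is an assumption about how the reported percentages were obtained.

```python
from statistics import mean, stdev


def energy_joules(execution_time_s, power_w):
    """Energy (J) = execution time (s) x power (W)."""
    return execution_time_s * power_w


def relative_stdev_percent(samples):
    """Sample standard deviation as a percentage of the mean (assumed metric)."""
    return 100.0 * stdev(samples) / mean(samples)


# Hypothetical measurements, not taken from the presentation.
times_s = [10.0, 12.0, 11.0, 9.0, 13.0]  # execution time of each run
power_w = 20.0                           # average power draw during the runs

energy = energy_joules(mean(times_s), power_w)
spread = relative_stdev_percent(times_s)
print(f"energy: {energy:.1f} J, spread: {spread:.2f}%")
```

The formula also shows why a slower platform can still come out ahead: a run of 100 s at 20 W costs 2000 J, while a run of 20 s at 150 W costs 3000 J.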

SLIDE 21

Results – Execution time (s)

[Figure: execution time (s) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • Apriori scales better than K-Means
  • Apriori has more independent work units than K-Means

SLIDE 22

Results – Execution time (s)

[Figure: execution time (s) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • Apriori scales better than K-Means
  • Apriori has more independent work units than K-Means

Chart annotations: 56.79%, 95.02%, 54.84%, 93.76%

SLIDE 23

Results – Execution time (s)

[Figure: execution time (s) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • Xeon Phi was faster than the Cluster
  • Xeon Phi has cores with more processing power
  • With an almost equal number of threads (30 and 32), the Cluster sometimes presented better results than Xeon Phi

Chart annotations: 23.81%, 69.89%

SLIDE 24

Results – Execution time (s)

[Figure: execution time (s) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • The Cluster scales better than Xeon Phi
  • The network switch between the 8 nodes was not a bottleneck in the Cluster
  • The synchronization time between 240 threads in the Xeon Phi was a bottleneck

Chart annotations: 78.51%, 82.05%, 73.85%, 64.59%

SLIDE 25

Results – Power consumption (W)

[Figure: power consumption (W) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • Irregular kernels: different execution time for each working unit

SLIDE 26

Results – Power consumption (W)

[Figure: power consumption (W) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • Irregular kernels: different execution time for each working unit
  • Average power decreases in the Cluster when the workload increases: slave processes finish their computation, so their nodes reduce their power consumption drastically
  • In Xeon Phi, idle cores keep consuming a portion of power

Chart annotations: 44.34%, 68.06%, 1.39%, 15.42%

SLIDE 27

Results – Power consumption (W)

[Figure: power consumption (W) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • Generally, the Cluster is better than Xeon Phi

Chart annotations: 88.35%, 85.17%

SLIDE 28

Results – Energy Consumption (J)

[Figure: energy consumption (J) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

Energy = Power × Time (J = W × s)

SLIDE 29

Results – Energy Consumption (J)

[Figure: energy consumption (J) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • As workloads increase, energy consumption increases because of the longer execution time

Chart annotations: 160.33%, 954.33%, 121.62%, 1587.88%

SLIDE 30

Results – Energy Consumption (J)

[Figure: energy consumption (J) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • As resources increase, energy consumption decreases because the execution time was more decisive than the power consumption

Chart annotations: 81.59%, 68.16%, 56.38%, 66.18%

SLIDE 31

Results – Energy Consumption (J)

[Figure: energy consumption (J) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • Apriori is more energy efficient in the Cluster
  • K-Means is more energy efficient in Xeon Phi

Chart annotations: 45.51%, 59.12%

SLIDE 32

Results – Energy Consumption (J)

[Figure: energy consumption (J) of Apriori and K-Means on the Raspberry Pi Cluster and the Intel Xeon Phi, for the Standard, Large and Huge workloads]

  • With an almost equal number of threads (30 and 32), the Cluster is generally more energy efficient than Xeon Phi

Chart annotations: 318.79%, 21.61%

SLIDE 33

Concluding remarks

  • The Raspberry Pi Cluster proved to be more energy efficient than the Intel Xeon Phi for the Apriori kernel
  • K-Means presented different results because it takes more time to solution than Apriori. But with an almost equal number of threads (30 and 32), the Raspberry Pi Cluster proved to be more energy efficient than the Intel Xeon Phi
  • The Intel Xeon Phi infrastructure was 10X more expensive than the Raspberry Pi Cluster infrastructure

SLIDE 34

Future work

  • To apply load balancing strategies on both application kernels
  • To implement parallel versions of these applications for Graphics Processing Units (GPUs)
  • To use more HPC devices, for instance, a cluster of Xeon Phi boards
  • To study the impacts on the energy efficiency and performance of the Raspberry Pi Cluster when running application kernels from other domains (image processing and computational fluid dynamics)

SLIDE 35

Acknowledgement

This work was developed in the context of EnergySFE and ExaSE cooperation projects. The authors would like to thank:
