SLIDE 1

Introduction Methodology Performance Evaluation Conclusions References

Evaluating the Impact of Transactional Characteristics on the Performance of Transactional Memory Applications

Fernando Rui (1), Márcio Castro (2), Dalvan Griebler (1), Luiz Gustavo Fernandes (1)

Email: fernando.rui@acad.pucrs.br, mbcastro@inf.ufrgs.br, dalvan.griebler@acad.pucrs.br, luiz.fernandes@pucrs.br

(1) Pontifícia Universidade Católica do Rio Grande do Sul - PUCRS - GMAP
(2) Universidade Federal do Rio Grande do Sul - UFRGS - INF

February 2014

SLIDE 2

Summary

1. Introduction
2. Methodology
3. Performance Evaluation
4. Conclusions
5. References

SLIDE 3

Introduction

1. Motivation
   - Multi-core applications are not embarrassingly parallel
   - Traditional synchronization structures (locks, mutexes, and semaphores):
     - Low-level mechanisms
     - Cause blocking
     - Hard to manage
     - Vulnerable to failures and faults

SLIDE 4

Introduction

1. Transactional Memory (TM)
   - High-level abstraction
   - Allows parallel code to be written as transactions
   - Detects and resolves conflicts at runtime
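As a rough illustration of the points above, the following single-threaded Python sketch shows the programmer writing only a transaction body while a small runtime detects a conflict and resolves it by re-executing. The class `ToyStore`, its `atomically` method, and the `interfere` hook are hypothetical names invented for this example; they are not the interface of any STM system discussed in this talk. A real TM runtime would also have to make the validate-and-commit step atomic across threads, which this toy omits.

```python
# Toy illustration only: a tiny optimistic "transactional" store.
# All names are hypothetical; this is not the API of any real TM system.

class ToyStore:
    def __init__(self, **initial):
        self.data = dict(initial)
        self.version = 0      # bumped on every successful commit
        self.aborts = 0       # number of conflicts detected so far

    def atomically(self, body, interfere=None):
        """Run body(snapshot) -> dict of updates, retrying on conflict."""
        while True:
            start_version = self.version
            snapshot = dict(self.data)          # private working copy
            updates = body(snapshot)
            if interfere is not None:           # simulate one concurrent commit
                interfere, hook = None, interfere
                hook(self)
            if self.version != start_version:   # conflict detected
                self.aborts += 1
                continue                        # resolve by re-executing
            self.data.update(updates)           # commit
            self.version += 1
            return updates


def transfer(snapshot):
    # The programmer writes only the transaction body.
    return {"a": snapshot["a"] - 10, "b": snapshot["b"] + 10}


def concurrent_deposit(store):
    # Stands in for another thread committing in the middle of our transaction.
    store.atomically(lambda s: {"a": s["a"] + 5})


store = ToyStore(a=100, b=0)
store.atomically(transfer, interfere=concurrent_deposit)
print(store.data, "aborts:", store.aborts)
```

The first attempt of `transfer` is aborted because the simulated deposit commits in between; the retry sees the deposit and both updates are preserved.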

SLIDE 5

Introduction

1. Challenge of TM systems
   - What kind of applications can really take advantage of TM?
   - Why do some TM applications perform poorly?

2. Contributions of this research
   - Performance evaluation of state-of-the-art STM systems and applications
   - Extension of the analysis in [1] to include the RSTM [2] system
   - We identify characteristics that affect the performance of TM
   - We identify bottlenecks of TM applications that limit their scalability
   - We show possible improvements to achieve better performance

SLIDE 6

Methodology

1. Comparative Analysis
   1. Four state-of-the-art STM systems using the Stanford Transactional Applications for Multi-Processing (STAMP) benchmark [3];
   2. Evaluation of STM systems using EigenBench [1];
   3. We evaluate the impact of certain transactional characteristics using EigenBench.

2. Environment of Tests
   - All experiments were performed on a Dell PowerEdge R610 machine with two quad-core Intel Xeon E5520 2.27 GHz processors with 8 MB of L2 cache and 16 GB of shared memory;
   - All results are arithmetic means of at least 30 runs to guarantee a confidence level of 95%.
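The averaging described above can be sketched as follows. This is only an illustration of how a mean and a 95% confidence interval might be computed from repeated runs; the slides do not say which procedure was used. The function name `mean_with_ci` and the sample timings are assumptions, and the interval uses a normal approximation from Python's standard library (for 30 runs, a Student t critical value of about 2.045 would give a slightly wider interval).

```python
import math
import statistics


def mean_with_ci(samples, confidence=0.95):
    """Return (mean, half_width) of a normal-approximation confidence interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs")
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    # Two-sided critical value; roughly 1.96 for a 95% confidence level.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, z * std_err


# Hypothetical execution times in seconds for 30 runs (not data from the slides).
runs = [10.0 + 0.1 * (i % 5) for i in range(30)]
mean, half_width = mean_with_ci(runs)
print(f"{mean:.3f} s +/- {half_width:.3f} s")
```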

SLIDE 7

STM Systems Using STAMP Benchmark

1. STM Systems
   - Transactional Locking II (TL2) [4]: second version of the original TL;
   - TinySTM [5]: uses a shared counter as a clock to control conflicts between transactions, and locks to protect shared memory locations;
   - SwissTM [6]: its innovation is a hybrid conflict detection scheme;
   - Rochester Software Transactional Memory (RSTM) [2]: reduces cache misses by employing a single level of indirection to access shared objects.
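To make the "shared counter as a clock plus per-location locks" idea more concrete, here is a heavily simplified Python sketch of time-based validation. It is written loosely in the spirit of the description above and is not the actual code or algorithm of TinySTM or TL2. The names `Cell`, `Tx`, `Conflict`, and `global_clock` are hypothetical, and the sketch ignores details a real STM needs, such as checking lock state during reads.

```python
# Hypothetical sketch of clock-based conflict detection; not a real STM.
import threading


class Cell:
    def __init__(self, value):
        self.value = value
        self.version = 0               # clock value of the last commit that wrote it
        self.lock = threading.Lock()   # protects this memory location


global_clock = 0                       # shared counter used as a logical clock
clock_lock = threading.Lock()


class Conflict(Exception):
    pass


class Tx:
    def __init__(self):
        self.start = global_clock      # snapshot time of this transaction
        self.reads, self.writes = [], {}

    def read(self, cell):
        if cell in self.writes:        # read our own buffered write
            return self.writes[cell]
        if cell.version > self.start:  # written after we started: conflict
            raise Conflict()
        self.reads.append(cell)
        return cell.value

    def write(self, cell, value):
        self.writes[cell] = value      # buffered until commit

    def commit(self):
        global global_clock
        cells = sorted(self.writes, key=id)   # fixed lock order avoids deadlock
        for c in cells:
            c.lock.acquire()
        try:
            # Validate: nothing we read may have been written since we started.
            if any(c.version > self.start for c in self.reads):
                raise Conflict()
            with clock_lock:
                global_clock += 1
                stamp = global_clock
            for c in cells:
                c.value, c.version = self.writes[c], stamp
        finally:
            for c in cells:
                c.lock.release()


# Two transactions increment the same cell; the second one to commit aborts.
x = Cell(1)
t1, t2 = Tx(), Tx()
t1.write(x, t1.read(x) + 1)
t2.write(x, t2.read(x) + 1)
t1.commit()
outcome = "t2 committed"
try:
    t2.commit()
except Conflict:
    outcome = "t2 aborted"
print(x.value, outcome)
```

In this sketch an aborted transaction would simply be re-executed with a fresh `Tx`, which then observes the newer clock value.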

SLIDE 8

STM Systems Using STAMP Benchmark

1. Performance Evaluation

[Figure: speedups of the STAMP applications (bayes, genome, intruder, kmeans, labyrinth, ssca2, vacation, yada) on 2, 4, and 8 cores, one panel per STM system: SwissTM, RSTM, TinySTM, TL2. Axes: Applications (x), Speedups (y).]

SLIDE 9

SwissTM vs. RSTM using EigenBench

1. Set-up:
   - The STM systems that presented the best performance (SwissTM and RSTM);
   - STAMP applications with poor (ssca2), medium (intruder and vacation), and good (labyrinth and genome) scalability;
   - The evaluation is based on speedup and aborts per commit (ApC).

2. EigenBench Input Parameters

Table: Characteristics of applications from the STAMP benchmark

Characteristic         ssca2     intruder  vacation  labyrinth  genome
Working-set Size       400 MB    20 MB     256 MB    16 MB      20 MB
Transactional Length   3         24        226       357        88
Pollution              33%       5%        2%        50%        5%
Temporal Locality      0.33      0.52      0.59      0.77       0.58
Contention             0.0005%   22%       0.2%      5%         0.5%
Predominance           Low       Low       High      Low        High
Density                High      High      High      Low        High
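The two evaluation metrics named in the set-up, speedup and aborts per commit (ApC), can be computed as in the short sketch below. It assumes the usual definitions (sequential time divided by parallel time, and aborted transactions divided by committed transactions); the slides do not spell out the formulas, and the function names and numbers are illustrative only.

```python
def speedup(sequential_time, parallel_time):
    """Sequential execution time divided by parallel execution time."""
    return sequential_time / parallel_time


def aborts_per_commit(aborts, commits):
    """Number of aborted transactions per committed transaction."""
    return aborts / commits


# Hypothetical measurements, not results from the slides.
s = speedup(sequential_time=40.0, parallel_time=10.0)
apc = aborts_per_commit(aborts=150, commits=1000)
print(f"speedup = {s:.2f}x, ApC = {apc:.1%}")
```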

SLIDE 10

SwissTM vs. RSTM using EigenBench (Cont.)

1. Performance Evaluation

[Figure: for SwissTM and RSTM, speedups per application (genome, intruder, labyrinth, ssca2, vacation) on 2, 4, and 8 cores, and aborts per commit versus number of cores (2, 4, 8) for each application.]

SLIDE 11

SwissTM vs. RSTM using EigenBench (Cont.)

1. Findings
   - TM applications that use large amounts of memory did not perform well, since STM systems need to keep track of much more data to detect conflicts;
   - Most STM systems do not handle variation in transaction length during execution well;
   - Low degrees of predominance and density help TM applications to perform better;
   - High levels of ApC generally limit the performance of TM applications.

SLIDE 12

Evaluating the Impact of Transactional Characteristics

[Figure: speedups on 2, 4, and 8 cores for the original application and versions V1 to V4, one panel per modified characteristic: Genome (Transactional Length), Intruder (Temporal Locality), Ssca2 (Working-set Size), Vacation (Working-set Size). Axes: Versions (x), Speedups (y).]

SLIDE 13

Conclusions

About this paper
   - Some characteristics drive the performance of TM applications;
   - Applications must be analysed carefully to identify the relevant characteristics.

Future Opportunities
   - We intend to extend this work using tracing mechanisms such as those proposed in [7];
   - We intend to study the impact of the TM characteristics on the performance of TM applications when executed on a real HTM processor such as the Intel Haswell.

SLIDE 14

References I

[1] Sungpack Hong et al. EigenBench: A Simple Exploration Tool for Orthogonal TM Characteristics. In IEEE International Symposium on Workload Characterization (IISWC), pages 1–11, Washington, USA, 2010. IEEE Computer Society.

[2] Virendra J. Marathe, Michael F. Spear, Christopher Heriot, Athul Acharya, David Eisenstat, William N. Scherer III, and Michael L. Scott. Lowering the Overhead of Nonblocking Software Transactional Memory. In ACM SIGPLAN Workshop on Transactional Computing, Jun 2006.

[3] Cao Minh et al. STAMP: Stanford Transactional Applications for Multi-Processing. In IEEE International Symposium on Workload Characterization (IISWC), pages 35–46, Seattle, USA, 2008. IEEE Computer Society.

[4] Dave Dice et al. Transactional Locking II. In International Symposium on Distributed Computing (DISC), pages 194–208, 2006.

[5] Pascal Felber, Christof Fetzer, and Torvald Riegel. Dynamic Performance Tuning of Word-based Software Transactional Memory. In Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 237–246, Salt Lake City, USA, 2008. ACM.

[6] Aleksandar Dragojević, Rachid Guerraoui, and Michal Kapalka. Stretching Transactional Memory. In Programming Language Design and Implementation (PLDI), pages 155–165, 2009.

SLIDE 15

References II

[7] Márcio Castro et al. Analysis and Tracing of Applications Based on Software Transactional Memory on Multicore Architectures. In Euromicro International Conference on Parallel, Distributed and Network-Based Computing (PDP), pages 199–206. IEEE Computer Society, 2011.
