Message-passing Two Steps Least Square Algorithms for Simultaneous - PowerPoint PPT Presentation

Message-passing Two Steps Least Square Algorithms for Simultaneous Equations Models
José Juan López Espín, Universidad Miguel Hernández (Elche, Spain)
Domingo Giménez Cánovas, Universidad de Murcia (Murcia, Spain)

Contents
• Introduction
• Simultaneous equations models
• OLS and 2SLS techniques
• Three different versions of the 2SLS algorithm
  • General
  • Inverse decomposition
  • QR decomposition
• Experimental results
• Conclusions and future works

Introduction
• The solution of a simultaneous equations model (S.E.M.) on high performance parallel systems is studied using 2SLS.
• Three different versions of 2SLS are studied.
• Parallel algorithms for distributed memory have been developed for the three versions.
• The methods have been analyzed on different parallel systems.

Simultaneous Equations Models
The scheme of a system with M equations, M endogenous variables and k predetermined variables is (structural form)

$$
\begin{aligned}
Y_{1t} &= \beta_{12} Y_{2t} + \beta_{13} Y_{3t} + \dots + \beta_{1M} Y_{Mt} + \gamma_{11} X_{1t} + \dots + \gamma_{1k} X_{kt} + u_{1t} \\
Y_{2t} &= \beta_{21} Y_{1t} + \beta_{23} Y_{3t} + \dots + \beta_{2M} Y_{Mt} + \gamma_{21} X_{1t} + \dots + \gamma_{2k} X_{kt} + u_{2t} \\
&\;\;\vdots \\
Y_{Mt} &= \beta_{M1} Y_{1t} + \beta_{M2} Y_{2t} + \beta_{M3} Y_{3t} + \dots + \beta_{M,M-1} Y_{M-1,t} + \gamma_{M1} X_{1t} + \dots + \gamma_{Mk} X_{kt} + u_{Mt}
\end{aligned}
$$

These equations can be represented in matrix form

$$
B Y_t + \Gamma X_t + u_t = 0
$$

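Simultaneous Equations Models
The structural form can be expressed in reduced form

$$
Y_t = \Pi X_t + v_t, \qquad \text{with } \Pi = -B^{-1}\Gamma \text{ and } v_t = -B^{-1} u_t
$$

$$
\begin{aligned}
Y_{1t} &= \pi_{11} X_{1t} + \dots + \pi_{1k} X_{kt} + v_{1t} \\
&\;\;\vdots \\
Y_{Mt} &= \pi_{M1} X_{1t} + \dots + \pi_{Mk} X_{kt} + v_{Mt}
\end{aligned}
$$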
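The step from structural to reduced form can be checked numerically. The sketch below uses a hypothetical two-equation system whose matrices `B` and `Gamma` hold made-up illustrative values (they are not from the slides); it computes Π and v_t and confirms that the resulting Y_t satisfies the structural form.

```python
import numpy as np

# Hypothetical 2-equation system with 2 predetermined variables.
# The numeric values are illustrative assumptions only.
B = np.array([[1.0, -0.5],
              [-0.3, 1.0]])
Gamma = np.array([[-2.0, 0.0],
                  [0.0, -1.5]])

# Reduced-form coefficients Pi = -B^{-1} Gamma, obtained with a linear
# solve instead of forming the inverse explicitly.
Pi = -np.linalg.solve(B, Gamma)

x_t = np.array([1.0, 2.0])       # predetermined variables at time t
u_t = np.array([0.1, -0.2])      # structural disturbances at time t
v_t = -np.linalg.solve(B, u_t)   # reduced-form disturbances v_t = -B^{-1} u_t
y_t = Pi @ x_t + v_t             # endogenous variables from the reduced form

# Substituting back into the structural form should give (numerically) zero.
residual = B @ y_t + Gamma @ x_t + u_t
```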

OLS (Method)
OLS (Ordinary Least Squares) can be used to solve a regression model

$$
Y_t = a_1 X_{1t} + \dots + a_n X_{nt} + u_t
$$

In matrix form

$$
Y = X\beta + u
$$

The expression of the estimator is

$$
\hat{\beta} = (X'X)^{-1} X'Y
$$
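A minimal NumPy sketch of this estimator, assuming synthetic noise-free data so that the true coefficients are recovered up to rounding. The function name `ols` and the data are illustrative assumptions; the normal equations are solved with a linear solve rather than an explicit inverse.

```python
import numpy as np

def ols(X, y):
    """Least-squares estimate (X'X)^{-1} X'y, via the normal equations."""
    XtX = X.T @ X
    Xty = X.T @ y
    return np.linalg.solve(XtX, Xty)

# Synthetic, noise-free data (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
beta_true = np.array([1.5, -2.0, 0.5])
y = X @ beta_true

beta_hat = ols(X, y)
```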

2SLS (Two Step Least Squares)
• OLS cannot be used in the structural form because the random variable and the endogenous variables are correlated.
• The endogenous variables are replaced by approximations (proxy variables).
• The proxy of Y is calculated using OLS with Y and the exogenous variables in the system.
• When the endogenous variables have been replaced, OLS is used again in the equation.
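The two steps can be sketched for a single equation as follows. This is a sequential illustration under assumed synthetic data, not the authors' parallel code; the names `two_sls`, `Y1`, `X1` and the data-generating coefficients are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def two_sls(y, Y1, X1, X):
    """2SLS for one equation.

    y  : (n,)   dependent endogenous variable of the equation
    Y1 : (n, m) endogenous variables on the right-hand side
    X1 : (n, p) predetermined variables included in the equation
    X  : (n, k) all predetermined variables of the system
    Returns coefficients ordered as [X1 columns, Y1 columns].
    """
    Y1_hat = X @ ols(X, Y1)        # step 1: proxies from OLS on all of X
    Z = np.hstack([X1, Y1_hat])    # replace Y1 by its proxy
    return ols(Z, y)               # step 2: OLS on the substituted equation

# Synthetic data in which Y1 is correlated with the disturbance u.
rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=(n, 3))
u = rng.normal(size=n)
Y1 = X @ np.array([1.0, 1.0, 1.0]) + 0.8 * u + rng.normal(size=n)
y = 1.0 * X[:, 0] + 2.0 * Y1 + u   # true coefficients: 1.0 on X0, 2.0 on Y1

theta = two_sls(y, Y1[:, None], X[:, [0]], X)
```

With this sample size the estimates should land close to the true coefficients, whereas OLS applied directly to `Y1` would be biased by the correlation with `u`.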

Parallel Algorithm for distributed memory
• Parallelize at the highest level.
• Share as much information as possible.
• Each call to 2SLS must share more information to reduce the number of operations.
• Perform as many operations as possible across all the processors at the beginning of the algorithm, so that any processor can reuse the results in other parts of the algorithm.
• The ScaLAPACK and PBLAS libraries are used to make a portable program.

OLSp (Parallel OLS)
In the experiments, pdgemm has been used to perform the multiplications, and pdgesv to compute the inverse. The use of ScaLAPACK allows us to obtain a portable routine.

2SLS for a system (Parallel 2SLS)
• Three different versions of the 2SLS algorithm are presented.
• The first is a basic algorithm, which is improved in the second and third versions.
• In the first version, the structure of the parallel 2SLS algorithm is stated. The other versions follow the same structure but use matrix decompositions to obtain lower costs.

The first version of 2SLS
• All the proxies are calculated at the beginning of the algorithm.
• All the proxies are distributed to all the processors.
• Each processor solves an equation sequentially using OLS.
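The structure of this first version can be sketched sequentially: the proxies are computed once, and a loop over equations stands in for the processors, each of which would take a subset of the equations in a message-passing implementation. The function name, the `equations` format and the test system below are hypothetical; the data are noise-free so the structural coefficients are recovered exactly.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def two_sls_system(Y, X, equations):
    """equations: list of (j, endo_idx, pred_idx) tuples (assumed format).

    j        : column of Y on the left-hand side
    endo_idx : columns of Y appearing on the right-hand side
    pred_idx : columns of X appearing in the equation
    """
    Y_hat = X @ ols(X, Y)   # all proxies, computed once at the beginning
    estimates = {}
    for j, endo_idx, pred_idx in equations:   # one OLS solve per equation
        Z = np.hstack([X[:, pred_idx], Y_hat[:, endo_idx]])
        estimates[j] = ols(Z, Y[:, j])
    return estimates

# Hypothetical noise-free system:
#   Y0 = 0.5*Y1 + 2*X0 + 1*X1
#   Y1 = -0.3*Y0 + 1.5*X2
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
A = np.array([[1.0, -0.5], [0.3, 1.0]])
C = np.array([[2.0, 1.0, 0.0], [0.0, 0.0, 1.5]])
Y = np.linalg.solve(A, C @ X.T).T   # solves A y_t = C x_t for every t

equations = [(0, [1], [0, 1]), (1, [0], [2])]
estimates = two_sls_system(Y, X, equations)
```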

The 2nd v. of 2SLS (inverse decomposition)
Solve an equation where the proxy variables have been substituted before (they are calculated at the beginning)

$$
y_j = \alpha_0 + \alpha_{1j}\hat{y}_1 + \dots + \alpha_{mj}\hat{y}_m + \gamma_{1j} x_1 + \dots + \gamma_{kj} x_k + \varepsilon
$$

The set of endogenous variables of the equation is $\hat{Y}_1$ and $X_1$ is the set of predetermined variables, so the variables of the equation form the matrix $[X_1\ \hat{Y}_1]$, and

$$
\left([X_1\ \hat{Y}_1]'\,[X_1\ \hat{Y}_1]\right)^{-1} [X_1\ \hat{Y}_1]'\, y_j
$$

must be solved.

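The 2nd v. of 2SLS (inverse decomposition)
The inverse:

$$
\begin{pmatrix} X_1'X_1 & X_1'\hat{Y}_1 \\ \hat{Y}_1'X_1 & \hat{Y}_1'\hat{Y}_1 \end{pmatrix}^{-1}
=
\begin{pmatrix} (X_1'X_1)^{-1} & 0 \\ 0 & 0 \end{pmatrix}
+
\begin{pmatrix} -(X_1'X_1)^{-1}X_1'\hat{Y}_1 \\ Id \end{pmatrix}
\left(\hat{Y}_1'\hat{Y}_1 - \hat{Y}_1'X_1(X_1'X_1)^{-1}X_1'\hat{Y}_1\right)^{-1}
\left(-\hat{Y}_1'X_1(X_1'X_1)^{-1},\ Id\right)
$$

Using

$$
\begin{pmatrix} A & B \\ B' & D \end{pmatrix}^{-1}
=
\begin{pmatrix} A^{-1} & 0 \\ 0 & 0 \end{pmatrix}
+
\begin{pmatrix} -A^{-1}B \\ Id \end{pmatrix}
\left(D - B'A^{-1}B\right)^{-1}
\left(-B'A^{-1},\ Id\right)
$$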
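The block-inverse identity can be verified numerically. The sketch below implements it for a symmetric partitioned matrix and compares the result with a direct inverse; the function name `block_inverse` and the random test matrices are assumptions made for illustration.

```python
import numpy as np

def block_inverse(A, B, D):
    """Inverse of [[A, B], [B', D]] using the Schur complement D - B'A^{-1}B."""
    k, m = B.shape
    A_inv = np.linalg.inv(A)
    A_inv_B = A_inv @ B
    S_inv = np.linalg.inv(D - B.T @ A_inv_B)       # inverse of the Schur complement
    left = np.vstack([-A_inv_B, np.eye(m)])        # (k+m) x m block column
    right = np.hstack([-B.T @ A_inv, np.eye(m)])   # m x (k+m) block row
    base = np.zeros((k + m, k + m))
    base[:k, :k] = A_inv                           # [[A^{-1}, 0], [0, 0]]
    return base + left @ S_inv @ right

# Random stand-ins for X_1 (n x k) and the proxies of Y_1 (n x m).
rng = np.random.default_rng(3)
X1 = rng.normal(size=(50, 4))
Y1_hat = rng.normal(size=(50, 3))
Z = np.hstack([X1, Y1_hat])

G_inv = block_inverse(X1.T @ X1, X1.T @ Y1_hat, Y1_hat.T @ Y1_hat)
G_inv_direct = np.linalg.inv(Z.T @ Z)
```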

The 2nd v. of 2SLS (inverse decomposition)
• $X_1'X_1$ is taken from $X'X$
• $(X_1'X_1)^{-1}$ is calculated
• $X_1'\hat{Y}_1$ is taken from $X'\hat{Y}$
• $(X_1'X_1)^{-1}X_1'\hat{Y}_1$ is calculated (cost $2k^2m + \frac{2}{3}k^3$)
• $\hat{Y}_1'X_1(X_1'X_1)^{-1}X_1'\hat{Y}_1$ is calculated (cost $2m^2k$)
• $\hat{Y}_1'\hat{Y}_1$ is taken from $\hat{Y}'\hat{Y}$
• $\left(\hat{Y}_1'\hat{Y}_1 - \hat{Y}_1'X_1(X_1'X_1)^{-1}X_1'\hat{Y}_1\right)^{-1}$ is calculated (cost $\frac{2}{3}m^3$)

The 2nd v. of 2SLS (inverse decomposition)
To calculate $[X_1\ \hat{Y}_1]'\, y_j$:
• $X_1' y_j$ can be taken from $X'Y$, which was calculated to obtain $\Pi$
• $\hat{Y}_1' y_j$ can be taken from $\hat{Y}'Y$
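The reuse of shared products can be sketched as follows: the products over the whole system are formed once, and the normal equations of one equation are assembled purely by indexing into them. This is an illustration under assumed random data; the names are hypothetical, and the assembled system is solved with a generic solver rather than with the block-inverse formula.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, M = 60, 5, 4
X = rng.normal(size=(n, k))   # all predetermined variables (random stand-in)
Y = rng.normal(size=(n, M))   # all endogenous variables (random stand-in)

# Products shared by every equation, computed once at the beginning.
XtX = X.T @ X
XtY = X.T @ Y
Pi = np.linalg.solve(XtX, XtY)
Y_hat = X @ Pi                # proxies
XtYh = X.T @ Y_hat
YhtYh = Y_hat.T @ Y_hat
YhtY = Y_hat.T @ Y

def solve_equation(j, pred_idx, endo_idx):
    """Assemble and solve the normal equations of equation j by slicing."""
    A = XtX[np.ix_(pred_idx, pred_idx)]     # X_1'X_1 taken from X'X
    B = XtYh[np.ix_(pred_idx, endo_idx)]    # X_1'Yhat_1 taken from X'Yhat
    D = YhtYh[np.ix_(endo_idx, endo_idx)]   # Yhat_1'Yhat_1 taken from Yhat'Yhat
    G = np.block([[A, B], [B.T, D]])
    rhs = np.concatenate([XtY[pred_idx, j], YhtY[endo_idx, j]])
    return np.linalg.solve(G, rhs)

pred_idx, endo_idx, j = [0, 2], [1, 3], 0
theta_shared = solve_equation(j, pred_idx, endo_idx)

# Reference: build the regressor matrix explicitly and apply OLS.
Z = np.hstack([X[:, pred_idx], Y_hat[:, endo_idx]])
theta_direct = np.linalg.solve(Z.T @ Z, Z.T @ Y[:, j])
```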

The 2nd v. of 2SLS (inverse decomposition)
Finally, the algorithm is: [algorithm listing not included in this transcript]

The 3rd v. of 2SLS (QR decomposition)
X is decomposed as QR using the Householder method, where Q is orthogonal and R is upper triangular.
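How the slides' algorithm uses the factorization is not recoverable from this transcript, so the sketch below only shows the standard way a QR decomposition replaces the normal equations in a least-squares solve: with X = QR, X'X = R'R and the estimate is R^{-1} Q'y. It relies on `numpy.linalg.qr`, which calls LAPACK's Householder-based routines; the function name and data are illustrative assumptions.

```python
import numpy as np

def ols_qr(X, y):
    """Least-squares solution from the reduced QR factorization X = QR."""
    Q, R = np.linalg.qr(X)   # Q: n x k with orthonormal columns, R: k x k upper triangular
    # Solve R b = Q'y. A dedicated triangular solver would exploit the
    # structure of R; a general solve is used here to stay within NumPy.
    return np.linalg.solve(R, Q.T @ y)

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 4))
y = rng.normal(size=100)

beta_qr = ols_qr(X, y)
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)   # normal equations, for comparison
```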

The 3rd v. of 2SLS (QR decomposition)
The algorithm is: [algorithm listing not included in this transcript]

Computer System
• Kefren: a cluster of 20 dual-processor Pentium Xeon 2 GHz nodes interconnected by an SCI network with a Bull 2D topology in a 4 × 5 mesh. Each node has 1 GB of RAM.
• Marenostrum: a supercomputer based on PowerPC processors, BladeCenter architecture, a Linux system and a Myrinet interconnection. Its main characteristics are: 10240 IBM PowerPC 970MP processors at 2.3 GHz (2560 JS21 blades), 20 TB of main memory, 280 + 90 TB of disk storage and a peak performance of 94.21 teraflops. At the time of the presentation, Marenostrum was the most powerful supercomputer in Europe and the fifth in the world, according to the latest TOP500 list.

Experimental results (figures not included in this transcript)
• The first version of 2SLS (two slides)
• The 2nd v. of 2SLS (inverse decomposition) (two slides)
• The 3rd v. of 2SLS (QR decomposition)
• Comparison between the three techniques

Comparison of the precisions between the three techniques

Endogenous  Exogenous  Sample  dif. Inv-QR   dif. Inv-Normal
500         200        500     9.13657E-12   9.08442E-12
1000        400        500     3.00996E-12   3.00927E-12
1000        400        1000    4.65709E-13   4.64279E-13
1500        600        1000    2.13886E-08   2.18451E-08
1500        600        1500    2.63E-09      2.49918E-09
2000        800        1500    7.81023E-12   7.78951E-12
2000        800        2000    2.7896E-12    2.79031E-12

Conclusions and Future works
Conclusions
• Solving a Simultaneous Equations Model sometimes requires special software and High Performance Systems.
• Tools will be made freely available to the scientific community.
Future works
• Application to real problems.
• Develop an algorithm to find the best model.
