AmpTools a modular library to enable amplitude analysis
Matthew Shepherd Indiana University May 4, 2019 Joint GlueX and PANDA Workshop Ashburn, VA
AmpTools a modular library to enable amplitude analysis Matthew - - PowerPoint PPT Presentation
AmpTools a modular library to enable amplitude analysis Matthew Shepherd Indiana University May 4, 2019 Joint GlueX and PANDA Workshop Ashburn, VA Motivation Needed a general framework to handle amplitude analysis for CLEO-c, BES
Matthew Shepherd Indiana University May 4, 2019 Joint GlueX and PANDA Workshop Ashburn, VA
GlueX/PANDA Workshop May 4, 2019
amplitude analysis for CLEO-c, BES III, and GlueX
2
GlueX/PANDA Workshop May 4, 2019
amplitude analysis for CLEO-c, BES III, and GlueX
constraints
phenomenological models
colleagues
2
GlueX/PANDA Workshop May 4, 2019
amplitude analysis for CLEO-c, BES III, and GlueX
constraints
phenomenological models
colleagues
needs rather than the desire to produce a one- size-fits-all software package (like ROOT)
2
GlueX/PANDA Workshop May 4, 2019
3 On Feb 1, 2006, at 11:08 PM, Ryan Mitchell wrote: Hi, Another keyfile is attached. To fix parameters this version uses an "x" instead of "*". Names in {} are added, which can come before or after the amplitude is specified (but all before the ";").
GlueX/PANDA Workshop May 4, 2019
3 On Feb 1, 2006, at 11:08 PM, Ryan Mitchell wrote: Hi, Another keyfile is attached. To fix parameters this version uses an "x" instead of "*". Names in {} are added, which can come before or after the amplitude is specified (but all before the ";"). On May 27, 2009, at 4:51 PM, Hrayr Matevosyan <hmatevosyan@gmail.com> wrote: Hi Guys, The GPU code now works with the fitter in both single and double
anything to CVS, but I attach the output of the runs over 1.8 mil points with CPU and GPU. CPU takes ~ 2.7 sec for each fit iteration, where GPU in double precision ~ 0.006 sec.
GlueX/PANDA Workshop May 4, 2019
4
GlueX/PANDA Workshop May 4, 2019
5
GlueX/PANDA Workshop May 4, 2019
(other than quantum mechanics)
would exceed RAM on one machine
5
GlueX/PANDA Workshop May 4, 2019
set with N events written as:
6
L(θ) = e−µµN N!
N
Y
i=1
P(xi; θ),
GlueX/PANDA Workshop May 4, 2019
set with N events written as:
6
L(θ) = e−µµN N!
N
Y
i=1
P(xi; θ),
P(x; θ) = 1 µI(x; θ)η(x).
GlueX/PANDA Workshop May 4, 2019
set with N events written as:
6
L(θ) = e−µµN N!
N
Y
i=1
P(xi; θ),
P(x; θ) = 1 µI(x; θ)η(x).
coherent and incoherent sums of scalable, factorizable amplitudes
I(x) = X
σ
α
sσ,αVσ,αAσ,α(x)
Aσ,α(x) =
nσ,α
Y
γ=1
aσ,α,γ(x),
GlueX/PANDA Workshop May 4, 2019
using a background sample whose sum of weights is equal to the expected background
functions are implemented in the interface
7
GlueX/PANDA Workshop May 4, 2019
8
DalitzDataReader::DalitzDataReader( const vector< string >& args ) : UserDataReader< DalitzDataReader >(args), m_eventCounter( 0 ){ assert(args.size() == 1); string inFileName(args[0]); string inTreeName("nt"); TH1::AddDirectory( kFALSE ); gSystem->Load( "libTree" ); ifstream fileexists( inFileName.c_str() ); if (fileexists){ m_inFile = new TFile( inFileName.c_str() ); m_inTree = static_cast<TTree*>( m_inFile->Get( inTreeName.c_str() ) ); } else{ cout << "DalitzDataReader WARNING: Cannot find file... " << inFileName << endl; m_inFile = NULL; m_inTree = NULL; } if (m_inTree){ m_inTree->SetBranchAddress( "EnP1", &m_EnP1 ); m_inTree->SetBranchAddress( "PxP1", &m_PxP1 ); m_inTree->SetBranchAddress( "PyP1", &m_PyP1 ); m_inTree->SetBranchAddress( "PzP1", &m_PzP1 ); m_inTree->SetBranchAddress( "EnP2", &m_EnP2 ); m_inTree->SetBranchAddress( "PxP2", &m_PxP2 ); m_inTree->SetBranchAddress( "PyP2", &m_PyP2 ); m_inTree->SetBranchAddress( "PzP2", &m_PzP2 ); m_inTree->SetBranchAddress( "EnP3", &m_EnP3 ); m_inTree->SetBranchAddress( "PxP3", &m_PxP3 ); m_inTree->SetBranchAddress( "PyP3", &m_PyP3 ); m_inTree->SetBranchAddress( "PzP3", &m_PzP3 ); m_inTree->SetBranchAddress( "weight", &m_weight ); } } Kinematics* DalitzDataReader::getEvent(){ if( m_eventCounter < numEvents() ){ m_inTree->GetEntry( m_eventCounter++ ); vector< TLorentzVector > particleList; particleList.push_back( TLorentzVector( m_PxP1, m_PyP1, m_PzP1, m_EnP1 ) ); particleList.push_back( TLorentzVector( m_PxP2, m_PyP2, m_PzP2, m_EnP2 ) ); particleList.push_back( TLorentzVector( m_PxP3, m_PyP3, m_PzP3, m_EnP3 ) ); return new Kinematics( particleList, m_weight ); } else{ return NULL; } }
strings that come from the configuration file
vectors in a specific order
GlueX/PANDA Workshop May 4, 2019
strings from configuration file
from an array of four vectors
method to do expensive calculations common to all events when parameters change
storing expensive quantities that are fixed for every fit iteration
9
BreitWigner::BreitWigner( const vector< string >& args ) : UserAmplitude< BreitWigner >(args) { assert( args.size() == 4 ); m_mass = AmpParameter(args[0]); m_width = AmpParameter(args[1]); m_daughter1 = atoi(args[2].c_str()); m_daughter2 = atoi(args[3].c_str()); // need to register any free parameters so the framework knows about them registerParameter( m_mass ); registerParameter( m_width ); } complex< GDouble > BreitWigner::calcAmplitude( GDouble** pKin ) const { TLorentzVector P1(pKin[m_daughter1-1][1], pKin[m_daughter1-1][2], pKin[m_daughter1-1][3], pKin[m_daughter1-1][0]); TLorentzVector P2(pKin[m_daughter2-1][1], pKin[m_daughter2-1][2], pKin[m_daughter2-1][3], pKin[m_daughter2-1][0]); return complex<GDouble>(1.0,0.0) / complex<GDouble>((P1+P2).M2() - m_mass*m_mass, m_mass*m_width); }
GlueX/PANDA Workshop May 4, 2019
10
fit dalitz1 reaction dalitz p1 p2 p3 genmc dalitz DalitzDataReader phasespace.gen.root accmc dalitz DalitzDataReader phasespace.acc.root data dalitz DalitzDataReader physics.acc.root normintfile dalitz dalitz1.ni sum dalitz s1 amplitude dalitz::s1::R12 BreitWigner 1.000 0.200 1 2 amplitude dalitz::s1::R13 BreitWigner 1.500 0.150 1 3 initialize dalitz::s1::R12 cartesian 1.0 0.0 initialize dalitz::s1::R13 cartesian 1.0 0.0 real
GlueX/PANDA Workshop May 4, 2019
11
// ************************ // parse the config file // ************************ ConfigFileParser parser(cfgname); ConfigurationInfo* cfgInfo = parser.getConfigurationInfo(); cfgInfo->display(); // ************************ // AmpToolsInterface // ************************ AmpToolsInterface::registerAmplitude(BreitWigner()); AmpToolsInterface::registerDataReader(DalitzDataReader()); AmpToolsInterface ATI(cfgInfo); MinuitMinimizationManager* fitManager = ATI.minuitMinimizationManager(); fitManager->setStrategy(1); fitManager->migradMinimization();
GlueX/PANDA Workshop May 4, 2019
12
fit dalitz3 reaction dalitz p1 p2 p3 genmc dalitz DalitzDataReader phasespace.gen.root accmc dalitz DalitzDataReader phasespace.acc.root data dalitz DalitzDataReader physics.acc.root normintfile dalitz dalitz3.ni sum dalitz s1 amplitude dalitz::s1::R12 BreitWigner [M12] [G12] 1 2 amplitude dalitz::s1::R13 BreitWigner [M13] [G13] 1 3 initialize dalitz::s1::R12 cartesian 1.0 0.0 initialize dalitz::s1::R13 cartesian 1.0 0.0 real parameter M12 1.000 parameter G12 0.200 parameter M13 1.500 parameter G13 0.150
GlueX/PANDA Workshop May 4, 2019
13
GlueX/PANDA Workshop May 4, 2019
intensity, each with about m2 terms, where n is the number of data events and m is the number of amplitudes
single CPU
cost by orders of magnitude
14
GlueX/PANDA Workshop May 4, 2019
intensity, each with about m2 terms, where n is the number of data events and m is the number of amplitudes
single CPU
cost by orders of magnitude
14
GlueX/PANDA Workshop May 4, 2019
(couplings) are free parameters
event and contains no free parameters
analysis”
15
2
Events / 15 MeV/c
0.5 1 1.5 2 2.5 5000 10000 15000 20000 25000 30000
]
2
) [GeV/c π π Mass(
2
Events / 15 MeV/c
0.5 1 1.5 2 2.5 2000 4000 6000 8000 10000 12000 14000 16000 18000
]
2
) [GeV/c π π Mass( (a) 0++ (b) 2++ E1
GlueX/PANDA Workshop May 4, 2019
(couplings) are free parameters
event and contains no free parameters
analysis”
PDF (that numerically integrated using MC) factorize enabling rapid recompilation with each iteration
15
2
Events / 15 MeV/c
0.5 1 1.5 2 2.5 5000 10000 15000 20000 25000 30000
]
2
) [GeV/c π π Mass(
2
Events / 15 MeV/c
0.5 1 1.5 2 2.5 2000 4000 6000 8000 10000 12000 14000 16000 18000
]
2
) [GeV/c π π Mass( (a) 0++ (b) 2++ E1
GlueX/PANDA Workshop May 4, 2019
(couplings) are free parameters
event and contains no free parameters
analysis”
PDF (that numerically integrated using MC) factorize enabling rapid recompilation with each iteration
first fit iteration: acceleration of core library benefits all users!
15
2
Events / 15 MeV/c
0.5 1 1.5 2 2.5 5000 10000 15000 20000 25000 30000
]
2
) [GeV/c π π Mass(
2
Events / 15 MeV/c
0.5 1 1.5 2 2.5 2000 4000 6000 8000 10000 12000 14000 16000 18000
]
2
) [GeV/c π π Mass( (a) 0++ (b) 2++ E1
GlueX/PANDA Workshop May 4, 2019
f2(1270) mass
with each fit iteration
16
f2(1270) contribution
]
2
) [GeV/c Mass( 0.5 1 1.5 2 2.5 3 Events / 15 MeV 2000 4000 6000 8000 10000 12000
J/ψ➞γπ0π0
GlueX/PANDA Workshop May 4, 2019
f2(1270) mass
with each fit iteration
amplitude depends on a parameter in the fit and must be recomputed when the parameter changes
16
f2(1270) contribution
ln L =
Ndata
X
i=1
ln @
Namps
X
α,β
VαV ∗
β Aα(~
xi)Aβ(~ xi)∗ 1 A − 1 N gen
MC Namps
X
α,β
VαV ∗
β
@
N acc
MC
X
j=1
Aα(~ xj)Aβ(~ xj)∗ 1 A
not fixed: depends on parameters
N acc
MC ≈ 10Ndata
]
2
) [GeV/c Mass( 0.5 1 1.5 2 2.5 3 Events / 15 MeV 2000 4000 6000 8000 10000 12000
J/ψ➞γπ0π0
GlueX/PANDA Workshop May 4, 2019
f2(1270) mass
with each fit iteration
amplitude depends on a parameter in the fit and must be recomputed when the parameter changes
time: must provide users with tools to
and parameter change callback functions)
16
f2(1270) contribution
ln L =
Ndata
X
i=1
ln @
Namps
X
α,β
VαV ∗
β Aα(~
xi)Aβ(~ xi)∗ 1 A − 1 N gen
MC Namps
X
α,β
VαV ∗
β
@
N acc
MC
X
j=1
Aα(~ xj)Aβ(~ xj)∗ 1 A
not fixed: depends on parameters
N acc
MC ≈ 10Ndata
]
2
) [GeV/c Mass( 0.5 1 1.5 2 2.5 3 Events / 15 MeV 2000 4000 6000 8000 10000 12000
J/ψ➞γπ0π0
GlueX/PANDA Workshop May 4, 2019
17
Node 1
Node 2
Node n
Master Node
Optional GPU acceleration Finely parallel: typically one CUDA thread per event or amplitude Coarsely parallel:
GPU Parameters and partial sums exchanged over network layer Minimizing algorithm runs here
no constraint on GPU/CPU ratio or number of GPU or CPU per node
GlueX/PANDA Workshop May 4, 2019
by the user — useful templates and interfaces provided:
18
AmpToolsInterface::registerAmplitude(BreitWigner()); AmpToolsInterface::registerDataReader(DalitzDataReader()); AmpToolsInterfaceMPI::registerAmplitude(BreitWigner()); AmpToolsInterfaceMPI::registerDataReader(DataReaderMPI<DalitzDataReader>());
example is provided in the Dalitz tutorial
calculations that can run the GPU
GlueX/PANDA Workshop May 4, 2019
“device” (GPU) and “host” (CPU) code
managed through cudaMalloc(...), cudaMemcpy(...), etc.
calculate their amplitudes on either the GPU or CPU
easier for users to write code
consistent pieces of code: one in CUDA and one C++
19
__global__ void GPUBreitWigner_kernel( GPU_AMP_PROTO, GDouble mass0, GDouble width0, GDouble spin ){ int iEvent = GPU_THIS_EVENT; GDouble dV1[4] = GPU_P4(2); GDouble dV2[4] = GPU_P4(3); GDouble mass = SQ( dV1[0] + dV2[0] ); GDouble mass1 = SQ( dV1[0] ); GDouble mass2 = SQ( dV2[0] ); for( int i = 1; i <= 3; ++i ){ mass -= SQ( dV1[i] + dV2[i] ); mass1 -= SQ( dV1[i] ); mass2 -= SQ( dV2[i] ); } GDouble F = barrierFactor( q, spin ); mass = G_SQRT( mass ); mass1 = G_SQRT( mass1 ); mass2 = G_SQRT( mass2 ); WCUComplex bwTop = { G_SQRT( mass0 * width0 / 3.1416 ), 0 }; WCUComplex bwBot = { SQ( mass0 ) - SQ( mass ), -1.0 * mass0 * width0 }; pcDevAmp[iEvent] = ( F * bwTop / bwBot ); } void GPUBreitWigner_exec( dim3 dimGrid, dim3 dimBlock, GPU_AMP_PROTO, GDouble mass, GDouble width, int spin ) { GPUBreitWigner_kernel<<< dimGrid, dimBlock >>> ( GPU_AMP_ARGS, mass, width, spin ); }
compiled with NVIDIA’s compiler: nvcc, linked into standard C/C++ code
code to compute Breit-Wigner amplitude for one event parallel invocation here, called from core C++ fitting code
GlueX/PANDA Workshop May 4, 2019
Kepler GK110 GPU (4800 MB global memory)
fitting)
20
For small scale GPU applications one can get significant performance/cost using commodity hardware
GlueX/PANDA Workshop May 4, 2019
Kepler GK110 GPU (4800 MB global memory)
fitting)
constraints exist)
20
For small scale GPU applications one can get significant performance/cost using commodity hardware
GlueX/PANDA Workshop May 4, 2019
21
First 17 s of an 8-core parallel fit (the easier one) ln L =
Ndata
X
i=1
ln @
Namps
X
α,β
VαV ∗
β Aα(~
xi)Aβ(~ xi)∗ 1 A − 1 N gen
MC Namps
X
α,β
VαV ∗
β
@
N acc
MC
X
j=1
Aα(~ xj)Aβ(~ xj)∗ 1 A user code: amplitude calculations
AmpTools code AmpTools code
GlueX/PANDA Workshop May 4, 2019
22
First few iterations of a 16 core CPU fit with a floating parameter (the harder one)
ln L =
Ndata
X
i=1
ln @
Namps
X
α,β
VαV ∗
β Aα(~
xi)Aβ(~ xi)∗ 1 A − 1 N gen
MC Namps
X
α,β
VαV ∗
β
@
N acc
MC
X
j=1
Aα(~ xj)Aβ(~ xj)∗ 1 A
user code: amplitude calculations
GlueX/PANDA Workshop May 4, 2019
23
0.1$ 1$ 10$ 100$ 1000$ 1$ 5$ 9$ 13$ 17$ 21$ 25$ 29$ 33$
Fit$Speed$$[Func-on$Call$Rate$(Hz)]$ Number$of$AMD$Cores$or$NVIDIA$K20$GPUs$ CPU$ GPU$ CPU$Strong$Scaling$Limit$ GPU$Strong$Scaling$Limit$
GPU “steps” are a correctable artifact of event padding to the nearest power of 2
“The Easier Fit” big data set with fixed amplitudes
GlueX/PANDA Workshop May 4, 2019
23
0.1$ 1$ 10$ 100$ 1000$ 1$ 5$ 9$ 13$ 17$ 21$ 25$ 29$ 33$
Fit$Speed$$[Func-on$Call$Rate$(Hz)]$ Number$of$AMD$Cores$or$NVIDIA$K20$GPUs$ CPU$ GPU$ CPU$Strong$Scaling$Limit$ GPU$Strong$Scaling$Limit$
GPU “steps” are a correctable artifact of event padding to the nearest power of 2
“The Easier Fit” big data set with fixed amplitudes
0.1$ 1$ 10$ 100$ 1$ 10$ 100$ 1000$
Fit$Speed$$[Func-on$Call$Rate$(Hz)]$
$
Number$of$AMD$Cores$or$NVIDIA$K20$GPUs$
CPU$ GPU$ CPU$Strong$Scaling$Limit$ GPU$Strong$Scaling$Limit$
“The Harder Fit” same big data set with a free parameter in the amplitudes
GlueX/PANDA Workshop May 4, 2019
24
GlueX/PANDA Workshop May 4, 2019
utility objects
compute the intensity for an event has broad utility
calculated intensity
projections of MC to data
key variables
acceleration to speed generation of projections
utilize the toolset — examples provided to follow
25
GlueX/PANDA Workshop May 4, 2019
26
AmpTools enforces standardized class structures for amplitudes, which facilitate code sharing with theorists.
GlueX/PANDA Workshop May 4, 2019
27
GlueX/PANDA Workshop May 4, 2019
28
GlueX/PANDA Workshop May 4, 2019
defining the location of AMPTOOLS, AMPPLOTTER, and DALITZ in your environment
some executables
28
GlueX/PANDA Workshop May 4, 2019
defining the location of AMPTOOLS, AMPPLOTTER, and DALITZ in your environment
some executables
decays,” Phys. Rev. D 92, 052003 (2015).
28
GlueX/PANDA Workshop May 4, 2019
fits
uncertainties is sometimes very bad, especially for the tails of the confidence interval — consider MINUIT or MINOS errors a suggestion
29
ψ J/ π
g
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
ψ J/ π
/g
DD*
g
10 20 30 40 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 (a)
Once troublesome parameters are identified, one can examine how the likelihood depends on these parameters.
GlueX/PANDA Workshop May 4, 2019
fits
uncertainties is sometimes very bad, especially for the tails of the confidence interval — consider MINUIT or MINOS errors a suggestion
data generated to match real data
sampling with replacement) to explore uncertainties on fit parameters
29
ψ J/ π
g
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
ψ J/ π
/g
DD*
g
10 20 30 40 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 (a)
Once troublesome parameters are identified, one can examine how the likelihood depends on these parameters.
GlueX/PANDA Workshop May 4, 2019
30
GlueX/PANDA Workshop May 4, 2019
30
GlueX/PANDA Workshop May 4, 2019
that the user needs to develop
installations
30
GlueX/PANDA Workshop May 4, 2019
that the user needs to develop
installations
30
GlueX/PANDA Workshop May 4, 2019
implement arbitrary amplitudes in a robust and optimized way
collaborations
pressing physics needs rather than software ideas)
31
GlueX/PANDA Workshop May 4, 2019
implement arbitrary amplitudes in a robust and optimized way
collaborations
pressing physics needs rather than software ideas)
examples and talking with developers is easy (a reference manual is under active development)
problem
31