Lecture 1.1 What is PAT and How to use it? Content A short reminder - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Lecture 1.1: What is PAT and How to use it?

PAT Tutorial June 2010

Content

  • A short reminder of the CMS EDM and Analysis Workflow
  • The answer to the question: What is PAT?
  • An introduction to the PAT DataFormats
  • Configuration of the PAT DataFormats
  • An introduction to the PAT Workflow
  • Support and Documentation

slide-2
SLIDE 2

Reminder of the Event Data Model

  • Configurable edm::Modules communicate with/via the EventContent
  • Same file structure (i.e. ROOT) for: Gen-Sim-Digi-Reco-Analysis
  • Single framework for Reconstruction (POGs) and Analysis (PAGs)
slide-3
SLIDE 3

Typical CMS Analysis Workflow

  • Prompt reconstruction at Tier-0.
  • Central skims at Tier-1's.
  • Users run cmsRun at Tier-2's:
      • Perform high level analysis steps.
      • Preselect events.
      • Write their own user-defined EventContent to private T2/T3 space.
  • The latter step might be iterated.
  • Copy reduced datasets to your favorite machine.
  • Run your final analysis/produce plots.

PAT helps you to create a user-defined EventContent.

slide-4
SLIDE 4

What is the Physics Analysis Toolkit

PAT is a toolkit as part of the CMSSW framework.

  • It serves as a well-tested and supported common ground for group and user analyses.
  • It facilitates reproducibility and comprehensibility of analyses.
  • It is an interface between the sometimes complicated EDM and the simple mind of the common user.
  • You can view it as a common language between CMS analysts: if another CMS analyst describes a PAT analysis to you, you can easily know what he/she is talking about.

slide-5
SLIDE 5

Three Aspects of PAT

Interface

  • between RECO expertise & Analysis Level ('vertical integration')
  • crossing point between POGs & PAGs
  • simplifies access via DataFormats
  • channels expertise (via POG & PAG contacts)

Common Tool

  • quick start into analysis for beginners
  • approved algorithms & sensible defaults
  • synergy (everybody can profit from recent developments)

Common Format

  • facilitates transfer & comparisons
  • sustained provenance
  • PAG common configurations
slide-6
SLIDE 6

Facilitated Access to Event Information

  • Do you know how to access this event information within the EDM?

Information associated with a reco::Candidate:

  • Isolation (different from defaults)
  • Object Id
  • Cluster shapes
  • BTag Algorithms
  • TagInfos
  • Associated Tracks
  • JetCharge
  • JetFlavor
  • Generator Match
  • Trigger Match
  • Correction Factors
  • Object Resolutions
  • More, ...

  • With PAT Candidates you get this just by calling member functions!
  • Note: Each PAT Candidate IS a corresponding reco::RecoCandidate (and more)
slide-7
SLIDE 7

The PAT Data Formats

  • A PAT Candidate is a reco::RecoCandidate PLUS more.
  • All pat::Objects inherit from their corresponding reco::RecoCandidates
slide-8
SLIDE 8

PAT Candidate Member Functions

Check the Documentation: SWGuidePATDataFormats
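
As a minimal sketch of what "just calling member functions" can look like, the helper below reads the transverse momentum, pseudorapidity and one b-tag discriminator from a pat::Jet in Python. The collection label `cleanPatJets`, the tagger name `trackCountingHighEffBJetTags` and the function names are assumptions chosen for illustration; check SWGuidePATDataFormats for the members available in your release. The FWLite import sits inside `print_jets` because it only works within a CMSSW environment.

```python
def describe_jet(jet, tagger="trackCountingHighEffBJetTags"):
    """Summarise one jet using plain member-function calls.

    The default tagger name is an assumption; use one that is stored in your tuple.
    """
    return "pt=%.1f eta=%.2f %s=%.2f" % (
        jet.pt(), jet.eta(), tagger, jet.bDiscriminator(tagger))


def print_jets(filename, label="cleanPatJets", max_events=10):
    """Loop over a PAT tuple with FWLite and print a summary of every jet.

    Needs a CMSSW environment; 'label' is an assumed collection name.
    """
    from DataFormats.FWLite import Events, Handle  # only available inside CMSSW

    handle = Handle("std::vector<pat::Jet>")
    for index, event in enumerate(Events(filename)):
        if index >= max_events:
            break
        event.getByLabel(label, handle)
        for jet in handle.product():
            print(describe_jet(jet))
```

Inside a CMSSW environment, `print_jets` would be called with the path of your own PAT tuple; `describe_jet` works on any object that offers the three member functions.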

slide-9
SLIDE 9

Combine Flexibility and User Friendliness

Diagram labels: Flexibility, User Friendliness, Maximal Configuration.

  • You can choose yourself whether you really need all the extra information that the PAT Candidates provide.
  • Still, you don't need to know how EDM/PAT manages this access for you under the hood.
  • The key is: configuration of DataFormats by cfi file! (E.g. for pat::Jets).
slide-10
SLIDE 10

Configuration of PAT DataFormats

You can configure the content of the DataFormats yourself (example: pat::Jet)!

    import FWCore.ParameterSet.Config as cms

    patJets = cms.EDProducer("PATJetProducer",
        ...
        # embedding of AOD items
        embedCaloTowers = cms.bool(False),
        embedPFCandidates = cms.bool(False),
        # jet energy corrections
        addJetCorrFactors = cms.bool(True),
        jetCorrFactorsSource = cms.VInputTag("patJetCorrFactors"),
        # btag information
        addBTagInfo = cms.bool(True),
        addDiscriminators = cms.bool(True),
        discriminatorSources = cms.VInputTag( ... ),
        # clone tag infos ATTENTION: these take lots of space!
        # usually the discriminators from the default algos
        # are sufficient
        addTagInfos = cms.bool(True),
        tagInfoSources = cms.VInputTag( ... ),
        # track association
        addAssociatedTracks = cms.bool(True),
        trackAssociationSource = cms.InputTag("ak5JetTracksAssociatorAtVertex"),
        # jet charge
        addJetCharge = cms.bool(True),
        jetChargeSource = cms.InputTag("patJetCharge"),
        # add jet ID
        addJetID = cms.bool(True),
        jetIDMap = cms.InputTag("ak5JetID"),
        ...
    )

Size: 14 kB/event (for ttbar).
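
One way to change these switches from your own configuration, instead of editing the cfi file, is a small customisation function applied to the process. The sketch below is an illustration only: the function name `slim_pat_jets` is hypothetical, and it assumes the process already contains a module called `patJets` with the parameters shown on this slide. The `demo` object is a plain Python stand-in so that the sketch runs outside CMSSW.

```python
from types import SimpleNamespace


def slim_pat_jets(process):
    """Switch off the space-hungry parts of the pat::Jet content.

    Assumes 'process' has a module named patJets with these parameters.
    """
    jets = process.patJets
    jets.embedCaloTowers = False    # do not embed the calo towers
    jets.embedPFCandidates = False  # do not embed the PF candidates
    jets.addTagInfos = False        # discriminators are usually sufficient
    return process


# Stand-in for a real process object, only used to exercise the function here.
demo = SimpleNamespace(patJets=SimpleNamespace(
    embedCaloTowers=False, embedPFCandidates=False,
    addTagInfos=True, addBTagInfo=True))
slim_pat_jets(demo)
```

In a real configuration the function would be called on the actual process after the PAT sequences have been loaded; the b-tag discriminators stay available because `addBTagInfo` and `addDiscriminators` are left untouched.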

slide-11
SLIDE 11

The PAT Workflow

Have a look at: SWGuidePATWorkflow

  • Pre-Production steps before PAT Candidate creation
  • PAT Candidate creation
  • Main collection (w/o cleaning)
  • Main collection (with cleaning)

The workflow is mirrored by the structure of the python directory in the PatAlgos package (don't be shy, check it out!).

slide-12
SLIDE 12

EventContent of the default PAT Tuple

  • Have a look at patEventContent_cff.py.
  • Have a look at patTemplate_cfg.py.

Size: 20 kB/event (for ttbar)

  • But decide yourself what your PAT Tuple should look like (add reco::Tracks or reco::GenParticles to the EventContent, or BTag information to the jets, etc ... ).
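
The EventContent of an output file is steered by an ordered list of "keep" and "drop" statements, where later statements override earlier ones. The pure-Python sketch below only imitates that logic so you can see which branches survive; it is not CMSSW code. The branch names, the process names and the function `select_branches` are made up for illustration, and shell-style matching on the whole branch name is an approximation of the framework's wildcard handling.

```python
from fnmatch import fnmatchcase


def select_branches(branches, output_commands):
    """Apply 'keep'/'drop' statements in order; the last matching one wins."""
    kept = set()
    for command in output_commands:
        action, pattern = command.split(None, 1)
        if action not in ("keep", "drop"):
            raise ValueError("unknown command: %r" % command)
        matches = {b for b in branches if fnmatchcase(b, pattern.strip())}
        if action == "keep":
            kept |= matches
        else:
            kept -= matches
    # Preserve the original branch order in the result.
    return [b for b in branches if b in kept]


# Illustrative branch names of the form type_label_instance_process.
branches = [
    "patJets_cleanPatJets__PAT",
    "patMuons_cleanPatMuons__PAT",
    "recoTracks_generalTracks__RECO",
    "recoGenParticles_genParticles__SIM",
]
commands = [
    "drop *",                             # start from an empty event
    "keep *_cleanPat*_*_*",               # the cleaned PAT collections
    "keep recoTracks_generalTracks_*_*",  # add the tracks on top
]
selected = select_branches(branches, commands)
print(selected)
```

Adding a further statement such as "keep recoGenParticles_genParticles_*_*" to `commands` would retain the generator particles as well.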

slide-13
SLIDE 13

The concept of Maximal Configuration

  • Configure your own DataFormats via embedding (see Lecture 2.2/Exercise 06).
  • Configure your workflow via tools that PAT provides (see Lecture 2.1/Exercise 05).
  • Apply selections via the StringCutParser.
  • Add any extra info you need to the EventContent.
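
Selections for the StringCutParser are written as plain strings that call member functions of the candidate. A minimal sketch, assuming the process contains a selector module named `selectedPatJets` with a `cut` parameter: the helper name `set_jet_cut` and the thresholds are illustrative assumptions, and `demo` is a stand-in object so that the sketch runs outside CMSSW.

```python
from types import SimpleNamespace


def set_jet_cut(process, min_pt=30.0, max_abs_eta=2.4):
    """Build a cut string and assign it to the jet selector.

    Assumes 'process' has a module named selectedPatJets with a 'cut' parameter.
    """
    cut = "pt > %g & abs(eta) < %g" % (min_pt, max_abs_eta)
    process.selectedPatJets.cut = cut
    return cut


# Stand-in for a real process object, only used to exercise the function here.
demo = SimpleNamespace(selectedPatJets=SimpleNamespace(cut=""))
cut = set_jet_cut(demo)
print(cut)
```

The string is evaluated per object at run time, so changing a selection does not require recompiling any code.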

slide-14
SLIDE 14

The Code Location

DataFormats/PatCandidates

  • Definition of all PAT Candidates.
  • pat::Photon, pat::Electron, pat::Muon, pat::Tau, pat::Jet, pat::MET, ...

PhysicsTools/PatAlgos

  • Implementation and filling of all data formats.
  • Definition of common workflow and PAT tools.

PhysicsTools/PatUtils

  • Definition of common tools and helper functions used in PatAlgos.

PhysicsTools/PatExamples

  • Location of many examples, e.g. all non-trivial examples used during this Tutorial.

slide-15
SLIDE 15

Development

PAT is part of any CMSSW release. We recommend using it from the release!

Have a look at: SWGuidePATRecipes

slide-16
SLIDE 16

Development (cont'd)

If you already want to use features or fixes that will go into the next release, follow the PAT release notes in the corresponding development branch.

slide-17
SLIDE 17

Support

A short extract of possible support:

  • The quite developed PAT Documentation!
  • Hypernews
  • Community
  • POG/PAG contacts
  • Developers
  • Lecturers & Tutors

Check the main entry page of PAT in the software guide: SWGuidePAT

slide-18
SLIDE 18

Documentation

  • SWGuidePAT/WorkBookPAT: Main documentation pages.
  • SWGuidePATRecipes: Installation recipes.
  • WorkBookPATTutorial: Tutorials and examples to get started.
  • WorkBookPATDataFormats: Description of all PAT Candidates.
  • WorkBookPATConfiguration: Description of the configuration of PAT.
  • SWGuidePATEventSize: Tools for event size estimates.
  • SWGuidePATTools: Description of all PAT tools.
  • WorkBookPATWorkflow: Description of the PAT workflow.

And last but not least: This Tutorial and/or former Tutorials...

slide-19
SLIDE 19

Exercises

By now you should be prepared to do the following Exercises on WorkBookPATTutorial. Have Fun!

  • Exercise 1 (WorkBookPATDocNavigationExercise)

The PAT Documentation is one of the most looked after parts of the WorkBook. Knowing your documentation and how to use it can speed up your learning enormously. Learn more about the PAT Documentation and how to make effective use of it.

  • Exercise 2 (WorkBookTupleCreationExercise)

Learn how the default PAT tuple is produced to be prepared to produce your own PAT tuples.

  • Exercise 3 (WorkBookTupleCrapExercise)

This is part of the CRAB tutorial. Once you are doing large-scale analyses you will need CRAB.