

SLIDE 1

Dr Chris Maynard
Application Consultant, EPCC
c.maynard@ed.ac.uk
+44 131 650 5077

Tools for ILDG

SLIDE 2

The need for tools

Tools for ILDG: Lattice 2011, 15/07, Squaw Valley, CA

  • A tool is a device that can be used to produce an item or achieve a task, but that is not consumed in the process
  • The wrong sort of tool can produce poor results, or fail to scale to larger problems

SLIDE 3

Lattice 2009 Beijing, I said …

How do we access our data?

– In the same way we did a decade ago
– secure shell terminal client (ssh) and secure copy protocol (scp)


We really need some tools!

  • Data explosion

– Data volumes: Tbytes, Pbytes soon
– Data complexity: many ensembles, many measurements
– Rise of the mega collaboration
– Globally distributed {machines, data, people}

SLIDE 4

Tools

  • Globus Online (Monday)

– Reliable Data Movement via SaaS, Raj Kettimuthu

  • Web2py (Poster)

– "A new user interface for the Gauge Connection lattice data archive", M. Di Pierro, J. Hetrick, D. Skinner, and S. Cholia
– plus demo after this talk

  • LATFOR grid tools (Dirk Pleiter et al.): ildg-get, web client
  • UKQCD ILDG-browser
  • JLQCD faceted web client
  • Metadata capture project

– EPCC and Tsukuba University
– T. Amagasa, M.G. Beckett, C.M. Maynard, J. Perry, T. Yoshie

SLIDE 5

LATFOR tools

  • ildg-get can access data, metadata, and ILDG services

– need to know LFN, or markovChainURI of the metadata

  • Metadata webclient
  • http://www-zeuthen.desy.de/latfor/ldg/doc/swinstall.html

SLIDE 6

JLDG

  • Faceted browsing
  • http://www.jldg.org/facetnavi/

SLIDE 7

UKQCD ILDG-browser

  • Metadata Catalogue (MDC) GUI client

– Self-contained Java application, runs on Windows/Mac/Linux.

  • Allows users to:

– construct queries to the MDC through a GUI
– search metadata
– store queries
– retrieve metadata

  • Does not have data access

– use the browser to find the Logical File Name (LFN)
– get the data with ildg-get

SLIDE 8

UKQCD ILDG-browser demo

SLIDE 9

Metadata capture

  • The tools described so far are for accessing ILDG services

– they exist and are useful

  • No tools for metadata capture

– Ensuring data provenance is difficult
– are there degrees of provenance?


  • QCD production codes are highly optimised

– run on highly diverse (and bespoke) architectures

  • Require lightweight process to ease pain of post-processing data

Hard Work

SLIDE 10

ETMDC

  • Edinburgh-Tsukuba Metadata Capture project

– T. Amagasa, M.G. Beckett, C.M. Maynard, J. Perry, T. Yoshie

  • Explore workflow as a mechanism for MDC
  • Edinburgh funded by

– OMII-UK
– Software Sustainability Institute
– Edinburgh Global (UoE)

  • End product

– Demonstrator: universal metadata capture tool for ILDG
– Linux/Unix environment
– Python, XSLT, make
– QCD utils
– some hints from QCD code gen

SLIDE 11

MDC design criteria

  • Considered workflow tools

– Metadata generated and manipulated as part of data generation process
– Examples: Kepler, Taverna, Ruby
– QCD ConfGen (Jim Simone’s FNAL group)

  • Complex tools with rich functionality

– Will they run in a bespoke QCD environment?

  • Lightweight is the key criterion

– opted for simplest solution
– build demonstrator out of most commonly available components
– used make to manage dependencies, but could upgrade to Kepler

  • Used two example codes

– JLQCD, CPS
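
The dependency handling that make provides for the demonstrator comes down to a timestamp comparison: a target is regenerated when it is missing or older than any of its inputs. The Python sketch below illustrates that rule only. It is not taken from the demonstrator, and the name `needs_rebuild` is hypothetical.

```python
import os


def needs_rebuild(target, sources):
    """Apply make's basic rule: rebuild if the target is missing
    or any source has a newer modification time than the target."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


if __name__ == "__main__":
    # Hypothetical file names: a configuration XML that depends on a log file.
    print(needs_rebuild("config.1000.xml", []))
```

A workflow engine such as Kepler would replace this rule with an explicit graph of steps, which is why the upgrade path mentioned on the slide is plausible.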

SLIDE 12

Metadata

  • All QCD codes output meaningful metadata

– plus input parameter files
– system size, physical parameters, quark, gluon couplings
– algorithmic parameters, step size
– measured quantities: plaquette, checksums etc.
– state information: user, code version, machine information
– gauge configuration file

  • No scheme for organising this information

– parse and process this information

  • Add some minimal mark-up to information already produced

– some hints for the tool

SLIDE 13

Hints

  • Add simple markup to output

– easy for user to implement
– it's just plain text
– gives tool something to work with

  • simple @ILDG tag for interesting information in plain text files
  • Examples:

@ILDG:codeVersion "v4.0"
@ILDG:checksum 475303070
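
Hints of this kind are easy to pick out of a log with a regular expression. The sketch below assumes the syntax is `@ILDG:<name> <value>` with the value optionally in double quotes, which is inferred from the two examples above. The demonstrator's actual parser may differ, and `extract_hints` is a hypothetical name.

```python
import re

# Assumed hint syntax: "@ILDG:<name> <value>", value optionally double-quoted.
HINT = re.compile(r'@ILDG:(\w+)\s+("([^"]*)"|\S+)')


def extract_hints(text):
    """Collect @ILDG hints from plain text into a dict.

    Values are kept as strings; if a name repeats, the last value wins.
    """
    hints = {}
    for match in HINT.finditer(text):
        name, raw, quoted = match.groups()
        hints[name] = quoted if quoted is not None else raw
    return hints


if __name__ == "__main__":
    log = (
        "trajectory finished\n"
        '@ILDG:codeVersion "v4.0"\n'
        "some other output\n"
        "@ILDG:checksum 475303070\n"
    )
    print(extract_hints(log))
```

Lines without a tag are ignored, so the QCD code can keep writing its usual output around the hints.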

SLIDE 14

User input

  • QCDml Ensemble ID [XML]

– written by human once per ensemble

  • gauge configuration files
  • log files with hints
  • Curator metadata file (CMF)

– says where the data, log files etc. are

  • MDC demonstrator will do the rest!

– Two main components
– Configuration File Generator
– Configuration XML Generator

SLIDE 15

MDC architecture

SLIDE 16

Example CMF


<CMF>
  <Ensemble>
    <EnsembleIDFileName>ensemble1.xml</EnsembleIDFileName>
  </Ensemble>
  <Configuration>
    <ConfigurationUpdateStart>1000</ConfigurationUpdateStart>
    <ConfigurationUpdateStep>10</ConfigurationUpdateStep>
    <ConfigurationUpdateEnd>1230</ConfigurationUpdateEnd>
    <ConfigurationFileName>config.%04</ConfigurationFileName>
    <ConfigurationILDGFileName>configILDG.%04</ConfigurationILDGFileName>
    <ConfigurationPrecisionILDG>64</ConfigurationPrecisionILDG>
  </Configuration>
</CMF>

  • The CMF specifies batch processing of configurations
  • @ILDG:UpdateStart and @ILDG:UpdateEnd delimit information in the log file
  • A format-string-style pattern specifies the file name
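
To show how a tool might consume such a CMF, the Python sketch below reads the example with the standard library and lists the update numbers in the batch. It assumes that `ConfigurationUpdateEnd` is inclusive, which the slide does not state. The file-name pattern is returned verbatim and not expanded, because the slide does not give the full pattern syntax. `read_cmf` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# The example CMF from the slide, embedded so the sketch is self-contained.
CMF_TEXT = """<CMF>
  <Ensemble>
    <EnsembleIDFileName>ensemble1.xml</EnsembleIDFileName>
  </Ensemble>
  <Configuration>
    <ConfigurationUpdateStart>1000</ConfigurationUpdateStart>
    <ConfigurationUpdateStep>10</ConfigurationUpdateStep>
    <ConfigurationUpdateEnd>1230</ConfigurationUpdateEnd>
    <ConfigurationFileName>config.%04</ConfigurationFileName>
    <ConfigurationILDGFileName>configILDG.%04</ConfigurationILDGFileName>
    <ConfigurationPrecisionILDG>64</ConfigurationPrecisionILDG>
  </Configuration>
</CMF>"""


def read_cmf(text):
    """Parse a CMF document and return the batch it describes."""
    root = ET.fromstring(text)
    cfg = root.find("Configuration")
    start = int(cfg.findtext("ConfigurationUpdateStart"))
    step = int(cfg.findtext("ConfigurationUpdateStep"))
    end = int(cfg.findtext("ConfigurationUpdateEnd"))
    return {
        "ensemble_file": root.findtext("Ensemble/EnsembleIDFileName"),
        "pattern": cfg.findtext("ConfigurationFileName"),  # left unexpanded
        # Assumption: the end update is part of the batch.
        "updates": list(range(start, end + 1, step)),
    }


if __name__ == "__main__":
    cmf = read_cmf(CMF_TEXT)
    print(cmf["ensemble_file"], len(cmf["updates"]), "updates")
```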

SLIDE 17

Configuration File Generator

  • Two components

– XSLT transform creates CaPU XML from the Ensemble XML ID and the CMF

  • Conversion and Packing Utility (CaPU)

– specific to collaboration, but has common interface
– converts data to ILDG format
– measures plaquette, CRC checksum etc.
– writes Configuration Information File (CIF) (above + LFN)

  • UKQCD based on qdp++ utility

– if qdp++ can read your data, easy to modify the CaPU

  • JLQCD is shell script + data conversion
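
As an illustration of the checksum step a CaPU performs, the sketch below computes a CRC over a file in fixed-size chunks, so that large gauge configurations need not fit in memory. The slides do not say which CRC variant the CaPU uses. zlib's CRC-32 is an assumption here and may not match the value ILDG tools report. `file_crc32` is a hypothetical name.

```python
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Return the zlib CRC-32 of a file, reading it chunk by chunk."""
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Passing the running value lets zlib continue the same CRC.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

Swapping in a different CRC variant would only change the call inside the loop; the chunked reading stays the same.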

SLIDE 18

Configuration XML Generator

  • Creates the QCDml config ID
  • Several components - Python
  • Extract configuration specific information

– from CMF, CIF and log files

  • Consistency and completeness checker

– Do I have all the information I need?
– Do the sources of metadata agree?
– Am I processing the data I think I am? (Provenance)

  • Include collaboration specific information

– e.g. VML from CPS

  • Write the XML


Example consistency check: calculated plaquette = logfile plaquette
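
A consistency and completeness check of this kind can be sketched in a few lines of Python. The sketch compares values calculated from the data (as a CaPU would report them) with values parsed from the log file. The list of required fields, the tolerance, and the name `check_metadata` are all assumptions made for illustration, not the demonstrator's actual rules.

```python
import math

# Hypothetical set of fields that must appear in at least one source.
REQUIRED = ("plaquette", "checksum", "codeVersion")


def check_metadata(calculated, logged, rel_tol=1e-9):
    """Return a list of problems; an empty list means the two
    sources are complete and agree on every field they share."""
    problems = []
    for key in REQUIRED:
        if key not in calculated and key not in logged:
            problems.append(f"missing: {key}")
    for key in sorted(set(calculated) & set(logged)):
        a, b = calculated[key], logged[key]
        if isinstance(a, float) or isinstance(b, float):
            # Floating-point quantities are compared within a tolerance.
            agree = math.isclose(float(a), float(b), rel_tol=rel_tol)
        else:
            agree = a == b
        if not agree:
            problems.append(f"mismatch: {key} ({a!r} vs {b!r})")
    return problems


if __name__ == "__main__":
    # The plaquette value below is made up for the demonstration.
    calculated = {"plaquette": 0.588, "checksum": 475303070}
    logged = {"plaquette": 0.588, "checksum": 475303070, "codeVersion": "v4.0"}
    print(check_metadata(calculated, logged))
```

A mismatch in the plaquette or checksum suggests the wrong file is being processed, which is the provenance question raised on the slide.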

SLIDE 19

Summary

  • MDC Demonstrator

– Using common Linux/Unix tools/software to build components
– Can automatically post-process data into QCDml

  • Others can use or adapt demonstrator

– simple modifications to output of QCD code
– simple modifications to CaPU

  • Can be downloaded from the ILDG web site

SLIDE 20

Conclusions

  • ILDG – we need tools
  • There are tools out there

– useful!

  • More groups are developing tools
  • If you need help, get in touch
  • Share experiences
  • Neolithic → bronze age

– cross over or 1st order transition?

SLIDE 21

NERSC gauge connection

SLIDE 22
  • http://tests.web2py.com/ildg/default/index
