SLIDE 1

Scientific computing challenges at ESS

  • High event rate >10^7
  • 8 GB/min/instrument average
  • Complex detector geometry
  • Software DAQ solution

SLIDE 2

Mantid development at the ESS

  • Construction phase
  • Core framework development
  • MPI distribution
  • Maintainability
  • Integration with data acquisition & experiment control
  • Contributions to project (4.0 development)
  • Operations (post 2019)
  • Instrument / class functionality development
  • Deployment
  • ESS specific data corrections

SLIDE 3

ESS Mantid usage

  • User workbench for:
    • Data reduction for event mode and histograms
    • Data visualisation & inspection
    • Simple fitting
  • Integration with ECP
  • Integration with cluster-based reduction

SLIDE 4

DMSC organisation: construction phase

  • Data Systems & Technologies (6 FTE): Copenhagen data centre, DMSC servers in Lund, clusters, workstations, disks, parallel file system, database servers, networks (incl. Lund – CPH), data transfer, back-up & archive, external-facing servers, user program software (proposal & scheduling systems)
  • Instrument Data (8 FTE): instrument control user interfaces, live visualisation, data reduction (Mantid)
  • Data Analysis & Modelling (11 FTE): analysis codes, McStas support + development
  • DAQ & Data Management (13 FTE): detector readout, data acquisition, file writers (NeXus), data curation
  • Project admin (3 FTE): project support, budget & schedule, meeting organisation
  • Construction budget: €20 million + SINE2020 (40 person-months) + BrightnESS (23 person-years)

SLIDE 5

DMSC: moving into operations

  • Construction: build capacity and capability to support a user programme
  • Operations: deliver a supported user programme

  • Data Systems & Technologies (10 FTE): storage and compute, data centre operations in Lund and Copenhagen, inter-site connection & network, software deployment, security
  • Data reduction and data analysis (10 FTE + 10 FTE): data reduction & visualisation support and development, data analysis support & development
  • Scientific modelling and simulation (4 FTE): support and integration of MD and a-priori simulation tools
  • Experiment control & data curation (13 FTE): readout, DAQ and control, data management, data curation
  • Project admin (Petra Aulin, 3 FTE): project support, budget & schedule, meeting organisation
  • User office support (4 FTE): development, support and maintenance of the user office solution, user database, proposal system, visit system, sample tracking
SLIDE 6

Data flow from detectors, SE, choppers and motion control to disk, Mantid & analysis (diagram summary; a Kafka sketch follows below):

  • Detector signals pass through the detector data interface (FEA, FE-BE) to event formation unit(s) that perform pixel positioning and emit frames of events plus aggregated metadata.
  • Fast sample environment and motion data, i.e. with a latency that is outside of spec for EPICS, come in via event receiver cards (ERC) and a high-speed environment data interface.
  • Choppers, sample environment and motion axes sit behind control boxes (ChiC), a PLC layer and IOC(s), with time-stamped signals reaching the instrument PV access gateway.
  • Neutron data and metadata are aggregated in Apache Kafka; subscribers include the file writing service (NeXus), a Mantid subscriber and the data catalogue.
  • A Python-based experiment control system & visualisation of detector data handles DAQ control & configuration and DB access.
  • Data reduction (automated reduction, data corrections, visualisation of reduced data) and data analysis codes (visualisation of analysed data) sit behind a reduction/analysis API, with configuration via Kafka and the PFS; outputs are NeXus, processed and analysis files with DOIs exposed via the web.
  • Further inputs: accelerator data, PSS status and the user office DB.
  • Responsibilities span the detector group, DM group, ID group, DAM group, DST group and ICS (ICS software, CCDB, IOC factory, naming, SCR).
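
To make the Kafka-centred part of this flow concrete, here is a minimal sketch using kafka-python: a stand-in event formation unit publishes event frames, and a stand-in subscriber (file writer or Mantid) consumes them. The broker address, topic name and JSON payload are hypothetical; the real pipeline uses its own serialisation and topic layout.

```python
# Minimal sketch of the Kafka-based flow: an event-formation-unit stand-in
# publishes event frames; a subscriber (file writer / Mantid) consumes them.
# Broker address, topic name and the JSON payload are hypothetical.
import json
import time
from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"
TOPIC = "instrument_events"   # hypothetical topic name

def publish_frames(n_frames=10):
    producer = KafkaProducer(bootstrap_servers=BROKER)
    for frame in range(n_frames):
        payload = {
            "frame": frame,
            "pulse_time_ns": time.time_ns(),
            "detector_ids": [1001, 1002, 1003],   # stand-in pixel IDs
            "tofs_us": [1200.5, 980.1, 2310.7],   # stand-in time-of-flight values
        }
        producer.send(TOPIC, json.dumps(payload).encode("utf-8"))
    producer.flush()

def consume_frames():
    consumer = KafkaConsumer(TOPIC, bootstrap_servers=BROKER,
                             auto_offset_reset="earliest",
                             consumer_timeout_ms=5000)
    for message in consumer:
        frame = json.loads(message.value)
        print(f"frame {frame['frame']}: {len(frame['detector_ids'])} events")

if __name__ == "__main__":
    publish_frames()
    consume_frames()
```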

SLIDE 7

ESS data pipeline


BrightnESS is funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 676548

DMSC hardware and instrument hardware (diagram summary):

  • Instrument hardware: motion, choppers, detector backends, MRF timing and EPICS.
  • DMSC hardware, connected over 10 GB/s fibre: event formation units, an EPICS bridge, data aggregation (Kafka), NeXus file writing and experiment control.
  • Live feedback: Mantid reduction (MPI + Kafka listener), Mantid automatic reduction and an instrument view of live data.

SLIDE 8

Proof of concept

  • Streaming a NeXus file through Kafka to Mantid (a consumer sketch follows below)
  • Scalable performance

~4×10^7 events/s
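
On the Mantid side, live reduction from such a stream could look roughly like the following sketch, assuming a Kafka-backed live listener is available; the listener name, instrument and broker address are placeholders and may differ between Mantid versions.

```python
# Minimal sketch of live reduction from a Kafka stream in Mantid.
# Listener name, broker address and instrument are placeholders; check the
# Mantid documentation for the exact live listener available in your version.
from mantid.simpleapi import StartLiveData

StartLiveData(
    Instrument="MY_INSTRUMENT",        # placeholder instrument name
    Listener="KafkaEventListener",     # assumed Kafka-backed live listener
    Address="localhost:9092",          # Kafka broker
    AccumulationMethod="Add",          # keep accumulating incoming events
    UpdateEvery=2,                     # seconds between processed chunks
    OutputWorkspace="live_events",
)
```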

SLIDE 9

The Mantid Project

  • Neutron-specific data treatment framework
  • Standardised beyond data format
  • Event data capable
  • Live view
  • Complete instrument geometry
  • nD data visualisation
  • Data and software curation
  • App-based UI
  • Python interface (a short example follows below)
  • Jupyter notebook
  • MPL graphing
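
A minimal, hypothetical example of the Python interface for an event-mode reduction step; the file name and binning parameters are invented.

```python
# Minimal sketch of Mantid's Python interface: load, convert units, rebin,
# and plot one spectrum with matplotlib. File name and binning are hypothetical.
import matplotlib.pyplot as plt
from mantid.simpleapi import Load, ConvertUnits, Rebin

raw = Load("sample_run.nxs")                      # hypothetical event NeXus file
dspace = ConvertUnits(raw, Target="dSpacing")     # TOF -> d-spacing (elastic)
binned = Rebin(dspace, Params="0.5,0.01,10.0")    # d_min, step, d_max

x, y = binned.readX(0), binned.readY(0)
plt.step(x[:-1], y, where="post")                 # bin edges have one more entry than counts
plt.xlabel("d-spacing (Angstrom)")
plt.ylabel("Counts")
plt.show()
```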

SLIDE 10

Software Sustainability Institute review of Mantid

  • Keep the GUI - improve it
  • Increase maintainability
  • Increase stability & performance

SLIDE 11

Mantid Performance requirements

Requirements:
  • Live data reduction for an event rate of >10^7 events/s
  • Filter good events from bad events (see the sketch below)
  • Capability for handling complex geometries
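
As a rough illustration of filtering good events from bad events, here is a sketch using standard Mantid algorithms; the file name, log name and threshold values are hypothetical and not ESS-specific.

```python
# Minimal sketch of filtering good events from bad events with Mantid algorithms.
# File name, log name and threshold values are hypothetical.
from mantid.simpleapi import Load, FilterBadPulses, FilterByLogValue

events = Load("sample_run.nxs")

# Drop events recorded during weak accelerator pulses (below 95% of the median charge).
good_pulses = FilterBadPulses(events, LowerCutoff=95)

# Keep only events taken while a (hypothetical) sample-environment log was in range.
stable = FilterByLogValue(good_pulses, LogName="temperature",
                          MinimumValue=290.0, MaximumValue=310.0)

print(f"{events.getNumberEvents()} raw events -> {stable.getNumberEvents()} good events")
```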

SLIDE 12

Safety first


The road to success is paved with …

  • Functional safety
  • Developer freedom

SLIDE 13

In collaboration with STFC

HistogramData


  • Started by Simon Heybrock after a bug found in ConvertUnits
  • Improves speed and reduces memory overhead
  • Conceptual correctness and type safety (illustrated below)
  • Introduced in Mantid 3.8 (October 2016)
  • Now rolled out across the Mantid framework, to be distributed as part of Mantid 3.10.0 with a large collaboration effort
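
To illustrate the kind of conceptual error typed histogram data guards against, here is a plain-numpy sketch (not the Mantid HistogramData C++ API): bin edges carry one more entry than counts, and confusing edges with bin-centre points is a classic off-by-one source.

```python
# Plain-numpy illustration of why typed histogram data helps: bin edges have
# one more entry than counts, and edge/point confusion is an easy off-by-one bug.
# This is a conceptual sketch, not the Mantid HistogramData C++ API.
import numpy as np

edges = np.linspace(0.0, 10.0, 11)    # 11 bin edges
counts = np.random.poisson(100, 10)   # 10 counts, one per bin

assert edges.size == counts.size + 1  # the invariant a BinEdges/Counts type pair encodes

# Converting edges to bin-centre points is explicit, so plots and fits line up.
points = 0.5 * (edges[:-1] + edges[1:])
print(points.size, counts.size)       # 10 10
```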

SLIDE 14

Mantid development objectives

  • Mantid has:
    • Geometry
    • Data types
    • Algorithms
  • Create a common MPI implementation
  • Introduce type safety

SLIDE 15

Initial MPI tests

  • MPI more effective than threads
  • A number of ways to achieve load balance
  • We balance on spectra (see the sketch below)
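
A minimal sketch of spectrum-based load balancing with mpi4py; this is illustrative only, not the actual Mantid MPI implementation, and all sizes are invented.

```python
# Minimal sketch of spectrum-based load balancing with mpi4py.
# Illustrative only; the actual Mantid MPI design differs.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

n_spectra = 100_000          # hypothetical total number of spectra
n_bins = 1_000               # hypothetical bins per spectrum

# Round-robin assignment: rank r owns spectra r, r+size, r+2*size, ...
my_spectra = range(rank, n_spectra, size)

# Each rank processes only its own spectra (stand-in for a real reduction step).
local_sum = 0.0
for s in my_spectra:
    counts = np.random.poisson(5.0, n_bins)   # stand-in for detector counts
    local_sum += counts.sum()

# Combine partial results on rank 0.
total = comm.reduce(local_sum, op=MPI.SUM, root=0)
if rank == 0:
    print(f"total counts across {n_spectra} spectra: {total}")
```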

SLIDE 16

MPI bottlenecks, Instrument 2.0

  • Instrument stores geometry and metadata
    • r, t, p, spectrum ID map, isMasked, …
  • Complex detector geometries
    • Described within the framework
  • Current Instrument implementation is rate limiting
  • Parameter map is large / complex and not nice
    • Organically developed


Diagram: pixel ID to spectrum number mapping.

SLIDE 17

Instrument 2.0 refactor

  • Re-engineered parameter map implementation
    • For current & future use cases, MPI use in particular
    • Scanning instruments
  • Cached info layers (see the sketch below)
    • Detector
    • Spectra
    • Mask
  • Cleaner interface for developers
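
As an illustration of how such cached info layers surface in Mantid's Python API, a minimal sketch (assuming a recent Mantid build; the file name is hypothetical):

```python
# Minimal sketch of the cached geometry info layers in Mantid's Python API.
# 'sample_run.nxs' is a hypothetical file; method availability may vary by version.
from mantid.simpleapi import Load

ws = Load("sample_run.nxs")

spec_info = ws.spectrumInfo()    # per-spectrum cached geometry / mask info
det_info = ws.detectorInfo()     # per-detector cached info

n_masked = sum(1 for i in range(spec_info.size()) if spec_info.isMasked(i))
print(f"masked spectra: {n_masked} of {spec_info.size()}")

# Scattering angle of the first non-monitor spectrum.
for i in range(spec_info.size()):
    if not spec_info.isMonitor(i):
        print(f"spectrum {i}: twoTheta = {spec_info.twoTheta(i):.4f} rad")
        break
```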

SLIDE 18

So far, so good…

  • Highlights the cost of refactoring
  • Significant improvements across Mantid
  • ILL can load interlaced scans from D2B into a MatrixWorkspace
  • Direct geometry workflow 2× faster


Charts: runtime for unit conversion; runtime for masking detectors.

SLIDE 19

In collaboration with STFC

Step Scanning Instruments


  • Core framework changes complete as of March 2017
  • Performed as part of a sub-collaboration with the ILL
  • Intended to be shipped as part of Mantid 3.10.0 in June 2017

SLIDE 20

Next steps

  • Production MPI design
    • Matrix workspace
    • MD workspace
  • Integration of DAQ into ECP
  • User experience
    • How will users interact with our DAQ, reduction and analysis systems at ESS?
