Cylc from the NCAS point of view (NCAS Experience with Rose/Cylc)

SLIDE 1

Cylc from the NCAS point of view

(NCAS Experience with Rose/Cylc)

Rosalyn Hatcher, Annette Osprey, NCAS-CMS (Grenville Lister)

Joint final IS-ENES2 workshop on Workflow Solutions in Earth System Modelling and Meta-Data Generation during Experiments, Lisbon, Sept 2016

SLIDE 2

NCAS Rose/Cylc - Outline

  • NCAS-CMS – who we are
  • Workflow - Historical Overview
  • Rose/Cylc

      – Platforms
      – Suites
      – Management
      – Training
      – Future directions

SLIDE 3

NCAS Rose/Cylc – who we are

NCAS Computational Modelling Services - http://cms.ncas.ac.uk

SLIDE 4

NCAS Rose/Cylc - Historical

Diagram labels (as extracted from the slide):

  • Commits/Extracts
  • Working copies
  • Job configuration
  • Job submission
  • Institutional compute/storage
  • Code repositories, trunk mirrored from MO/IPSL
  • UMUI (jobs?)
  • Local to PUMA
  • POST PROCESSING
  • COMPUTE
  • ARCHER, MONSooN, Polaris (Leeds), Mobilis (NOC), HPC Wales
  • JASMIN LOTUS, SCI VMs
  • ARCHER PP, RDF cluster, JASMIN VM
  • MONSooN PP
  • MASS

SLIDE 5

UM software (pre vn9.0)

  • Atmosphere model: dynamical core; physics; diagnostics (STASH); [JULES and/or UKCA]
  • UMUI: database of user jobs; graphical job editor
  • Reconfiguration: prepares initial model state
  • Output file tools: data processing; analysis and visualisation
  • OASIS: coupler
  • Ocean model
  • Input file tools: prepare ancillary data
  • FCM: code manager; compilation and build

Location labels: PUMA, HPC, Local / Jasmin

SLIDE 6

NCAS Rose/Cylc - Historical

  • Discovery – ad hoc
  • Manage code – software engineering tools
  • Configure/reconfigure experiment – manual process
  • Manage job submission
  • Manage job failure/continuation
  • Manage output
  • Post process
  • Archive

How this works is highly dependent on individual users and frequently involves a good deal of manual intervention.
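The stages listed above can be read as a chain of dependencies, which is the kind of ordering a workflow manager such as Cylc handles automatically instead of leaving it to each user. The following is a minimal sketch only: the stage identifiers are hypothetical names derived from the bullet list, and the strictly linear chain is an assumption, not something the slide states.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages that must finish before it can start.
# Stage names are hypothetical; the linear chain is an assumption.
DEPENDENCIES = {
    "discovery": set(),
    "manage_code": {"discovery"},
    "configure_experiment": {"manage_code"},
    "submit_job": {"configure_experiment"},
    "handle_failure_or_continue": {"submit_job"},
    "manage_output": {"handle_failure_or_continue"},
    "post_process": {"manage_output"},
    "archive": {"post_process"},
}


def run_order(dependencies):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for stage in run_order(DEPENDENCIES):
        print(stage)
```

With a purely linear chain there is only one valid order; a real suite would usually have branches that can run in parallel.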

SLIDE 7

NCAS Rose/Cylc - MOSRS

Met Office Shared Repositories (UM, JULES, etc)

PUMA: Local Mirror of Shared Repositories (updated every 5 mins)

  • Make code changes
  • Extract code into Rose suite

Another site: Local Mirror of Shared Repositories

  • Make code changes
  • Extract code into Rose suite

SLIDE 8

NCAS Rose/Cylc

Diagram labels (as extracted from the slide):

  • Commits/Extracts
  • Working copies
  • Job configuration
  • Job submission
  • Institutional compute/storage
  • Code repositories mirrored from MOSRS
  • Rose/Cylc
  • MOSRS
  • Local to PUMA
  • POST PROCESSING
  • COMPUTE
  • ARCHER, MONSooN, Polaris, Mobilis, HPC Wales
  • JASMIN LOTUS, SCI VMs
  • ARCHER PP, RDF cluster, JASMIN VM
  • MONSooN PP
  • MASS

SLIDE 9

UM software (vn10.0 onwards)

  • Atmosphere model: dynamical core; physics; diagnostics (STASH); [JULES and/or UKCA]
  • UMUI / Rose: database of user jobs; graphical job editor
  • Reconfiguration: prepares initial model state
  • Output file tools: data processing; analysis and visualisation
  • OASIS: coupler
  • Ocean model
  • Input file tools: prepare ancillary data
  • MOSRS / FCM: code manager; compilation and build
  • Cylc / Rose: job submission; workflow manager

Location labels: PUMA, HPC, Local / Jasmin

SLIDE 10

NCAS Rose/Cylc - Platforms

  • PUMA (cylc daemons, polling)
  • ARCHER
      – Service nodes
      – RDF Analytics cluster
  • JASMIN
      – jasmin-xfer1
      – jasmin-cylc (MO)
  • JASMIN-Reading (running JULES locally)
  • Polaris (on the way)

MONSooN (MO managed): Lander, Rose VM, Cylc VM, HPC, PP

76 Rose/Cylc users
119 Rose/Cylc users

SLIDE 11

NCAS Rose/Cylc - Suites

  • Many suites!
  • Initial proliferation (support)
  • Greater convergence

Standard Suites (?)

  • GA7 – ACSIS/FEBBRAIO
  • GC3 – HighresMIP
  • GO5
  • NEMOVAR
  • UKESM
  • Nesting

Suite development in production runs
Many moving parts/points of failure

SLIDE 12

NCAS Rose/Cylc - Suites

Post processing

  • postproc: file conversion; move from scratch to RDF; remove from scratch; checksum
  • pptransfer: pull files to jasmin; checksum
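The slide lists a checksum step on both sides of the transfer to JASMIN. The sketch below illustrates the general idea of a checksum-verified file move. It is not the actual postproc or pptransfer code: the function names are hypothetical, a local copy stands in for the remote pull, and SHA-256 is an assumption, since the slide does not say which algorithm is used.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_checksum(source, destination_dir):
    """Copy a file and verify the copy against the source checksum.

    Hypothetical helper: a local copy stands in for the remote transfer.
    """
    source = Path(source)
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)

    expected = sha256_of(source)          # checksum before the transfer
    target = destination_dir / source.name
    shutil.copy2(source, target)          # stand-in for "pull files to jasmin"
    actual = sha256_of(target)            # checksum after the transfer

    if actual != expected:
        raise IOError(f"checksum mismatch for {target}")
    return target
```

Only once the checksums match would it be safe to run a separate clean-up step such as removing the file from scratch.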

SLIDE 13

NEMO data assimilation suite

SLIDE 14
SLIDE 15

NCAS Rose/Cylc - Training

All but retired our UMUI-based training!

  • UMUI Conversion course (Sept 2016, Spring 2017): http://cms.ncas.ac.uk/wiki/UmTraining/RoseSept2016
  • 3-day UM Introduction (November and April) – Rose/Cylc based
  • 5-day UKCA Training (January 2016) – Rose/Cylc based: http://www.ukca.ac.uk/wiki/index.php/UKCA_Training_January_2016

SLIDE 16

NCAS Rose/Cylc – Management …

Management

  • Installation/testing of new releases and upgrades

Consistency across platforms

  • UM, GCOM, other installation/testing – Rose-stem

Still learning about the capabilities of the system

Support

  • Porting suites
  • Troubleshooting/debugging
  • Increased level of infrastructure support, e.g. users writing their own GUIs

SLIDE 17

NCAS Rose/Cylc - Future

  • PUMA – central submission hub

cloud-based PUMA (JASMIN VM)

SLIDE 18

NCAS Rose/Cylc - Future

  • PUMA – central submission hub

cloud-based PUMA (JASMIN VM)

  • UM in the cloud – experiments in AWS ongoing

Rose/Cylc control?

SLIDE 19

NCAS Rose/Cylc - Future

  • PUMA – central submission hub

cloud-based PUMA (JASMIN VM)

  • UM in the cloud – experiments in AWS ongoing

Rose/Cylc control?

  • MIP data workflow

Diagram labels (as extracted from the slide): ARCHER work, MASS, RDF, ET, GWS, MONSooN, PP, RDF -C, LOTUS, VMs