My typical workflow
Jakub Muszyński
6th–7th May 2014
Computer Science and Communications (CSC) Research Unit
UL HPC School 2014

My experiments
I am simulating a P2P protocol. Executions are independent.
Each execution has a set of parameters:
- network size: number of nodes in the network,
- initialization: initial state of the network,
- etc.
Each parameter has a different set of values:
- network size: 500, 1000, ... nodes,
- etc.
For each combination of the parameters, I need X executions.

Implementation
- Done in Java; depends on the GraphStream library (http://graphstream-project.org/).
- Remember to set the Java Virtual Machine options properly.
  → Especially: -d64 -Xms$memoryNeeded -Xmx$memoryNeeded
- State saving is implemented: a simple implementation of the Serializable interface.
- Output is compressed (GZIP) at the application level.

Resources needed: example
The total number of executions can be huge:
- parameters 1 and 2 have 5 values each,
- parameter 3 has 10 values,
- parameter 4 has 20 values,
- parameter 5 has 2 values,
- for each combination of parameters, I need 100 executions.
In total this gives 1,000,000 independent executions.
- Time required for a single execution: from a few minutes to a couple of hours.
- Memory (RAM): up to 4 GB (depending on the problem size).
- Input/output operations: state files, final results.
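The total quoted on this slide can be checked with a few lines of shell arithmetic. This is only a sketch: the variable names are made up for illustration, and the values are the ones listed on the slide.

```shell
# Number of values for parameters 1..5, as listed on the slide.
valuesPerParameter="5 5 10 20 2"
# Executions needed for each combination of parameters.
executionsPerCombination=100

total=$executionsPerCombination
for n in $valuesPerParameter; do
    total=$((total * n))
done
echo "Total executions: $total"
```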

Batches
1 batch = 1 job: X executions grouped by the values of the parameters.
Each batch is created by the configuration script, which:
- creates a directory for the results of the batch (mkdir): ./parameter1_value/parameter2_value/.../parameter5_value
- puts the application configuration there, setting the appropriate parameters (cp and sed),
- creates marker files for the missing executions (touch).
Batches are executed using GNU Parallel (http://www.gnu.org/software/parallel/); see PS2.
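A minimal sketch of such a configuration step, under stated assumptions: the function name create_batch, the template file, the @NETWORK_SIZE@ and @INITIALIZATION@ placeholders, the file name application.conf and the N.missing marker names are all hypothetical. Only two of the parameters are shown, and sed writes the filled-in copy directly instead of a separate cp followed by sed.

```shell
# Hypothetical helper: create one batch directory with its configuration
# and one marker file per execution that still has to run.
create_batch() {
    template=$1; root=$2; networkSize=$3; initialization=$4; executions=$5

    batchDir="$root/networkSize$networkSize/initialization$initialization"
    mkdir -p "$batchDir"

    # Copy the template while filling in the (assumed) placeholders.
    sed -e "s/@NETWORK_SIZE@/$networkSize/" \
        -e "s/@INITIALIZATION@/$initialization/" \
        "$template" > "$batchDir/application.conf"

    # Marker files: 1.missing, 2.missing, ... one per missing execution.
    i=1
    while [ "$i" -le "$executions" ]; do
        touch "$batchDir/$i.missing"
        i=$((i + 1))
    done
}

# Example call (all arguments are illustrative):
#   create_batch ./template.conf ./results 500 random 100
```

In this sketch a marker file would be deleted once its execution has produced final results, so the remaining markers always describe the work that is left.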

Queue
Depending on the current load of the platform:
- default queue (many users/jobs), with state saving:
    - before the end of the walltime, if the execution is not finished.
- besteffort queue (few users/jobs), with state saving:
    - periodically (every X minutes), implemented internally in the application,
    - before the end of the walltime, if the execution is not finished.

Default queue: oarsub options
- -n $jobName
  → If you name the jobs, it is easier to manage them.
- -t idempotent
  → Exit code equal to 99 ⇒ the job is resubmitted with the same parameters.
- -l nodes=1,walltime=$hours
  → The Bash variable hours is set depending on the problem size:

    problemSize=$(echo "$dir" | sed 's/.*networkSize\([0-9]*\).*/\1/')
    hours="2"
    if [ "$problemSize" -ge 500 ]; then
        hours="4"
    fi

- --checkpoint 900 --signal 12
  → 15 minutes before the walltime ends, signal 12 (USR2) is sent.
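The walltime selection can be wrapped in a small function that also fails loudly when a directory name carries no network size. This is a sketch, not the script used in the talk: the function name walltime_hours_for is hypothetical, while the 500-node threshold and the 2 and 4 hour values are taken from the slide.

```shell
# Hypothetical helper: print the walltime (in hours) for a batch directory.
walltime_hours_for() {
    dir=$1
    # -n plus the p flag print the size only if the pattern actually matched.
    problemSize=$(printf '%s\n' "$dir" |
        sed -n 's/.*networkSize\([0-9][0-9]*\).*/\1/p')
    if [ -z "$problemSize" ]; then
        echo "no networkSize found in: $dir" >&2
        return 1
    fi
    if [ "$problemSize" -ge 500 ]; then
        echo 4
    else
        echo 2
    fi
}

# Example: hours=$(walltime_hours_for "./networkSize1000/initializationrandom")
```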

Besteffort queue: oarsub options
Differences:
- Add: -t besteffort
- Change the properties: -l nodes=1/cpu=1,walltime=$hours
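One way to keep the two option sets in a single place is a helper that prints the oarsub options for the chosen queue. This is a sketch under assumptions: the function name oarsub_options and its argument order are hypothetical, and only the flags shown on the slides are used.

```shell
# Hypothetical helper: print the oarsub options for one job.
# Arguments: queue (default|besteffort), job name, walltime in hours.
oarsub_options() {
    queue=$1; jobName=$2; hours=$3
    case $queue in
        default)
            extra=""
            resources="nodes=1,walltime=$hours" ;;
        besteffort)
            extra="-t besteffort "
            resources="nodes=1/cpu=1,walltime=$hours" ;;
        *)
            echo "unknown queue: $queue" >&2
            return 1 ;;
    esac
    echo "-n $jobName -t idempotent ${extra}-l $resources --checkpoint 900 --signal 12"
}

# Example: print the options for a besteffort job named batch42 with 4 hours.
oarsub_options besteffort batch42 4
```

Printing the options instead of calling oarsub directly makes it easy to review a submission before it reaches the queue.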

Job submission script (simplified)
1. Find all directories with missing executions:

    missingDirs=$(find . -iname '*.missing' -printf "%h\n" | sort -u)

2. For each directory:
   - Wait for space in the queue (do not flood it with too many jobs):

    while [ "$(oarstat -u jmuszynski | wc -l)" -ge 32 ]; do
        echo "Waiting 10 minutes to free the queue..."
        sleep 10m
    done

   - Set up the parameters for oarsub, like the variable hours previously.
   - Submit the job:

    oarsub <all_the_parameters_described_previously>
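The two building blocks of this script can be sketched as reusable functions. The names list_missing_dirs and wait_for_queue_space are hypothetical; the directory lookup uses -exec dirname instead of the GNU-only -printf, and the command that lists queued jobs is passed in as arguments so the loop can be tried without OAR.

```shell
# Hypothetical helper: print each directory that still contains marker files.
list_missing_dirs() {
    find "$1" -name '*.missing' -exec dirname {} \; | sort -u
}

# Hypothetical helper: block while the listing command prints maxJobs lines
# or more. Arguments: maxJobs, sleep interval, then the listing command.
wait_for_queue_space() {
    maxJobs=$1; pollInterval=$2; shift 2
    while [ "$("$@" | wc -l)" -ge "$maxJobs" ]; do
        echo "Waiting $pollInterval to free the queue..." >&2
        sleep "$pollInterval"
    done
}

# Example use on the cluster (the threshold of 32 is the one from the slide):
#   for dir in $(list_missing_dirs .); do
#       wait_for_queue_space 32 10m oarstat -u "$USER"
#       oarsub <all_the_parameters_described_previously>
#   done
```

As in the original loop, the count is a plain line count of the listing output, so any header lines printed by the listing command are included in it.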

1 Job = GNU Parallel + checkpointing
Trap the checkpoint signal (defined previously in the oarsub options):

    CHKPNT_SIGNAL=12
    EXIT_UNFINISHED=99

    function checkpointAll {
        # do not start new jobs
        kill -TERM $parallelPID
        # checkpoint running
        for p in $(ps -fu jmuszynski | grep $application \
                   | grep $parallelPID | grep -v parallel \
                   | awk '{ print $2 }'); do
            kill -$CHKPNT_SIGNAL $p
        done
        # wait to finish, quit
        wait $parallelPID
        exit $EXIT_UNFINISHED
    }

    trap "checkpointAll" $CHKPNT_SIGNAL
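The trap-and-exit-99 mechanism can be tried without OAR or GNU Parallel. In the sketch below everything is a stand-in: "sleep 30" plays the role of the running tasks, the script sends USR2 to itself to imitate the scheduler, and the temporary script file exists only for the demonstration.

```shell
# Write a tiny stand-in job script to a temporary file.
jobScript=$(mktemp)
cat > "$jobScript" <<'EOF'
EXIT_UNFINISHED=99

checkpoint() {
    kill -TERM "$workerPID"   # stop the stand-in worker
    wait "$workerPID"         # let it terminate before leaving
    exit "$EXIT_UNFINISHED"   # 99 asks OAR to resubmit an idempotent job
}
trap checkpoint USR2

sleep 30 &                    # stand-in for the real workload
workerPID=$!

kill -USR2 $$                 # imitate the signal sent before the walltime ends
wait "$workerPID"
EOF

# Run the stand-in job and report how it ended.
sh "$jobScript"
exitCode=$?
rm -f "$jobScript"
echo "job exit code: $exitCode"
```

The handler runs as soon as the signal is delivered, so the stand-in job ends through the checkpoint path rather than by finishing its workload.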

GNU Parallel
Run the parallel tasks:

    parallel -j$jobsPossible $application {} ::: $testNumbers &
    parallelPID=$!

Besteffort jobs: WARNINGS
Besteffort jobs CAN BE KILLED AT ANY MOMENT!
- You have to accept some loss of CPU time.
  → Walltime should be SHORT if you do not have state saving.
- ANY moment includes even the state saving itself!
  → Keep two versions of the state: previous and current.
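The two-version rule can be illustrated with a short shell sketch. In the talk the state saving lives inside the Java application, so this is only an illustration of the idea: the function name save_state and the .tmp and .previous suffixes are hypothetical.

```shell
# Hypothetical helper: save the state read from stdin into stateFile while
# keeping the last complete copy as stateFile.previous.
save_state() {
    stateFile=$1
    tmpFile="$stateFile.tmp"

    # Write the new state to a temporary file first, so a kill during the
    # write never damages the existing copies.
    cat > "$tmpFile" || return 1

    # Rotate: the current state becomes the previous one...
    if [ -f "$stateFile" ]; then
        mv -f "$stateFile" "$stateFile.previous"
    fi
    # ...and the complete new state becomes the current one.
    mv -f "$tmpFile" "$stateFile"
}

# Example: echo "iteration=42" | save_state ./execution.state
```

If the job is killed at any point in this sequence, at least one complete state file (current or previous) is still on disk for the restart.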

Besteffort jobs: WARNINGS
About the walltime and the number of jobs
HPC is a shared platform.
  → Use common sense when submitting jobs.
  → Limits are flexible, but avoid misuse.
Max walltime: 9000:00:00
Max number of active jobs per user: 1000

HPC ≠ PC
This means that you should monitor the execution of your jobs (https://hpc.uni.lu/status/ganglia.html), because:
- Failures affect other users.
- Performance issues do too, especially:
    - I/O operations,
    - RAM usage.

Thank you!
