

SLIDE 1

User Support, Campus Integration, OSG XSEDE

Rob Gardner, OSG Council Meeting, June 25, 2015

SLIDE 2
  • 149. Present to Council 1-page document on "Enabling Campus Resource Sharing and use of remote OSG resources" in 15 minutes - Rob Gardner, Frank

SLIDE 3

Enabling Campus Sharing & Use of OSG

  • Clemson helping drive this development
  • Two-track strategy to integrate the Palmetto resource and user community
    ○ Track 1: “light, quick” (see the Bosco sketch below)
      ■ Submit from Palmetto to OSG, and back
      ■ “Quick Connect” → OSG Connect to Palmetto via hosted Bosco service (ssh)
    ○ Track 2: “full OSG capabilities”
      ■ Full HTCondor CE, OASIS+Squid, {StashCache}

Working document: goo.gl/9aNkJs
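A minimal sketch of the Track 1 "Quick Connect" hookup from the hosted submit side, assuming the stock Bosco client; the hostname, account, and batch-system flavor below are assumptions, not values from the slides:

# Register the campus cluster with Bosco over ssh (illustrative values).
bosco_cluster --add osgconnect@login.palmetto.clemson.edu pbs
bosco_start
# HTCondor jobs routed to this resource are then forwarded to Palmetto's
# PBS scheduler over the ssh connection; results flow back the same way.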

SLIDE 4

Track 1: OSG Connect to Clemson-Palmetto

  • Hosted service @ OSG Connect
  • Addressed OrangeFS + Condor file locking

Current opportunistic limit on Palmetto: capped at 500 jobs due to a PBS Pro limitation that prevents Clemson users (non-owners) in the general pool from preempting OSG users. A fix is expected in the next release of PBS Pro so that OSG jobs can claim additional idle cycles.

  • Jobs on Palmetto submitted from login.osgconnect.net

SLIDE 5

Track 1: Submit from Clemson-Palmetto

  • Download the Campus Connect client from GitHub
  • Minutes to submission to OSG (see the sketch below)
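As a concrete illustration of what "minutes to submission" looks like in practice, here is a minimal HTCondor submit session of the kind run from a campus login host; the file names and resource request are illustrative, not taken from the slides:

# Payload script: just report which remote worker the job landed on.
printf '#!/bin/bash\nhostname\n' > hello.sh
chmod +x hello.sh

# Minimal submit description; values are illustrative.
cat > hello.sub <<'EOF'
universe       = vanilla
executable     = hello.sh
output         = hello.$(Cluster).$(Process).out
error          = hello.$(Cluster).$(Process).err
log            = hello.$(Cluster).log
request_memory = 1GB
queue 10
EOF

condor_submit hello.sub   # hand the jobs to the schedd
condor_q                  # watch them match and run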
SLIDE 6

Campus Connect Client

  • Lightweight module to manage submission from a campus login host
  • Heavy lifting done at the hosted schedd
  • In Year 4, extend to reach, monitor, and account for:
    ○ local campus allocation
    ○ XD allocation
    ○ full integration with campus IDM & signup

Evaluating at:

SLIDE 7

Longer term: Hosted Campus CE-ssh

  • Discussions to establish an approach for hosted CE services on behalf of campuses short of manpower
  • Quick(er) on-ramp of a campus HPC cluster without requiring local OSG expertise
  • @ the campus: provide ssh access and local accounts for supported VOs

  • Normal CE operations handled by OSG staff
  • Possible “umbrella CE” for small campuses
SLIDE 8
  • 152. Pay attention to "Sound Bites" that communicate the scale and reach of OSG to outside agencies/projects - Rob G, Bo, Clemmie

SLIDE 9

Open Science Grid: HTC supercomputer

  • 2014 stats
    ○ 67% the size of XD, 35% of Blue Waters
    ○ 2.5 million CPU hours/day
    ○ 800M hours/year
    ○ 125M hours/year provided opportunistically

  • >1 petabyte data xfer/day
  • 50+ research groups
  • thousands of users
  • XD service provider for XSEDE

Rudi Eigenmann, Program Director, Division of Advanced Cyberinfrastructure (ACI), NSF CISE. CASC Meeting, April 1, 2015.

SLIDE 10

Lowering barriers to usability

OSG as a campus research computing cluster

  ★ Login host
  ★ Job scheduler
  ★ Software (modules)
  ★ Storage
  ★ Tools

SLIDE 11

Software & tools on the OSG

  • Distributed software file system
  • Special module command
    ○ identical software on all clusters
  • Common tools & libs
  • Curated on demand, continuously
  • HTC apps in the XSEDE campus bridging yum repo
  • Example session (see the job-script sketch below):

$ switchmodules oasis
$ module avail
$ module load R
$ module load namd
$ module purge
$ switchmodules local

http://goo.gl/TlLq1M
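To show how these module commands are typically used inside a job rather than interactively, here is a sketch of a payload script, assuming the worker node has OASIS mounted via CVMFS; the init path and the analysis.R file are assumptions for illustration:

#!/bin/bash
# Illustrative OSG job payload: load identical software from OASIS on any site.
# The init path below is an assumption about the OASIS modules installation.
source /cvmfs/oasis.opensciencegrid.org/osg/modules/lmod/current/init/bash
module load R
module load namd
Rscript analysis.R   # placeholder for the user's actual work
module purge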

SLIDE 12

Modules now used at most sites

All software accesses via modules are monitored centrally for support purposes

SLIDE 13

User Tools

  • $ tutorial (see the sample session below)
  • $ connect
    ○ on login.osgconnect.net, on campus, or on a laptop
  • $ module (software, all OASIS-enabled sites)
  • $ stash-cp (Stash to job, in development)
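A hypothetical first session with these tools; the subcommand and file names below are assumptions for illustration, since the slide lists only the bare commands:

$ tutorial                           # list available tutorial names (assumed behavior)
$ tutorial quickstart                # copy the quickstart example into ./tutorial-quickstart (assumed)
$ cd tutorial-quickstart
$ condor_submit tutorial01.submit    # submit file name is illustrative
$ condor_q                           # watch the jobs run on OSG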
SLIDE 14

Education and Training assets

  • Helpdesk with community forum and knowledge base
  • GitHub seen as the strategy for formal management of user documentation
    ○ Markdown tutorials → same place as code
    ○ tutorial write-ups track code samples closely
    ○ auto HTML and upload to help desk (in seconds)
  • Expect to announce helpdesk support.opensciencegrid.org this week
SLIDE 15

Code and Markdown managed in GitHub. Content indexed, searchable.

SLIDE 16

User sees personal history of support requests to OSG and can drill down to see the full interaction history. Staff can make private notes, or link to a Jira issue for technical support tracking.

Of course, all of this is also available via email: user-support@opensciencegrid.org. A DM tweet to @osgusers also generates a ticket.

SLIDE 17

Uber-like feedback is collected (except we don’t rate users :)

SLIDE 18
IPython Notebooks

  • Software Carpentry includes a section on scientific programming using Python. IPython Notebook is used for instruction.
  • SWC typically asks users to install IPython on laptops; this is a top source of delays and confusion.
  • In our DHTC edition of SWC, we already have a multiuser server with login accounts that users retain indefinitely.
  • Idea: use this framework to provide a shared IPython, establishing a common baseline for the toolchain.

SLIDE 19

IPython Notebook Service

Developed a platform to launch per-user IPython Notebook servers:

1. User visits http://ipython.osgconnect.net and logs in.

SLIDE 20

IPython Notebook Service

2. Server launches a pre-configured IPython Docker container. Docker provides user and data isolation. Containers can be shut down and re-instantiated on demand (see the sketch below).

3. Within moments, a newly provisioned IPython instance is available. Notebook storage is persistent and accessible via login.osgconnect.net.
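A minimal sketch of the launch step in item 2, under stated assumptions: the image name, published port, and storage path below are placeholders, since the actual container and launcher are not shown on the slide:

# One isolated notebook container per user, with a persistent per-user
# volume that is also reachable from login.osgconnect.net (paths assumed).
docker run -d \
  --name "notebook-${USER}" \
  -p 8888 \
  -v "/stash/user/${USER}/notebooks:/notebooks" \
  jupyter/notebook

# Containers can be shut down and re-instantiated on demand; the notebooks
# survive because they live on the mounted host volume.
docker stop "notebook-${USER}" && docker rm "notebook-${USER}"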

SLIDE 21

Education and Training activities

  • Working with Tim Cartwright and Lauren Michael (ACI-REF) to support the 2015 OSG User School

  • UChicago-Northwestern roundtable (postponed to “Fall”)
  • OSG-SWC @ Duke, October 26-29 (tentative dates)

Joint Software Carpentry & Open Science Grid Workshop at Duke University

Distributed high throughput computation is concerned with using many computing resources, potentially spread over large geographic distances and shared between organizations. These could be university research computing clusters, national leadership-class HPC facilities, or public cloud resources. Incorporating these into science workflows can dramatically benefit your research program. However, getting the most out of these systems requires some knowledge and skill in scientific computation. This workshop extends basic instruction on Linux programming from the Software Carpentry series with concepts and exercises on distributed high throughput computation. Participants will use resources of the Scalable Computing Support Center as well as the Open Science Grid, a national supercomputing-scale high throughput computing facility. There will be experts on hand to answer questions about distributed high throughput computing and whether it is a good fit for your science.

SLIDE 22

Other Campus Outreach Events

  • Internet2 Technology Exchange, October 4-7, Cleveland (formal decision next week)
    ○ Distributed High Throughput Computation: a Campus Roundtable Discussion (Research Track)
  • Rocky Mountain Advanced Computing Consortium, HPC Symposium (Aug 11-13, Boulder)
    ○ 30-minute slot shared with XSEDE

  • XSEDE15, CLUSTER15 (Campus Bridging)
SLIDE 23

OSG as XD Provider to XSEDE

SLIDE 24

OSG XD - Last 12 months

Project Name | PI | Institution | Field of Science | Allocation | Wall Hours
TG-IBN130001 | Donald Krieger | University of Pittsburgh | Biological Sciences | Research | 54,881,313
TG-CHE140110 | John Stubbs | University of New England | Chemistry | Research | 1,047,897
TG-DMR130036 | Emanuel Gull | University of Michigan | Materials Science | Research | 563,106
TG-PHY120014 | Qaisar Shafi | University of Delaware | Physics and astronomy | Research | 309,036
TG-CHE140098 | Paul Siders | University of Minnesota; Duluth | Chemistry | Research | 88,047
TG-CHE130091 | Paul Siders | University of Minnesota; Duluth | Chemistry | Startup | 58,086
TG-MCB140160 | David Rhee | Albert Einstein College of Medicine | Molecular and Structural Biosciences | Startup | 39,517
TG-AST140088 | Francis Halzen | University of Wisconsin-Madison | High Energy Physics | Startup | 30,850
TG-CHE140094 | John Stubbs | University of New England | Chemistry | Startup | 27,057
TG-OCE130029 | Yvonne Chan | University of Hawaii; Manoa | Ocean Sciences | Startup | 22,007
TG-IRI130016 | Joseph Cohen | University of Massachusetts; Boston | Information Robotics and Intelligent Systems | Startup | 20,401
TG-DMR140072 | Adrian Del Maestro | University of Vermont | Materials Science | Startup | 20,179
TG-OCE140013 | Yvonne Chan | University of Hawaii; Manoa | Ocean Sciences | Research | 19,861
TG-AST150012 | Gregory Snyder | Space Telescope Science Institute | Mathematical Sciences | Startup | 18,099
TG-MCB090163 | Michael Hagan | Brandeis University | Molecular and Structural Biosciences | Research | 10,676
TG-DEB140008 | Robert Toonen | University of Hawaii; Manoa | Biological Sciences | Startup | 4,147
TG-TRA130011 | John Chrispell | Indiana University of Pennsylvania | Other | Campus Champions | 1,578
TG-MCB140232 | Alan Chen | SUNY at Albany | Molecular Biosciences | Startup | 598
TG-SEE140006 | Sheila Kannappan | University of North Carolina; Chapel Hill | Physics and astronomy | Educational | 46
TG-CDA100013 | Mark Reed | University of North Carolina; Chapel Hill | Mathematical Sciences | Campus Champions | 6
TG-CCR120041 | Luca Clementi | San Diego Supercomputer Center | Computer and Information Science and Engineering | Startup | 1
Total | | | | | 57,162,509
SLIDE 25

OSG XD: June XRAC Meeting (Nashville)

  • OSG pledges 2M CPU-hours (SUs) per quarter
  • There were 199 requests for XSEDE resources, mostly for Stampede and Comet
  • There were no requests for OSG resources
  • Post-meeting, the following were granted:
    ○ 50k SU to Kettimuthu/ANL (CS: workflow modeling)
    ○ 100k SU to Qin/Spellman (class on gene networks)
    ○ 1.39M SU to Gull/UMich (PHYS: condensed matter)
  • Many NAMD requests
    ○ → start an MD-HTC activity with ACI-REF?

SLIDE 26

Conclusions & Outlook

  • “Clemson on the air”
    ○ Local submit to OSG validated at a scale of 1500 jobs
    ○ Joint use of campus and OSG resources in the same work environment
      ■ Model for other campuses, ACI-REF as channel
    ○ Quick Connect to share resources is functional
  • HTC training materials now formally managed
  • Helping users via XSEDE
    ○ plan detailed studies of common application scaling properties & potential conversion to HTC workflows

SLIDE 27

user-support@opensciencegrid.org
@osgusers