EOS monitoring metrics and user access pattern analysis

SLIDE 1

EOS monitoring metrics and user access pattern analysis Philipp Zigann

CERN IT-DSS-DT

9th Oct. 2012

Zigann (CERN) EOS user access analysis 9th Oct. 2012 1 / 17

SLIDE 2

Outline

1. EOS
2. Targets of Data Analysis
3. Data Acquisition
4. Metrics
5. User Pattern
6. Future Work

SLIDE 4

EOS

  • Exploration of storage
  • Pure disk-based storage
  • In-memory namespace (no DB)
  • Mainly used by physicists during data analysis
  • Developed for fast random file access

SLIDE 6

Targets of Data Analysis

Automated recognition of system anomalies

  • Improves reaction time of system admins
  • Fast error recognition (and therefore faster resolution)

Detection of user access patterns

  • Classification of typical use cases
  • Determination of (in)efficient access patterns
  • Optimize inefficient access

SLIDE 8

Data Acquisition

Xroot built-in monitoring

  • Generates UDP packets for each read/write request
  • Detailed information about single reads/writes
  • EOS is based on xroot
  • Analysed by Domenico Giordano (CERN) et al.

Lemon monitoring

  • System monitoring tool, mainly used in CERN's infrastructure

EOS log file

SLIDE 9

EOS log file

Entry describes what happened between the open and close of a file

log=7677503c-adc7-11e1-9083-003048f0e00c&path=/eos/atlas/atl...Ele.root&ruid=38112&rgid=1307&td=username.12459:127@lxplus309&host=lxfsrg15a07.cern.ch&lid=6291730&fid=45557244&fsid=2246&ots=1338760799&otms=547&cts=1338760890&ctms=654&rb=615562&wb=0&srb=368145245672&swb=0&nrc=70&nwc=0&rt=28.48&wt=0.00&osize=5671631075&csize=5671631075

Parameters

  • File information
  • User identification
  • Number of seeked, written, and read bytes (and the number of calls used)
  • Open and close time
  • Waiting time for I/O

No information about a single read/write call!
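The entry shown above is a flat list of key=value pairs joined by "&", so it can be split into a dictionary with a few lines of Python. The following is only a sketch: the function names are hypothetical, and the reading of ots/otms and cts/ctms as open and close timestamps (seconds plus milliseconds) is an assumption based on the parameter list, not something the slide states explicitly. The "..." in the path is elided in the source and is kept as-is.

```python
# Sample entry copied from the slide; the "..." in the path is elided in the source.
SAMPLE = (
    "log=7677503c-adc7-11e1-9083-003048f0e00c&path=/eos/atlas/atl...Ele.root"
    "&ruid=38112&rgid=1307&td=username.12459:127@lxplus309"
    "&host=lxfsrg15a07.cern.ch&lid=6291730&fid=45557244&fsid=2246"
    "&ots=1338760799&otms=547&cts=1338760890&ctms=654&rb=615562&wb=0"
    "&srb=368145245672&swb=0&nrc=70&nwc=0&rt=28.48&wt=0.00"
    "&osize=5671631075&csize=5671631075"
)


def parse_eos_log_entry(line):
    """Split one log entry into a dict of string values (hypothetical helper)."""
    fields = {}
    for token in line.strip().split("&"):
        token = token.strip()  # tolerate whitespace left over from line wrapping
        if not token:
            continue
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def open_duration_seconds(fields):
    """Time between open and close.

    Assumption: ots/cts are Unix timestamps in seconds and otms/ctms are the
    millisecond parts of the open and close times.
    """
    whole = int(fields["cts"]) - int(fields["ots"])
    millis = int(fields["ctms"]) - int(fields["otms"])
    return whole + millis / 1000.0


if __name__ == "__main__":
    entry = parse_eos_log_entry(SAMPLE)
    print(entry["path"], open_duration_seconds(entry))
```

Subtracting the second and millisecond parts separately avoids losing precision when adding a small fraction to a large timestamp.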

SLIDE 11

Metrics I

Throughput [MB/s]

  • Read plus written bytes divided by the open duration of a file

Reopened Files per Job

  • Number of files reopened during one job

Read Bytes / File Size

  • Read ratio of a request compared to the file size

Written Bytes / File Size

  • Write ratio, which indicates file updates and full (re)writes

Written Bytes / Number of Write Calls

  • Average transfer volume of a write call during a (writing) request
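These metrics can be computed per log entry once the entry has been split into fields. The sketch below makes several assumptions that the slides do not confirm: rb/wb are read and written bytes, nwc is the number of write calls, osize/csize are the file sizes at open and close, one MB is 10^6 bytes, and a "job" is whatever the caller-supplied job_key function says it is (the slides list job identification as open work). All function names are hypothetical.

```python
from collections import Counter

MB = 1_000_000  # assumption: decimal megabytes

# Subset of the fields from the sample log entry on the slide.
SAMPLE_FIELDS = {
    "ruid": "38112", "fid": "45557244",
    "ots": "1338760799", "otms": "547", "cts": "1338760890", "ctms": "654",
    "rb": "615562", "wb": "0", "nrc": "70", "nwc": "0",
    "osize": "5671631075", "csize": "5671631075",
}


def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def open_duration_seconds(f):
    return (int(f["cts"]) - int(f["ots"])) + (int(f["ctms"]) - int(f["otms"])) / 1000.0


def throughput_mb_per_s(f):
    """Read plus written bytes divided by the open duration."""
    return _ratio((int(f["rb"]) + int(f["wb"])) / MB, open_duration_seconds(f))


def read_ratio(f):
    """Read bytes / file size (assumption: size at open, osize)."""
    return _ratio(int(f["rb"]), int(f["osize"]))


def write_ratio(f):
    """Written bytes / file size (assumption: size at close, csize)."""
    return _ratio(int(f["wb"]), int(f["csize"]))


def bytes_per_write_call(f):
    """Average transfer volume of a write call."""
    return _ratio(int(f["wb"]), int(f["nwc"]))


def reopened_files_per_job(entries, job_key):
    """Count, per job, the files that were opened more than once.

    job_key maps an entry to a job identifier; how jobs are really
    identified is an assumption left to the caller.
    """
    opens = Counter((job_key(e), e["fid"]) for e in entries)
    reopened = Counter()
    for (job, _fid), n_opens in opens.items():
        if n_opens > 1:
            reopened[job] += 1
    return dict(reopened)
```

For illustration only, reopened_files_per_job(entries, lambda e: e["ruid"]) would treat every user id as one job, which is almost certainly too coarse for real analysis.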

SLIDE 12

Metrics II

Disk Wait

Time spent waiting for I/O requests divided by the open duration of the file.
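As a rough sketch, the disk wait ratio could be derived from a log entry as follows. This assumes that rt and wt are the read and write waiting times and that ots/otms and cts/ctms give the open and close times; the slides do not state the unit of rt and wt, so the caller has to pass it explicitly. The function name and its parameter are hypothetical.

```python
def disk_wait_ratio(fields, wait_unit_in_seconds):
    """I/O waiting time divided by the open duration of the file.

    wait_unit_in_seconds converts rt/wt to seconds (1.0 if they are already
    seconds, 0.001 if they are milliseconds). The unit is not given on the
    slides, so it is deliberately not defaulted here.
    """
    duration = (int(fields["cts"]) - int(fields["ots"])) + (
        int(fields["ctms"]) - int(fields["otms"])
    ) / 1000.0
    if duration <= 0:
        return None  # file opened and closed within the timestamp resolution
    wait = (float(fields["rt"]) + float(fields["wt"])) * wait_unit_in_seconds
    return wait / duration


# Values from the sample log entry on the slide.
sample = {
    "ots": "1338760799", "otms": "547", "cts": "1338760890", "ctms": "654",
    "rt": "28.48", "wt": "0.00",
}
ratio_if_seconds = disk_wait_ratio(sample, 1.0)
ratio_if_milliseconds = disk_wait_ratio(sample, 0.001)
```

The two results differ by a factor of 1000, which is why the unit needs to be confirmed against the EOS documentation before the ratio is interpreted.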

[Figure: histogram of the disk wait ratio. x-axis: disk wait ratio (0 to 1); y-axis: count (logarithmic, 10^4 to 10^8). Statistics box: Entries 1.13926e+08, Mean 0.04746, Underflow (no value extracted), Overflow 7. Generated 2012-10-08 19:18:07. 2012 CERN - Zigann]

SLIDE 13

Metrics III

Read Bytes / Number of Read Calls [MB/Call]

Average transfer volume of a read call during a request (reading)

[Figure: histogram of read bytes per read call. x-axis: read bytes/read call (200 to 2200, ×10^3); y-axis: count (logarithmic, 1 to 10^8). Statistics box: Entries 1.139261e+08, Mean 8.22e+04, Underflow and Overflow (no values extracted). Generated 2012-10-04 18:36:11. 2012 CERN - Zigann]
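The distribution of read bytes per read call can be binned with a simple fixed-width histogram that tracks underflow and overflow separately, similar in spirit to the statistics box of the plot. This is a minimal pure-Python sketch, not the tooling used for the original figure; the function names are hypothetical, and rb and nrc are assumed to be the read bytes and the number of read calls.

```python
def bytes_per_read_call(fields):
    """Average transfer volume of a read call, or None if nothing was read."""
    n_calls = int(fields["nrc"])
    return int(fields["rb"]) / n_calls if n_calls > 0 else None


def fixed_width_histogram(values, low, high, nbins):
    """Bin values into nbins equal-width bins covering [low, high).

    Returns (counts, underflow, overflow). Values below low are counted as
    underflow, values at or above high as overflow.
    """
    counts = [0] * nbins
    underflow = overflow = 0
    width = (high - low) / nbins
    for value in values:
        if value < low:
            underflow += 1
        elif value >= high:
            overflow += 1
        else:
            # min() guards against floating-point rounding at the upper edge
            index = min(int((value - low) / width), nbins - 1)
            counts[index] += 1
    return counts, underflow, overflow


# Value from the sample log entry on the slide: 615562 bytes in 70 read calls.
sample_value = bytes_per_read_call({"rb": "615562", "nrc": "70"})
counts, underflow, overflow = fixed_width_histogram(
    [sample_value], low=0.0, high=2_200_000.0, nbins=11
)
```

The bin range in the usage lines is only an illustration loosely based on the axis labels of the figure.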

SLIDE 15

User Pattern

File Transfer

  • Accessing a file completely (read or written bytes / file size = 1)
  • No reopens
  • Significant number of MB/call (2 MB, 512 kB or 256 kB)

Event Mixing

  • Mixing one single event with a bunch of other events
  • Low read ratio
  • Large files
  • Many reopens
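The two patterns above could be turned into a simple rule-based classifier. The sketch below is illustrative only: the slides name the criteria but give no thresholds, so every tolerance and cut-off (and the choice of binary units for 2 MB, 512 kB, and 256 kB) is an assumption exposed as a keyword argument. The function name is hypothetical.

```python
# Assumption: the transfer sizes named on the slide are binary multiples.
TYPICAL_TRANSFER_SIZES = (2 * 1024**2, 512 * 1024, 256 * 1024)


def classify_access(
    access_ratio,      # read or written bytes / file size
    reopens,           # number of reopens of the file within the job
    bytes_per_call,    # average bytes per read or write call
    file_size,         # file size in bytes
    *,
    ratio_tolerance=0.01,     # assumed: how close to 1 counts as "complete"
    size_tolerance=0.05,      # assumed: relative match to a typical size
    low_ratio=0.1,            # assumed: upper bound for a "low" read ratio
    large_file=1024**3,       # assumed: files of 1 GiB or more are "large"
    many_reopens=10,          # assumed: lower bound for "many" reopens
):
    """Return 'file transfer', 'event mixing', or 'unclassified'."""
    near_typical_size = any(
        abs(bytes_per_call - size) <= size_tolerance * size
        for size in TYPICAL_TRANSFER_SIZES
    )
    if (
        abs(access_ratio - 1.0) <= ratio_tolerance
        and reopens == 0
        and near_typical_size
    ):
        return "file transfer"
    if (
        access_ratio < low_ratio
        and file_size >= large_file
        and reopens >= many_reopens
    ):
        return "event mixing"
    return "unclassified"
```

Keeping the thresholds as arguments makes it easy to tune them against real log data rather than treating these placeholder values as meaningful.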

SLIDE 17

Future Work

Wanted Information

  • Information about the user's goal (what kind of information they are really looking for)
  • Make vector reads visible
  • Clear concatenation of single events into jobs

Inefficient System Usage

  • Determine it and try to reduce it
  • Adaptation of systems to requirements
  • Adaptation of user behaviour
