DIADS: Addressing the “My-Problem-or-Yours” Syndrome with Integrated SAN and Database Diagnosis
Nedyalko Borisov, Duke University
Shivnath Babu, Duke
Sandeep Uttamchandani, IBM
Ramani Routray, IBM
Aameek Singh, IBM
Current State
Each team has limited
May be infeasible
May have high overhead
Inputs
Poorly performing query
Monitoring data from DBMS
Monitoring data from SAN
Outputs
Root cause of query's poor
Localization of problem
DBMS
Plan-level data (e.g., running
DBMS-level data (e.g., hits in
SAN
Component-level data (e.g., for
Event logs
➢ Which operators have a change in running time that explains
➢ Anomaly Score computed with Kernel Density Estimation (KDE)
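The KDE-based anomaly score can be sketched as follows. This is a minimal illustration, not the DIADS implementation. It assumes the score for an operator is the estimated probability that a running time drawn from satisfactory runs is at most the observed running time. It also assumes a Gaussian kernel with a normal-reference rule-of-thumb bandwidth (1.06 · σ · n^(-1/5)). The function names and sample values are hypothetical.

```python
import math
import statistics


def gaussian_cdf(z):
    """Standard normal CDF, used as the integrated Gaussian kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_anomaly_score(satisfactory_times, observed_time, bandwidth=None):
    """Estimate P(S <= observed_time), where S is the operator's running
    time under satisfactory runs, using a Gaussian-kernel density estimate.

    A score close to 1 means the observed time is slower than almost
    everything seen in satisfactory runs (assumed interpretation).
    """
    n = len(satisfactory_times)
    if n < 2:
        raise ValueError("need at least two satisfactory samples")
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumption of this sketch.
        sigma = statistics.stdev(satisfactory_times)
        bandwidth = 1.06 * sigma * n ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    # The CDF of a KDE is the average of the per-sample kernel CDFs.
    total = sum(
        gaussian_cdf((observed_time - x) / bandwidth)
        for x in satisfactory_times
    )
    return total / n


# Hypothetical running times (seconds) of one operator in satisfactory runs.
good_runs = [10.1, 9.8, 10.4, 10.0, 9.9, 10.2]
slow_score = kde_anomaly_score(good_runs, 14.0)    # far above the history
normal_score = kde_anomaly_score(good_runs, 10.0)  # inside the history
```

Operators whose score exceeds a chosen threshold would be the candidates for "a change in running time"; the threshold itself is a tuning choice not given here.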
KDE picture borrowed from Internet
Handling event (fault) propagation
Codebook (e.g., EMC)
Rules (e.g., Oracle)
Bayesian networks
➢ How are symptoms
➢ How is database populated
➢ How to prevent database
➢ What about missing/extra
➢ Language for expressing complex
Intuitive built-in patterns
Temporal patterns
➢ Currently, by administrators;
➢ Parameterized symptoms and root
➢ Support for partial matching with
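Partial matching against a symptoms database might look like the sketch below. The slide text is cut off, so the scoring rule is an assumption: each root-cause entry lists its expected symptoms, and the match confidence is the percentage of those symptoms that were actually observed. All class, function, symptom, and root-cause names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootCauseEntry:
    """One symptoms-database entry: a root cause and its expected symptoms."""
    name: str
    symptoms: frozenset


def match_confidence(entry, observed):
    """Percentage (0-100) of the entry's symptoms present in `observed`."""
    if not entry.symptoms:
        return 0.0
    return 100.0 * len(entry.symptoms & observed) / len(entry.symptoms)


def rank_root_causes(database, observed, threshold=50.0):
    """Return (confidence, name) pairs at or above `threshold`, best first.

    Entries do not need every symptom to be present, which is what makes
    the match partial; the threshold is an assumed tuning knob.
    """
    scored = [(match_confidence(entry, observed), entry.name)
              for entry in database]
    return sorted((pair for pair in scored if pair[0] >= threshold),
                  reverse=True)


# Hypothetical database and observations.
symptoms_db = [
    RootCauseEntry("san_misconfiguration",
                   frozenset({"volume_latency_up", "config_change_event",
                              "io_operators_slow"})),
    RootCauseEntry("lock_contention",
                   frozenset({"lock_waits_up", "cpu_operators_slow"})),
]
observed_symptoms = {"volume_latency_up", "io_operators_slow"}
ranking = rank_root_causes(symptoms_db, observed_symptoms)
```

Here two of the three symptoms of the hypothetical `san_misconfiguration` entry are present, so it is reported with roughly 67% confidence, while `lock_contention` falls below the threshold and is dropped.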
Impact score (0-100%)
Separating high-impact causes from others
Safeguard against false positives
Identifying presence of false negatives
Reverse dependency analysis: Bottom-up traversal of the
Use of models (DBMS cost models, SAN device models)
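One way to picture an impact score is the sketch below. It is an assumption-laden simplification, not the method from the talk: it treats the impact of a root cause as the share of the plan's total slowdown that falls on operators depending on the faulty component. It uses measured running times only, whereas the slides also mention DBMS cost models and SAN device models. The dependency map, operator and volume labels, and all numbers are made up for illustration.

```python
def impact_score(cause_component, depends_on, baseline, degraded):
    """Return the percentage (0-100) of total plan slowdown attributable
    to operators that depend on `cause_component`.

    depends_on: operator -> set of components (e.g. SAN volumes) it uses
    baseline:   operator -> running time in satisfactory runs
    degraded:   operator -> running time in unsatisfactory runs
    """
    # Reverse dependency step: from the component back up to its operators.
    affected = [op for op, comps in depends_on.items()
                if cause_component in comps]
    # Only count operators that actually got slower.
    slowdown = {op: max(0.0, degraded[op] - baseline[op]) for op in baseline}
    total = sum(slowdown.values())
    if total == 0:
        return 0.0
    return 100.0 * sum(slowdown[op] for op in affected) / total


# Hypothetical plan: three operators read volume V1, one reads volume V2.
depends_on = {"O4": {"V1"}, "O8": {"V1"}, "O22": {"V1"}, "O10": {"V2"}}
baseline = {"O4": 2.0, "O8": 3.0, "O22": 5.0, "O10": 4.0}
degraded = {"O4": 6.0, "O8": 7.0, "O22": 9.0, "O10": 4.5}

v1_impact = impact_score("V1", depends_on, baseline, degraded)
v2_impact = impact_score("V2", depends_on, baseline, degraded)
```

With these made-up numbers the cause on V1 accounts for most of the slowdown and the one on V2 for very little, which is the kind of separation between high-impact and low-impact causes the score is meant to provide.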
TPC-H queries
PostgreSQL
IBM DS6000 storage manager
On production system
SAN misconfiguration
O4, O8, O22
SAN misconfiguration
High score
Concurrent IO
In bursty manner
Query is not affected
V1 misconfiguration –
V2 workload – low
For example: Dageville et al. [VLDB'04]
For example: Genesis [ICDCS'06]
For example: PeerPressure [OSDI'04]
For example: Yemini et al. [IEEE Comm. Magazine '96]
APG: Provides holistic view across DBMS and SAN
Diagnosis workflow: Careful integration of machine
Can succeed where DBMS-only and SAN-only tools fail
Alternative techniques for each module
Automated fix recommendation
Other applications of DIADS, e.g., what-if for SAN changes