
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

slide-4
SLIDE 4


Fault-Diagnostic Best Practices: What Every DBA Must Know About Oracle Database 11g

Mark Ramacher
Director, Server Management
Server Technologies

slide-5
SLIDE 5

Oracle Database 11g R1 Fault Diagnostic Automation

Goal: Reduce Time to Problem Resolution

  • Realistic Testing and Automatic Health Checks
  • Diagnostic Workflow Automation
  • Intelligent Resolution
  • Proactive Patching

(Diagram labels: Prevention; Resolution; Diagnostic Solution Delivery)

slide-6
SLIDE 6

Diagnostic Workflow Automation

slide-7
SLIDE 7

Historic Issues with RDBMS Diagnostic Data

  • No organization
  • DBA must search around for relevant diagnostics to send
  • No catalog of failures
  • Just a text stream (alert log) for history
  • DBA: How healthy has my database been this last quarter?
  • DBA: Have I seen this failure before?
  • Not always sufficient on first failure
  • DBA must reproduce the failure with debug switches
  • Cause of multiple round trips between customer and support
slide-8
SLIDE 8

Historic Issues with RDBMS Diagnostic Data (continued)

  • Unmanaged
  • Grows forever
  • DBA must decide when and which files to delete
  • Unrestrained
  • Floods of data from repeated occurrences of an error
  • DBA must perform emergency space management
slide-9
SLIDE 9

The New World of 11g Diagnostics

  • Organized
  • Diagnostic data is annotated and can be queried and correlated

  • DBA uses automated tool to find failure data
  • Cataloged
  • Automated Problem and incident management
  • DBA can query to see history of failures and which are duplicates

  • First Failure Capture
  • DBA’s work is done after sending initial diagnostic package
slide-10
SLIDE 10

The New World of 11g Diagnostics

  • Managed
  • Auto purging
  • DBAs don’t have to monitor space usage of trace files
  • Constrained
  • Flood control
  • One less worry for a DBA in time of crisis
slide-11
SLIDE 11

Concepts: Problems and Incidents

  • Problems are fundamental code or configuration issues that can cause execution failures

  • They exist until they are corrected, e.g. by patch
  • They are managed to resolution
  • An incident is a single occurrence of a problem
  • Incidents happen at point(s) in time and thus have timestamps
  • Incidents induce diagnostic actions like dumps and traces
  • Incidents are associated with problems by a “problem key”
  • E.g. error code
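
The relationship between problems and incidents can be sketched as a simple grouping: each incident carries a timestamp and a problem key, and all incidents that share a key belong to one problem. This is only an illustrative model, not how ADR stores its metadata. The class name, field names, and the sample problem keys are assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Incident:
    incident_id: int
    problem_key: str       # hypothetical key text, e.g. based on an error code
    occurred_at: datetime  # an incident happens at a point in time


def group_by_problem(incidents):
    """Group incidents under their problem key, oldest occurrence first."""
    problems = defaultdict(list)
    for inc in sorted(incidents, key=lambda i: i.occurred_at):
        problems[inc.problem_key].append(inc)
    return dict(problems)


# Made-up sample data: two occurrences of one problem, one of another.
incidents = [
    Incident(1, "ORA 600 [example_a]", datetime(2008, 1, 5, 10, 0)),
    Incident(2, "ORA 7445 [example_b]", datetime(2008, 1, 6, 9, 30)),
    Incident(3, "ORA 600 [example_a]", datetime(2008, 1, 7, 14, 15)),
]
problems = group_by_problem(incidents)
for key, occurrences in problems.items():
    print(key, [i.incident_id for i in occurrences])
```

With a catalog shaped like this, "Have I seen this failure before?" becomes a lookup by problem key rather than a search through a text log.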
slide-12
SLIDE 12

Automatic Diagnostic Repository (ADR)

  • Stores diagnostic data in a directory hierarchy
  • Holds data concurrently for multiple Oracle products
  • Each product instance has its own diagnostic workspace
  • ADR data is highly structured
  • Formalizes incidents and problems and assigns them IDs
  • Metadata is kept for each incident and problem
  • Incident related diagnostic data is placed in its own directory
  • Alert log and trace files are annotated and can be queried
slide-13
SLIDE 13

Automatic Diagnostic Repository

ADR Base (DIAGNOSTIC_DEST: $ORACLE_BASE, or $ORACLE_HOME/log)
  diag
    rdbms
      DB Name
        SID (ADR Home)
          alert (log.xml)
          cdump
          hm
          incident (incdir_1 … incdir_n)
          incpkg
          metadata
          trace (alert_SID.log)
          (others)

Legacy parameters: BACKGROUND_DUMP_DEST, USER_DUMP_DEST, CORE_DUMP_DEST

Access: ADRCI, V$DIAG_INFO, Support Workbench
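
As a rough illustration of the hierarchy on this slide, the sketch below builds the ADR home path for a database instance from the ADR base, database name, and SID. The base path, database name, and SID used here are invented for the example, and the helper names are hypothetical. On a live system the authoritative locations are the ones reported by V$DIAG_INFO.

```python
from pathlib import PurePosixPath


def adr_home(adr_base, db_name, sid, product="rdbms"):
    """Compose <ADR base>/diag/<product>/<DB name>/<SID>."""
    return PurePosixPath(adr_base) / "diag" / product / db_name / sid


def adr_subdirs(home):
    """Map the subdirectory names shown on the slide to full paths."""
    names = ["alert", "cdump", "hm", "incident", "incpkg", "metadata", "trace"]
    return {name: home / name for name in names}


# Hypothetical values, for illustration only.
home = adr_home("/u01/app/oracle", "orcl", "orcl1")
dirs = adr_subdirs(home)

print(home)                                # ADR home for this instance
print(dirs["alert"] / "log.xml")           # XML alert log
print(dirs["trace"] / "alert_orcl1.log")   # text alert log
```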

slide-14
SLIDE 14

Automatic Diagnostic Repository (ADR)

  • Self-managing
  • Trace files purged after 1 month (configurable)
  • Incident/Problem metadata purged after 1 year (configurable)
  • Note: incidents can be flagged as “don’t purge” to override purging
  • Repeated incidents are flood controlled (5 dumps per hour per problem)

  • Recreates itself as needed
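
The flood-control rule above (5 dumps per hour per problem) can be sketched as a per-problem rate limiter. The slide does not say how the hour is measured, so the sliding one-hour window used here is an assumption, and the class and method names are hypothetical rather than anything exposed by Oracle.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FloodControl:
    """Allow at most max_dumps dumps per problem key within a sliding window."""

    def __init__(self, max_dumps=5, window=timedelta(hours=1)):
        self.max_dumps = max_dumps
        self.window = window
        self._recent = defaultdict(deque)  # problem key -> recent dump times

    def should_dump(self, problem_key, now):
        recent = self._recent[problem_key]
        # Forget dumps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) < self.max_dumps:
            recent.append(now)
            return True
        return False  # incident is still recorded, but no new dump is taken


fc = FloodControl()
start = datetime(2008, 1, 1, 12, 0)
# The same problem recurs every 10 minutes for an hour.
decisions = [
    fc.should_dump("ORA 600 [example]", start + timedelta(minutes=m))
    for m in range(0, 70, 10)
]
print(decisions)
```

In this run the first five occurrences produce dumps, the sixth is suppressed, and the seventh is allowed again because the first dump has aged out of the window.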
slide-15
SLIDE 15

Incident Packaging Service (IPS)

Where there is structure, there can be automation…

  • IPS uses the ADR structure to automate the packaging of diagnostic data
  • Solves the problem of “what needs to be sent”
  • Gathers all relevant diagnostic data for a problem
  • Correlates related incidents to make sure it captures root cause
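
The idea of gathering and correlating can be sketched in a few lines: select every incident of the chosen problem, add other incidents that occurred close in time, and zip their incident directories. This is a toy model under stated assumptions, not the IPS implementation: the five-minute correlation window, the dictionary layout, the file names, and both function names are invented for illustration.

```python
import tempfile
import zipfile
from datetime import datetime, timedelta
from pathlib import Path


def correlated(incidents, problem_key, window=timedelta(minutes=5)):
    """Incidents of the problem, plus other incidents close in time to them."""
    primary = [i for i in incidents if i["problem_key"] == problem_key]
    related = [
        i for i in incidents
        if i["problem_key"] != problem_key
        and any(abs(i["time"] - p["time"]) <= window for p in primary)
    ]
    return primary + related


def build_package(adr_home, incidents, zip_path):
    """Zip the incident directory of every selected incident."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        for inc in incidents:
            incdir = adr_home / "incident" / f"incdir_{inc['id']}"
            for f in sorted(incdir.rglob("*")):
                if f.is_file():
                    zf.write(f, f.relative_to(adr_home).as_posix())
    return zip_path


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "adr_home"
    for n in (1, 2, 3):  # create three fake incident directories
        d = home / "incident" / f"incdir_{n}"
        d.mkdir(parents=True)
        (d / f"example_{n}.trc").write_text("trace data\n")

    t0 = datetime(2008, 1, 1, 12, 0)
    incidents = [
        {"id": 1, "problem_key": "ORA 600 [example]", "time": t0},
        {"id": 2, "problem_key": "ORA 7445 [example]", "time": t0 + timedelta(minutes=2)},
        {"id": 3, "problem_key": "ORA 1578", "time": t0 + timedelta(hours=2)},
    ]
    selected = correlated(incidents, "ORA 600 [example]")
    zip_path = build_package(home, selected, Path(tmp) / "package.zip")
    with zipfile.ZipFile(zip_path) as zf:
        names = sorted(zf.namelist())

print(names)
```

Incident 2 is pulled in because it happened two minutes after incident 1, while incident 3 is left out because it occurred two hours later.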

slide-16
SLIDE 16

Incident Packaging Service (IPS)

(Diagram: IPS takes a Problem ID and ADR data; the steps shown are Correlation, Package BOM, Modify Contents (Add, Delete, Scrub), and Generate Package; the output is a diagnostic Zip File.)

slide-17
SLIDE 17

Incident Packaging Service (IPS)

  • Recommends further diagnostic actions for DBA
  • For example “build SQL test case”
  • Packages structure and metadata so that Oracle-side automation can take place

  • Use of IPS is critical to speed up problem resolution!
slide-18
SLIDE 18

11g Health Monitoring

  • Health Monitoring is designed to help the DBA:
  • Find problems before they impact service availability
  • Determine the scope of a problem
  • Validate that a problem is resolved
  • Provides a number of “health checkers”
  • Dictionary
  • DB structure integrity (control files, data file headers, etc.)
  • Redo log content
  • Undo Segment integrity
  • Data block integrity
slide-19
SLIDE 19

11g Health Monitoring

  • Checkers can be “reactively” activated during incidents
  • Targeted
  • E.g. check integrity of blocks near a corrupted block
  • All checkers can be activated on demand
slide-20
SLIDE 20

First Failure Analysis

  • Dumping the required diagnostic data “out of the box”
  • Reduces round trips with Oracle support
  • Internal logic activates detailed dumps for the given failure circumstances

  • Additional data is detailed but targeted
  • Minimal increase in overall diagnostic data size
slide-21
SLIDE 21

EM Support Workbench

  • Support Workbench reduces the DBA’s diagnostic management to a few clicks

  • Two main entry flows
  • From an incident alert on the DB home page
  • From the Support Workbench home page
  • Support Workbench home page
  • View recent and historical problems
  • View diagnostics packages
  • View health checker findings
slide-22
SLIDE 22

Support Workbench Home

slide-23
SLIDE 23

Support Workbench – Problem Details

  • View and process a problem
  • View “reactive” checker findings for this problem
  • View all incidents of this problem
  • View associated Metalink service request
  • Package diagnostics for the problem
  • Perform guided resolution on the problem
  • Data Repair advisor (link appears only if relevant)
  • SQL Repair advisor
slide-24
SLIDE 24

Problem Details

slide-25
SLIDE 25

Support Workbench - Packaging

  • Two flows that guide you through IPS packaging
  • Quick package
  • Wizard to guide you through the basic packaging steps
  • Cannot modify contents
  • Advanced packaging
  • Content Editing
  • Additional user dumps
  • Automated upload to Oracle*
  • Automated service request creation

* Requires OCM (Oracle Configuration Manager)

slide-26
SLIDE 26

Advanced Packaging

slide-27
SLIDE 27

Intelligent Repair Advisors

  • Data Recovery Advisor
  • Guided expert data recovery system using diagnostic data and health check output
  • SQL Test Case Builder
  • Automatically retrieves exact environment information from ADR to build SQL test cases and replicate SQL issues
  • SQL Repair Advisor
  • Analyzes failing SQL statements to isolate the bug
  • May recommend a SQL Patch as a workaround
slide-28
SLIDE 28

Data Recovery Advisor

  • In an outage, uncertainty and confusion are common

  • Largest part of downtime is:
  • Investigating the problem, planning a solution
  • Data Recovery Advisor
  • Automates investigation, reports all problems
  • Intelligently determines plan for recovery
  • Handles multiple failure situations
  • Presents only feasible recovery options
  • Are there backups, is there a standby?
  • Ranked by repair time and data loss
  • Can automatically apply recovery plan

Time to Repair = Investigation Time + Planning Time + Recovery Time

Reduces downtime by eliminating confusion
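
The "only feasible options, ranked by repair time and data loss" behavior can be pictured as a filter followed by a sort. The sketch below is an assumption-laden illustration, not the advisor's actual logic: the slide does not say which criterion takes precedence, so ranking by data loss first and repair time second is a choice made here, and the option names and numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class RepairOption:
    name: str
    feasible: bool          # e.g. is the required backup or standby available?
    repair_minutes: int     # estimated time to repair
    data_loss_minutes: int  # estimated window of lost data


def rank_options(options):
    """Drop infeasible options, then prefer less data loss and faster repair."""
    feasible = [o for o in options if o.feasible]
    return sorted(feasible, key=lambda o: (o.data_loss_minutes, o.repair_minutes))


# Hypothetical candidates for a single failure.
options = [
    RepairOption("Point-in-time recovery", True, 30, 60),
    RepairOption("Fail over to standby", False, 5, 0),  # no standby exists
    RepairOption("Restore datafile and recover", True, 45, 0),
    RepairOption("Block media recovery", True, 10, 0),
]
ranked = [o.name for o in rank_options(options)]
print(ranked)
```

The standby failover would be the fastest repair, but it is never shown because no standby exists, which is the sense in which only feasible options are presented.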

slide-29
SLIDE 29

SQL Test Case Builder

Business Requirement

  • Bug resolution
  • Test case required for fast bug resolution
  • Not always easy to provide a test case
  • What information should be provided?
  • How much data is needed?
  • Getting the test case to Oracle can be tricky

Solution

  • Oracle automatically creates a test case
  • Collects necessary information relating to a SQL incident
  • Collected data is packaged to be sent to Oracle
  • Collected data allows a developer to reproduce the problem
slide-30
SLIDE 30

SQL Repair Advisor

Solution

  • Advisor
  • Investigates the incident locally
  • Automatically determines the root cause
  • Provides a workaround (SQL Patch) for just the affected SQL
  • If no workaround is found, sends the necessary diagnostic information to Oracle

Business Requirement

  • The most common types of SQL problems (exceptions, performance regressions, etc.) are hard to diagnose

  • A lot of time is spent trying to reproduce the problem
  • If a workaround is found it has to be applied to entire system
slide-31
SLIDE 31

SQL Repair Advisor Flow

  • SQL statement is executed
  • Statement crashes
  • Incident is generated in ADR automatically (with trace files)
  • DBA gets alerted
  • DBA runs SQL Repair Advisor
  • SQL Repair Advisor investigates
  • SQL patch is generated
  • DBA accepts SQL patch
  • SQL statement is patched
  • Statement is executed again and completes successfully

slide-32
SLIDE 32

Automatic Diagnostic Workflow

(Diagram: a critical error is recorded in ADR and raises an alert to the DBA; through Support Workbench the DBA uses checkers, repair, and IPS; IPS produces a zip file for Oracle Support.)

slide-33
SLIDE 33

Summary of Important Changes to 11g Diagnostic Data Management

  • Location of traces and dumps has changed
  • Now located within the ADR file hierarchy
  • Dumps are separated out from traces
  • Format of alert.log has changed
  • New XML format allows easy parsing
  • Old style still available for backward compatibility
  • Automatic space management
  • Incident flood control minimizes number of traces
  • ADR auto-purging reduces disk footprint
  • Use IPS (through Support Workbench) to package diagnostic data sent to Oracle!
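
To illustrate why an XML alert log is easier to process than a text stream, here is a minimal parsing sketch. The sample is hand-written and heavily simplified, not copied from a real log.xml; it assumes the file is a sequence of msg elements, each with a few attributes and a txt child, and no single enclosing root element. The function name and the filtering on the type attribute are likewise assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample; real entries carry more attributes.
SAMPLE = """\
<msg time='2008-01-01T12:00:00.000+00:00' comp_id='rdbms' type='UNKNOWN' level='16'>
 <txt>Starting ORACLE instance (normal)
 </txt>
</msg>
<msg time='2008-01-01T12:05:00.000+00:00' comp_id='rdbms' type='INCIDENT_ERROR' level='1'>
 <txt>Errors in file example.trc (incident=1): ORA-00600: internal error code
 </txt>
</msg>
"""


def parse_alert_xml(text):
    """Parse a sequence of <msg> elements into a list of dictionaries."""
    # Wrap the fragments in a synthetic root so the parser sees one document.
    root = ET.fromstring(f"<alert>{text}</alert>")
    return [
        {
            "time": msg.get("time"),
            "type": msg.get("type"),
            "text": (msg.findtext("txt") or "").strip(),
        }
        for msg in root.iter("msg")
    ]


entries = parse_alert_xml(SAMPLE)
errors = [e for e in entries if e["type"] == "INCIDENT_ERROR"]
for e in errors:
    print(e["time"], e["text"])
```

With the old text format, the same filter would need pattern matching over free-form lines; here it is a comparison on one attribute.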

slide-34
SLIDE 34

Recommended Campground Demos

Location for all demos: Moscone West Exhibit Hall

  • Change Management & Data Masking for DBAs
  • Self-Managing Database: Automatic Application & SQL Tuning
  • Self-Managing Database: Oracle Database 11g SQL Plan Management
  • Self-Managing Database: Automatic Fault Diagnostics
  • Oracle Real Application Testing: Database Replay
  • Oracle Real Application Testing: SQL Performance Analyzer
  • Self-Managing Database: Automatic Performance Diagnostics

slide-35
SLIDE 35