Leveraging DB2 Snapshot Administrative Views

Native DB2 LUW snapshot utilities capture the performance characteristics of a database at the moment they are run, but they are not easily used to identify performance trends over days, weeks and months. There are obvious advantages to being able to compare performance trends over time; furthermore, many snapshot data elements are not meaningful unless observed changing over time. With the advent of snapshot administrative views in DB2 version 9, automating historical performance capture has become easy for Unix database administrators.

This article shows the administrator how to use Unix shell scripts to build a home-grown database performance history system without resorting to third-party tools. Shell scripts automate performance capture and provide 24x7 monitoring support. The author also provides about forty views representing common calculated metrics over the resulting performance history data.

The target audience for this article is Unix DB2 database administrators and system support staff tasked with DB2 database monitoring in Unix environments. Solaris, AIX and Linux DB2 DBAs running DB2 version 9 and later will benefit.

The benefits of consistently capturing historical performance data using snapshot administrative views are many:

  • Many levels of granularity are possible. Look at performance characteristics over an hour, day, week or month.
  • Quick access to hundreds of snapshot data elements.
  • Understand overnight/weekend performance problems.
  • Retrospective debugging of database problems. How many times have you found out about a problem hours after the problem actually manifested, too late to run snapshots and left to speculate about the source of the problem? DB2 issues often manifest in the middle of the night or on weekends. By the time you’re on the scene, the source of the crash is long gone, leaving little besides db2diag.log for backtracking. Having historical snapshot performance data on hand will allow you to more easily track down the gremlin.
  • Visualization potential using graphing software.

This longitudinal capability is what made DB2 monitoring tools such as Database Guys [1] so powerful. You don’t need to purchase a third-party tool to gain this capability; you just need to build your own capture system. It is not difficult.

Prior to DB2 version 8, the only way to obtain snapshot data was to generate a text file from GET SNAPSHOT output and use shell scripts, Perl or other parsing tools to aggregate the data. IBM introduced snapshot table functions in version 8.1. These represented a huge leap forward by giving administrators access to snapshot data via SQL. In DB2 version 9 IBM introduced snapshot administrative views, which are now the preferred means of accessing snapshot data and are easier to write against than table functions. Snapshot administrative views present snapshot data in the same way as system catalog views.

Both snapshot table functions and administrative views are very useful, providing snapshot data about current database performance through the full power of SQL. The behavior of the database at this moment in time might be all you need to know. But the ugliest problems seem to happen outside of the 9-to-5 window, and single-time-slice snapshots don’t let us perceive patterns such as trends, recurring performance spikes, or slow degradation in database performance over the span of weeks. Snapshot data becomes vastly more useful when stored over time in performance monitoring tables.

Here’s how.

Assumptions

  • Version: DB2 LUW Version 9.
  • Unix OS is used for scripting examples, but the system could be adapted for Windows.
  • Either SYSADM authority or a combination of SYSMON and other authority [for table creation].
  • Understanding of System Monitor data elements and performance metrics. (see References.)
  • My performance view DDL and related queries were written for an environment with a single database per instance, because that was true in the shop where I developed this version of the capture system. If you have multiple databases per instance, modify my examples so that the primary keys, metric view SELECT columns, predicates and GROUP BY clauses of the report views include DB_NAME. If you are a DPF shop, you will also need to make a similar change for column DBPARTITIONNUM.

Table of Contents

  1. Create snapshot historical tables
  2. Populate snapshot historical tables
  3. Performance Metric Views
  4. Performance Metrics Two-Column Views
    1. Application and Connection Metrics
      1. Applications Connected
      2. Applications Executing
      3. New Connects Per Minute
    2. Physical (tablespace I/O) Metrics
      1. Asynchronous Write Percentage (AWP)
      2. Physical Synchronous Read Percentage (SRP)
      3. Logical and Physical Read Rate Per Minute
    3. Bufferpool I/O Ratio Metrics
      1. Bufferpool Overall Hit Ratio Percentage
      2. Bufferpool Data Hit Percentage
      3. Bufferpool Index Hit Percentage
      4. Dirty Page Threshold Cleaners Ratio Percentage (TCR)
      5. Victim Cleans Ratio Percentage (VCR)
    4. Transactional Metrics
      1. Bufferpool Logical Index Reads Per Transaction (BPLITX)
      2. BP Logical Reads Per Transaction (BPLRTX)
      3. Sorts Per Transaction (SRTTX)
      4. Data Manipulation Language Statements per Transaction (DMLTX)
      5. Select/Insert/Update/Delete Statements per Transaction (SIUD_SQL_STMTS_TX)
      6. Select Statements per Transaction (SELECT_STMTS_TX)
      7. Transactions Per Minute (TPM)
    5. Locking Metrics
      1. Lock Escalations since last snapshot
      2. Exclusive (X) Lock Escalations Since Last Snapshot
      3. Lock Timeouts since last snapshot
      4. Deadlocks since last snapshot
      5. LockWait Time Seconds Per Minute
      6. Lock Waits Per Minute
    6. Sort Metrics
      1. Sort Time Seconds Per Minute
      2. Sorts Per Minute
    7. Logging Metrics
      1. Log Cleans Ratio Percentage (LCR)
      2. Log Write IO per Minute
      3. Total Log Used (MB)
    8. Miscellaneous 2-Column Metrics
      1. SQL Execution Time Seconds Per Minute
      2. Rows Read Per Minute
  5. 3+ Column Performance Views
    1. RR/RS Ratio (RRRSR)
    2. RR/RS Ratio (RRRSR) Aggregates
    3. Statements with High RR/RS Ratio (RRRSR)
    4. Rows Read per minute by Table
    5. Rows Written per minute by Table
  6. Prune snapshot historical tables
  7. References
  8. Footnotes

Create snapshot historical tables

The IBM DB2 Version 9 Infocenter is possibly the best reference for the available snapshot administrative views.

This article provides CREATE DDL for only five snapshot capture tables. These are the only tables you will need for the specific performance reports provided in this series (you won’t actually need SNAP_TBS_CFG_T either, but I had it, so I decided to include it).

If you want to store the full subset of columns from an administrative view, the quickest way to create your historical system administrative table is:

  1. Use the CREATE TABLE ... LIKE ... syntax to create the table, for example:
    db2 create table DBA.SNAPDB like SYSIBMADM.SNAPDB
  2. Use ALTER TABLE ... ALTER COLUMN ... SET NOT NULL to make not-null those columns which will comprise your primary key, for example:
    db2 alter table DBA.SNAPDB alter column SNAPSHOT_TIMESTAMP set not null
  3. You may need to reorg your new table, for example:
    db2 reorg table dba.snapdb
  4. Create a unique index for the primary key, and alter the table to add the primary key, as sketched below. [2]
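
A minimal sketch of step 4 for the DBA.SNAPDB history table. The index and constraint names are my own illustrations, and the single-column SNAPSHOT_TIMESTAMP key suffices only for a single database per instance (see Assumptions):

db2 create unique index DBA.IX_SNAPDB_PK on DBA.SNAPDB (SNAPSHOT_TIMESTAMP)
db2 alter table DBA.SNAPDB add constraint SNAPDB_PK primary key (SNAPSHOT_TIMESTAMP)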

If you do not want to store the full set of columns from the SYSIBMADM view, you can create custom DDL for the administrative view. Unfortunately, db2look will not generate DDL directly from an administrative view, because SYSIBMADM objects are not true user views. What you can do to make quick work of editing your own DDL is to follow step 1 above (CREATE TABLE ... LIKE ...) and then run db2look against the resulting object. Drop the table you created after you have produced the DDL from db2look. Massage the DDL to include only those columns you wish to keep for history. Create the table from your edited DDL, then create your primary key. [3]

If you choose to capture a smaller subset of administrative view columns, change the INSERT subselect in your populate script to use an explicitly named set of columns (instead of SELECT *), as sketched below.
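
For example, a minimal sketch for a cut-down DBA.SNAPDB history table; the column subset here is purely illustrative:

-- insert only the columns your trimmed history table actually keeps
INSERT INTO DBA.SNAPDB
  (SNAPSHOT_TIMESTAMP, DB_NAME, COMMIT_SQL_STMTS, ROLLBACK_SQL_STMTS, ROWS_READ)
SELECT SNAPSHOT_TIMESTAMP, DB_NAME, COMMIT_SQL_STMTS, ROLLBACK_SQL_STMTS, ROWS_READ
  FROM SYSIBMADM.SNAPDB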

The snapshot administrative views I use in the remainder of my examples are:

Snapshot Admin View     Download DDL
-------------------     ---------------
SYSIBMADM.SNAPDB        dba.snapdb.ddl
SYSIBMADM.SNAPTAB       dba.snaptab.ddl
SYSIBMADM.SNAPAPPL      dba.snapappl.ddl
SYSIBMADM.SNAPSTMT      dba.snapstmt.ddl

Download all these DDL files as a single zip.

Create these tables, and proceed to the next step: Populate snapshot historical tables

Populate snapshot historical tables

These script examples use cron and Korn shell, but you can substitute your own scheduler and scripting tool. For the performance granularity I sought, I had the populate script run every five minutes. Your executing user ID must have at least SYSMON authority and the INSERT privilege on the target DBA schema tables.

Populate script: populate_snapshot_tables.ksh
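
If you would rather not download it, the heart of such a script is small. Here is a minimal sketch (my own reconstruction, not the downloadable script; it assumes the history tables were created LIKE their SYSIBMADM sources so that SELECT * lines up column-for-column):

#!/bin/ksh
# populate_snapshot_tables.ksh (minimal sketch)
#   -d : Required. The name of the database to snapshot.
while getopts d: OPT
do
   case $OPT in
      d) DbName=$OPTARG ;;
      *) echo 'Usage: populate_snapshot_tables.ksh -d <dbname>'; exit 1 ;;
   esac
done
# edit to your own local db2profile
. /home/db2/sqllib/db2profile
db2 connect to $DbName > /dev/null 2>&1 || exit 1
# copy the current snapshot rows into the history tables; use explicit
# column lists here if you kept only a subset of columns
db2 -v "insert into dba.snapdb   select * from sysibmadm.snapdb"
db2 -v "insert into dba.snaptab  select * from sysibmadm.snaptab"
db2 -v "insert into dba.snapappl select * from sysibmadm.snapappl"
db2 -v "insert into dba.snapstmt select * from sysibmadm.snapstmt"
db2 connect reset > /dev/null 2>&1
db2 terminate > /dev/null 2>&1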

Here is a sample line from crontab -l:


0,5,10,15,20,25,30,35,40,45,50,55 * * * * /db2/support/scripts/populate_snapshot_tables.ksh -d dbname

Performance Metric Views

Included in this article are about forty views of database performance metrics. The two-column views all feature an initial SNAPSHOT_TIMESTAMP column, with the second column being the metric. These views are ideally suited to two-axis time-series charting and visualization.

We then present a few views (three or more columns) with more complex presentations of performance metrics, not suitable for an X-Y graph but useful for diagnosing specific performance issues.

Please see my References for sources and credits. All SQL in this article is original and is protected under a Creative Commons 3.0 license. [4]

I wrote these views for an environment in which there was only a single database per DB2 instance, so users with multiple databases per instance will have to make a small modification to some views. With multiple databases per instance, any view which interprets table DBA.SNAPDB, DBA.SNAPAPPL, or DBA.SNAPSTMT may need DB_NAME added to its SELECT clause, predicates and GROUP BY clause. (If you are a DPF shop, you will also need to make a similar change for column DBPARTITIONNUM.)

To download ddl by individual view, go to the individual view entry below and click on the “[DOWNLOAD VIEW DDL]” link next to the view link. To download all DB2 view DDL at once as a ZIP archive, right-click on this link.


Performance Metrics Two-Column Views

Application and Connection Metrics

Applications Connected

[DOWNLOAD VIEW DDL]

Description: Number of connections (logical application sessions, application IDs) to the database, by snapshot timestamp.
Usage example:

select APPLS_CUR_CONS from dba.v_appls_cur_cons
order by snapshot_timestamp desc fetch first 1 row only

Applications Executing

[DOWNLOAD VIEW DDL]
Description: Applications executing currently (within the last 5 minutes). Number of connected AND EXECUTING logical session threads, or applications, at the time of the snapshot. Most connections will not be executing at any given snapshot point in time.
Usage example:

select APPLS_IN_DB2 from dba.v_appls_in_db2
order by snapshot_timestamp desc fetch first 1 row only

New Connects Per Minute

[DOWNLOAD VIEW DDL]
Description: Number of new connects formed per minute.

Usage example:

select CONNECTS_PER_MINUTE from DBA.V_CONNECTS_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only

Physical (tablespace I/O) Metrics

Asynchronous Write Percentage (AWP)

[DOWNLOAD VIEW DDL]
Description: Asynchronous data and index writes to tablespace ("physical") as a percentage of total data and index physical writes. It is ALWAYS desirable to do a high percentage of asynchronous write I/O. Not only do asynchronous write I/Os avoid delaying the application connection, but they also tend to be 3 to 10 times faster, depending on your disks and storage system.

It is desirable to have AWP greater than 90%. However, if your database has very little insert, update, or delete activity, then achieving a high AWP will be difficult, if not impossible. On the other hand, if your database has DMLTX (updates/inserts/deletes per transaction) greater than or equal to 1, then achieving a high AWP value is very important. To achieve a high AWP, you need a sufficient number of I/O cleaners (DB CFG NUM_IOCLEANERS) and a Changed Pages Threshold that is not set too high (DB CFG CHNGPGS_THRESH). [5]

How to calculate:

AWP = ( ( (Asynchronous pool data page writes + Asynchronous pool index page writes) * 100 )
/ (Buffer pool data writes + Buffer pool index writes) )

Usage example:

select ASYNC_WRITE_PCT from DBA.V_ASYNC_WRITE_RATIO
order by snapshot_timestamp desc fetch first 1 row only
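
If you would rather see the shape of such a view than download it, here is a minimal sketch over the DBA.SNAPDB history table. The _SKETCH name is mine, and it computes the ratio from the cumulative counters; the downloadable DDL may instead difference consecutive snapshots:

CREATE VIEW DBA.V_ASYNC_WRITE_RATIO_SKETCH AS
SELECT SNAPSHOT_TIMESTAMP,
       -- NULLIF guards against division by zero on an idle database
       DECIMAL( 100.0 * (POOL_ASYNC_DATA_WRITES + POOL_ASYNC_INDEX_WRITES)
                / NULLIF(POOL_DATA_WRITES + POOL_INDEX_WRITES, 0), 5, 2 )
         AS ASYNC_WRITE_PCT
  FROM DBA.SNAPDB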

Physical Synchronous Read Percentage (SRP)

[DOWNLOAD VIEW DDL]
Description: Indicates the number of data pages read in from the table space containers (physical) through synchronous processes, by database manager agents, as a percentage of total physical reads. Synchronous reads stand in contrast to asynchronous prefetch reads performed by engine dispatchable units (EDUs). Synchronous physical reads are generally preferable to asynchronous reads, since heavy asynchronous prefetch activity often indicates scanning.

SRP = 100 - ( (100 * (Asynchronous pool data page reads + Asynchronous pool index page reads))
/ (Buffer pool data physical reads + Buffer pool index physical reads) )

Usage example:

select SYNC_READ_PCT from DBA.V_SYNC_READ_PCT order by snapshot_timestamp desc fetch first 1 row only

Logical and Physical Read Rate Per Minute

[DOWNLOAD VIEW DDL]
Description: Sum of:

  • number of data pages which have been requested from the buffer pool (logical) for regular and large table spaces, and
  • number of data pages read in from the table space containers (physical) for regular and large table spaces.

Logical + Physical Data Reads per minute. An indicator of overall read activity.

How to calculate:

Total read rate = ( bufferpool.data.logical.reads + bufferpool.data.physical.reads ) / time.interval

Usage example:

select PHYS_READ_IO_PER_MINUTE from DBA.V_PHYS_IO_READ_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only

Bufferpool I/O Ratio Metrics

Bufferpool Overall Hit Ratio Percentage

[DOWNLOAD VIEW DDL]
Description: Percentage of data and index reads directly from bufferpool (as opposed to read from disk). Should be 98-100% most of the time.

How to calculate:

Overall Hit Ratio Percentage = ( 1 - ( total physical reads / total logical reads ) ) * 100  [6]

Usage example:

select OVERALL_HIT_RATIO from DBA.V_OVERALL_HIT_RATIO order by snapshot_timestamp desc fetch first 1 row only

Bufferpool Data Hit Percentage

[DOWNLOAD VIEW DDL]
Description: Percentage of data reads that are from the bufferpool rather than from physical disk. A measure of how efficiently the bufferpool is being used.

How to calculate:

Data Hit Percentage = 100 - 100 * ( bp.data.physical.reads / bp.data.logical.reads )

Usage example:

select DHR_PCT from DBA.V_DATA_HIT_RATIO order by snapshot_timestamp desc fetch first 1 row only

Bufferpool Index Hit Percentage

[DOWNLOAD VIEW DDL]
Description: Percentage of time that index reads are from bufferpool (as opposed to physical, from disk). Like Data Hit Ratio, we want to keep this in the 98%-100% range.

How to calculate:

Index Hit Ratio Percentage = 100 - 100 * ( bp.index.physical.reads / bp.index.logical.reads )

Usage example:

select INDEX_HIT_RATIO from DBA.V_INDEX_HIT_RATIO order by snapshot_timestamp desc fetch first 1 row only

Dirty Page Threshold Cleaners Ratio Percentage (TCR)

[DOWNLOAD VIEW DDL]

Description:

How often a bufferpool page cleaner was invoked because a buffer pool had reached the dirty page threshold criterion for the database (dirty pages contain data that has been changed in the bufferpool but not yet written to disk). The threshold is set by the chngpgs_thresh configuration parameter as a percentage of the buffer pool size. When the number of dirty pages in the pool exceeds this value, the cleaners are triggered.

If chngpgs_thresh is set too low, pages might be written out too early, only to be read back in. If it is set too high, too many dirty pages may accumulate, forcing agents to write out pages synchronously in long bursts and tying up other processing (with poor resulting performance).

If this ratio is low, it may indicate that you have defined too many page cleaners, or that the CHNGPGS_THRESH database configuration parameter is too high. If this ratio is high, it may indicate that you have too few page cleaners defined. Either way, overall performance is impacted. Bufferpool size is also a factor.

How to calculate:

TCR = 100 * ( dirty.page.threshold.cleaner.triggers /
(dirty.page.steal.cleaner.triggers + lsn.gap.cleaner.triggers + dirty.page.threshold.cleaner.triggers) )

Usage example:

select THRESHOLD_CLEANS_RATIO from DBA.V_THRESHOLD_CLEANS_RATIO
order by snapshot_timestamp desc fetch first 1 row only
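
A minimal sketch of this calculation over the DBA.SNAPDB history table (the _SKETCH name is mine; as with the AWP sketch above, cumulative counters are used, where the downloadable DDL may difference consecutive snapshots):

CREATE VIEW DBA.V_THRESHOLD_CLEANS_RATIO_SKETCH AS
SELECT SNAPSHOT_TIMESTAMP,
       -- dirty page threshold triggers as a percentage of all cleaner triggers
       DECIMAL( 100.0 * POOL_DRTY_PG_THRSH_CLNS
                / NULLIF( POOL_DRTY_PG_STEAL_CLNS
                        + POOL_LSN_GAP_CLNS
                        + POOL_DRTY_PG_THRSH_CLNS, 0 ), 5, 2 )
         AS THRESHOLD_CLEANS_RATIO
  FROM DBA.SNAPDB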

Victim Cleans Ratio Percentage (VCR)

[DOWNLOAD VIEW DDL]
Description: Expressed as percentage of total page cleans, the number of times a page cleaner was invoked because a synchronous write was needed during the victim buffer replacement for the database. From a Quest document: “A victim page is a clean or dirty page in the buffer pool that is removed simply because DB2 needs to make room for incoming pages. If a victim page is a dirty page then the information must be written out to disk.”

How to calculate:

VCR = 100 * ( dirty.page.steal.cleaner.triggers /
(dirty.page.steal.cleaner.triggers + lsn.gap.cleaner.triggers
+ dirty.page.threshold.cleaner.triggers) )

You don’t want this over 40%, but you also don’t want it too low. If this ratio is low, it may indicate that you have defined too many page cleaners. If your chngpgs_thresh is set too low, you may be writing out pages that you will dirty later; aggressive cleaning defeats one purpose of the buffer pool, which is to defer writing to the last possible moment. If this ratio is high, it may indicate that you have too few page cleaners defined, or that your bufferpool is too small. Too few page cleaners will also increase recovery time after failures.

Usage example:

select VICTIM_CLEANS_RATIO from DBA.V_VICTIM_CLEANS_RATIO order by snapshot_timestamp desc fetch first 1 row only

Transactional Metrics

Bufferpool Logical Index Reads Per Transaction (BPLITX)

[DOWNLOAD VIEW DDL]
Description: Can help detect the existence of index leaf page scans. [7]
Usage example:

select BPLI_PER_TRANSACTION from DBA.V_BPLITX
order by snapshot_timestamp desc fetch first 1 row only

BP Logical Reads Per Transaction (BPLRTX)

[DOWNLOAD VIEW DDL]
Description: Bufferpool logical data and index reads per transaction.

How to calculate:

BPLRTX = (Buffer Pool Data Logical Reads + Buffer Pool Index Logical Reads) /
(Commit statements attempted + Rollback statements attempted)

The idea is to get BPLRTX as low as possible. [8]

Usage example:

select BPLR_PER_TRANSACTION from DBA.V_BPLRTX
order by snapshot_timestamp desc fetch first 1 row only

Sorts Per Transaction (SRTTX)

[DOWNLOAD VIEW DDL]
Description: Total sorts divided by the number of transactions (units of work), i.e. the number of sorts before a commit is taken.

Usage example:

select SORTS_PER_TRANSACTION from DBA.V_SORTS_PER_TRANSACTION order by snapshot_timestamp desc fetch first 1 row only

Data Manipulation Language Statements per Transaction (DMLTX)

[DOWNLOAD VIEW DDL]
Description: Update/delete/insert statements per transaction (no Selects counted). This graph helps us understand patterns in application update behavior.

Usage example:

select DMLTX from DBA.V_DMLTX order by snapshot_timestamp desc fetch first 1 row only

Select/Insert/Update/Delete Statements per Transaction (SIUD_SQL_STMTS_TX)

[DOWNLOAD VIEW DDL]
Description: Select/insert/update/delete statements per transaction (i.e. all SQL excluding DDL and DCL).

Usage example:

select SIUD_SQL_STMTS_TX from DBA.V_STMTS_TX order by snapshot_timestamp desc fetch first 1 row only

Select Statements per Transaction (SELECT_STMTS_TX)

[DOWNLOAD VIEW DDL]
Description: Select statements (only) per transaction.

Usage example:

select SELECT_STMTS_TX from DBA.V_SELECT_STMTS_TX order by snapshot_timestamp desc fetch first 1 row only

Transactions Per Minute (TPM)

[DOWNLOAD VIEW DDL]
Description: Transactions per minute. Your indication of whether or not this is a high transaction volume database.

Usage example:

select UOW_PER_MINUTE from DBA.V_UOW_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
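
Per-minute views like this one have to difference consecutive snapshots of the cumulative counters. Here is a minimal sketch of that idea (the _SKETCH name and the previous-snapshot self-join are my illustration; the downloadable DDL may be organized differently, and counter resets at database activation will produce occasional negative artifacts):

CREATE VIEW DBA.V_UOW_PERMINUTE_SKETCH AS
SELECT cur.SNAPSHOT_TIMESTAMP,
       -- (commits + rollbacks) delta divided by elapsed minutes;
       -- 1.0 * forces decimal rather than integer division
       DECIMAL( 1.0 * ( (cur.COMMIT_SQL_STMTS + cur.ROLLBACK_SQL_STMTS)
                      - (prev.COMMIT_SQL_STMTS + prev.ROLLBACK_SQL_STMTS) )
                / NULLIF( TIMESTAMPDIFF( 4,
                    CHAR(cur.SNAPSHOT_TIMESTAMP - prev.SNAPSHOT_TIMESTAMP) ), 0 ), 18, 2 )
         AS UOW_PER_MINUTE
  FROM DBA.SNAPDB cur, DBA.SNAPDB prev
 WHERE prev.SNAPSHOT_TIMESTAMP =
       ( SELECT MAX(p.SNAPSHOT_TIMESTAMP) FROM DBA.SNAPDB p
          WHERE p.SNAPSHOT_TIMESTAMP < cur.SNAPSHOT_TIMESTAMP )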

Locking Metrics

Lock Escalations since last snapshot

[DOWNLOAD VIEW DDL]
Description: Number of lock escalations since the last snapshot (the last five minutes). I was dealing with a database with very rare lock escalations, so I decided not to do a per-minute calculation but simply to show the number of lock escalations since the last snapshot.

Usage example:

select LOCKESCALS from DBA.V_LOCK_ESCAL order by snapshot_timestamp desc fetch first 1 row only

Exclusive (X) Lock Escalations Since Last Snapshot

[DOWNLOAD VIEW DDL]
Description: An exclusive (X) lock escalation can cause locking contention, so seeing this happen on a frequent basis helps us to tune both the application and table locking parameters.

Usage example:

select X_LOCKESCALS from DBA.V_XLOCK_ESCAL order by snapshot_timestamp desc fetch first 1 row only

Lock Timeouts since last snapshot

[DOWNLOAD VIEW DDL]
Description: Number of lock timeouts (SQLCODE -911, reason code 68) since the last snapshot (the last five minutes). As with lock escalations, I was dealing with a database with very rare lock timeouts, so I decided not to do a per-minute calculation but simply to show any that had occurred at all – any non-zero value was of concern.

Usage example:

select LOCK_TIMEOUTS from DBA.V_LOCK_TIMEOUTS order by snapshot_timestamp desc fetch first 1 row only

Deadlocks since last snapshot

[DOWNLOAD VIEW DDL]
Description: Number of deadlocks (SQLCODE -911, reason code 2) since the last snapshot (the last five minutes, or your monitoring interval).

Usage example:

select DEADLOCKS from dba.v_deadlocks order by snapshot_timestamp desc fetch first 1 row only

LockWait Time Seconds Per Minute

[DOWNLOAD VIEW DDL]
Description: Total lock wait time for all applications, expressed in seconds per minute.

Usage example:

select LOCKWAIT_TIME_PER_MINUTE from DBA.V_LOCKWAIT_TIME_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only

Lock Waits Per Minute

[DOWNLOAD VIEW DDL]
Description: Total lock waits for all applications, per minute.

Usage example:

select LOCKWAITS_PER_MINUTE from DBA.V_LOCKWAITS_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only

Sort Metrics

Sort Time Seconds Per Minute

[DOWNLOAD VIEW DDL]
Description: Total sort time, all applications, in seconds per minute. A good indicator of when sort-intensive processes are running.

Usage example:

select SORT_TIME_PER_MINUTE from DBA.V_SORT_TIME_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only

Sorts Per Minute

[DOWNLOAD VIEW DDL]
Description: Total number of sorts, all applications, per minute.

Usage example:

select SORTS_PER_MINUTE from DBA.V_SORTS_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only

Logging Metrics

Log Cleans Ratio Percentage (LCR)

[DOWNLOAD VIEW DDL]
Description: Number of LSN gap cleaner triggers as a percentage of total cleaner triggers. From the IBM DB2 Version 9 Infocenter:

When the amount of log space encompassing the log record which has updated the oldest page in the buffer pool and the current log position exceeds that allowed by the softmax parameter, it is said that the database is in an ‘LSN gap’ situation. Under the default method of page cleaning, when the logger detects that an LSN gap has occurred, it will trigger the page cleaners to write out all the pages which are contributing to the LSN gap situation. That is, it will write out those pages which are older than what is allowed by the softmax parameter. Page cleaners will be idle for some period of time while no LSN gap is occurring. Then, once an LSN gap occurs, the page cleaners are activated to write a large number of pages before going back to sleep. This can result in the saturation of the I/O subsystem, which then affects other agents which are reading or writing pages.

When LSN gap cleaner triggers dominate (LCR approaches 100%), check for performance issues on the database. If you have configured very large logs, softmax may need to be successively lowered until the LSN gap cleaner triggers more frequently (in smaller bursts). If the correlation with performance problems is strong, it might be necessary to use the newer DB2 proactive page cleaning feature, enabled with the DB2_USE_ALTERNATE_PAGE_CLEANING registry variable.

How to calculate:

LCR = 100 * ( lsn.gap.cleaner.triggers /
( dirty.page.steal.cleaner.triggers
+ lsn.gap.cleaner.triggers
+ dirty.page.threshold.cleaner.triggers ) )

Usage example:

select LOG_CLEANS_RATIO from DBA.V_LOG_CLEANS_RATIO order by snapshot_timestamp desc fetch first 1 row only

Log Write IO per Minute

[DOWNLOAD VIEW DDL]
Description: The number of I/O requests issued by the logger for writing log data to disk, per minute. This is an indicator of logging activity and can be used in conjunction with other graphs to determine whether logging is responsible for application performance issues. It can also point toward heavy update periods.

Usage example:

select NUM_LOG_WRITE_IO_PER_MINUTE from DBA.V_LOG_WRITE_IO_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only

Total Log Used (MB)

[DOWNLOAD VIEW DDL]
Description: Total log space used, in megabytes. This can reveal problematic or dangerously large units of work, and can also serve as an alert that you are about to exhaust log space. It also allows you to better visualize the growth and contraction of log usage over a period of time.

Usage example:

select TOTAL_LOG_USED_MB from DBA.V_TOTAL_LOG_USED_MB order by snapshot_timestamp desc fetch first 1 row only
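
TOTAL_LOG_USED in SYSIBMADM.SNAPDB is reported in bytes, so a view like this needs only a unit conversion rather than a snapshot-to-snapshot delta. A minimal sketch (the _SKETCH name is mine):

CREATE VIEW DBA.V_TOTAL_LOG_USED_MB_SKETCH AS
SELECT SNAPSHOT_TIMESTAMP,
       -- 1048576 bytes per megabyte
       DECIMAL( TOTAL_LOG_USED / 1048576.0, 18, 2 ) AS TOTAL_LOG_USED_MB
  FROM DBA.SNAPDB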

Miscellaneous 2-Column Metrics

SQL Execution Time Seconds Per Minute

[DOWNLOAD VIEW DDL]
Description: SQL Execution Time in seconds per minute.

Usage example:

select ELAPSED_EXEC_TIME_S_PER_MINUTE from DBA.V_SQL_EXEC_TIME_PERMINUTE
order by snapshot_timestamp desc fetch first 1 row only

Rows Read Per Minute

[DOWNLOAD VIEW DDL]
Description: Rows read from all tables, per minute. Not the same as rows selected: for example, a poorly performing SQL statement might have to read 10,000 rows of a table in order to return only one row in a SELECT statement.

Usage example:

select ROWS_READ_PER_MINUTE from DBA.v_rows_read_perminute order by snapshot_timestamp desc fetch first 1 row only

3+ Column Performance Views

RR/RS Ratio (RRRSR)

[DOWNLOAD VIEW DDL]
Description: Rows Read / Rows Selected Ratio (RRRSR), by agent ID. RRRSR indicates how many table rows were read in total, as a ratio to the number of rows returned by all SELECT statements executed by that agent during its lifetime. This can be an indicator of poor SQL performance.

Table Describe:

describe table dba.v_rr_rs_ratio

Column                         Type      Type
name                           schema    name               Length   Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM    TIMESTAMP                10     0 No
AGENT_ID                       SYSIBM    BIGINT                    8     0 No
APPL_RR_RS_RATIO               SYSIBM    DECIMAL                  18     2 Yes
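
This view has no single metric column, so no two-column usage example applies; in the same spirit as the examples above, a query like this (my own illustration) lists the worst agents at the latest snapshots first:

select AGENT_ID, APPL_RR_RS_RATIO from dba.v_rr_rs_ratio
order by snapshot_timestamp desc, APPL_RR_RS_RATIO desc
fetch first 10 rows only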

RR/RS Ratio (RRRSR) Aggregates

[DOWNLOAD VIEW DDL]
Description: Maximum and average values of the Rows Read to Rows Selected Ratio (RRRSR) across all agents sampled at this interval. RRRSR indicates how many table rows were read in total, as a ratio to the number of rows returned by all SELECT statements executed by an agent during its lifetime. This metric aggregates over all agents alive at that snapshot. The maximum is a good indicator of performance spikes.

This view is dependent on view dba.v_rr_rs_ratio (above).

Table Describe:

describe table dba.v_rr_rs_ratio_aggregate

Column                         Type      Type
name                           schema    name               Length   Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM    TIMESTAMP                10     0 No
APPL_RR_RS_RATIO_AVG           SYSIBM    DECIMAL                  18     2 Yes
APPL_RR_RS_RATIO_MAX           SYSIBM    DECIMAL                  18     2 Yes

Statements with High RR/RS Ratio (RRRSR)

[DOWNLOAD VIEW DDL]
Description: All captured statements whose applid/timestamp matches at least 80% of the maximum RR_RS ratio for that snapshot interval. The view should show you some of the worst-performing SQL captured at your snapshot interval. Because snapshot microseconds might not match between administrative views SNAPSTMT and SNAPAPPL, the SQL allows a join between the two capture tables with a two-second variance, which can be adjusted for your own implementation. The 80%-of-maximum threshold can also be easily modified for your use.

Table Describe:

describe table dba.v_stmt_high_rr_rs_ratio

Column                         Type      Type
name                           schema    name               Length   Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM    TIMESTAMP                10     0 No
RR_RS_RATIO                    SYSIBM    DECIMAL                  18     2 Yes
STMT_TEXT                      SYSIBM    CLOB               16777216     0 Yes

Rows Read per minute by Table

[DOWNLOAD VIEW DDL]
Description: By table, rows read per minute per snapshot interval.
Table Describe:

describe table dba.v_table_rows_read_perminute

Column                         Type      Type
name                           schema    name               Length   Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM    TIMESTAMP                10     0 No
TABSCHEMA                      SYSIBM    VARCHAR                 128     0 No
TABNAME                        SYSIBM    VARCHAR                 128     0 No
TABLE_ROWS_READ_PER_MINUTE     SYSIBM    DECIMAL                  18     2 Yes

Rows Written per minute by Table

[DOWNLOAD VIEW DDL]
Description: By table, rows written per minute per snapshot interval.
Table Describe:

describe table dba.v_table_rows_written_perminute

Column                         Type      Type
name                           schema    name               Length   Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM    TIMESTAMP                10     0 No
TABSCHEMA                      SYSIBM    VARCHAR                 128     0 No
TABNAME                        SYSIBM    VARCHAR                 128     0 No
TABLE_ROWS_WRITTEN_PER_MINUTE  SYSIBM    DECIMAL                  21     2 Yes


Prune snapshot historical tables

I recommend you create the performance tables in their own DMS tablespace, so that growth in these tables does not endanger space requirements for production data. In any case, you will need to prune data, probably on a weekly basis. The following script can be used as a template for your own prune process. Remember to also REORG and RUNSTATS the tables periodically.

Example crontab entry, for database dbname, purging rows more than 90 days old; run on Monday morning just after midnight:

01 00 * * 1 /db2/support/scripts/purge_dba_snapshot_tables.ksh -d dbname -n 90

Download the snapshot history prune script here:
purge_dba_snapshot_tables.ksh

Or view it inline:

#!/bin/ksh
# purge_dba_snapshot_tables.ksh
# purge DBA schema snapshot capture tables
#=========================================================================
#  Parameters:
#     -d  :  Required.  The name of the database against which the SQL will
#            run.
#     -n  :  Number of days back to prune
#=========================================================================
scriptStatusLog=/db2/support/scripts/purge_dba_snapshot_tables.log
# edit prfile to your own local configuration.
prfile=/home/db2/sqllib/db2profile
db2options="-av"
function Usage {
echo 'Usage: purge_dba_snapshot_tables.ksh -d <dbname> -n <days>'
echo 'All parameters are required.'
echo 'Use this script to purge snapshot table data older than n days.'
}

if [[ "$#" -lt "4" ]]; then
  Usage
  exit 1
fi

while getopts  d:n: OPT
do
   case $OPT in
      d) DbName=$OPTARG ;;
      n) PurgeDays=$OPTARG ;;
      *) Usage; echo "$(date) Usage error." >> $scriptStatusLog ; exit 1 ;;
   esac
done

if [ -f $prfile ]; then
   . $prfile
else
   echo "$(date) Cannot locate $prfile." >> $scriptStatusLog
   Usage
   exit 1
fi

echo "$(date) Now executing purge_dba_snapshot_tables.ksh with following options:" >> $scriptStatusLog
echo "DbName = $DbName " >>  $scriptStatusLog
echo "Purge Days = $PurgeDays " >>  $scriptStatusLog
echo "$(date) Now connecting . . . " >>  $scriptStatusLog
db2 connect to $DbName >> $scriptStatusLog 2>&1
echo "$(date) Now executing purge . . . " >>  $scriptStatusLog
db2 $db2options  "delete from dba.snapdb where snapshot_timestamp < current timestamp - $PurgeDays days" >> $scriptStatusLog 2>&1
db2 $db2options  "delete from dba.snapappl where snapshot_timestamp < current timestamp - $PurgeDays days" >> $scriptStatusLog 2>&1
db2 $db2options  "delete from dba.snapstmt where snapshot_timestamp < current timestamp - $PurgeDays days" >> $scriptStatusLog 2>&1
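# if you capture other tables (e.g. dba.snaptab), add matching deletes here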
echo "$(date) purge_dba_snapshot_tables completed with RC = $?  " >>  $scriptStatusLog
db2 connect reset > /dev/null 2>&1
db2 terminate  > /dev/null 2>&1

References

Some good articles on use of snapshot data for performance analysis:


Footnotes

  1. Later DGI; purchased in 2003 by BMC Software.
  2. You will notice that my CREATE INDEX DDL omits DB_NAME, because I developed the DDL for a single-database instance. You will want to modify the DDL for a multiple-database instance (if you are a DPF shop, you will also need to make a similar change for column DBPARTITIONNUM). I chose to use table schema ‘DBA’ for the performance tables; you can use whatever name you want – just edit the performance views to reflect your changes.
  3. Thank you to Frederik Engelen for help simplifying the process of creating administrative view history.
  4. I can also help you adapt these database monitoring tools to your environment; see my consulting website for contact information.
  5. See Scott Hayes’ article.
  6. ( pool_data_p_reads + pool_xda_p_reads + pool_index_p_reads + pool_temp_data_p_reads + pool_temp_xda_p_reads + pool_temp_index_p_reads ) / ( pool_data_l_reads + pool_xda_l_reads + pool_index_l_reads + pool_temp_data_l_reads + pool_temp_xda_l_reads + pool_temp_index_l_reads )
  7. Scott Hayes says:

    Let’s say you have an index on columns A and B. If you have an SQL query that says WHERE B = ‘some value’ but does not include search criteria for column A, then it is highly likely that DB2 will asynchronously prefetch the leaf pages of the index to scan for occurrences of B = ‘some value’. If there are 1000s of leaf pages in a 3- or 4-level index, this can be a very CPU-, and possibly I/O-, expensive operation.


  8. Scott Hayes suggests:

    To get an idea of your ‘best case goal’ for BPLRTX, we need to look at a few data points:

    1. The average number of SELECTS per Transaction (SELTX)
    2. The average number of Insert, Update, and Deletes per Transaction (DMLTX)
    3. The average number of Statements per Transaction (STMTTX) = SELTX + DMLTX
    4. The average number of Index NLEVELS in the database: Select avg(NLEVELS) from SYSCAT.INDEXES
    5. Add 1 to the average NLEVELS giving AVGNLEVEL

    Finally, your ‘best case goal’ for BPLRTX is computed by the formula:

 BPLRTX GOAL = STMTTX x AVGNLEVEL

    This goal provides the minimum possible, best case, value for BPLRTX by multiplying the average number of statements per transaction by the (rounded up) average Index NLEVELS plus 1 to cover a logical read to the data page. By way of example, if an average transaction does 8 SELECT statements and 2 DML statements, and if the average NLEVEL is 3, then best case BPLRTX GOAL = (8+2) x (3+1) = 40 Bufferpool Logical Reads per Transaction.

    Now, let’s be honest with each other. The world isn’t perfect, the application’s SQL is probably far from perfect, and the database’s indexes aren’t perfect either. So, a realistic goal would add a 50% slop factor to the ideal goal, giving a BPLRTX goal of 60 in the previous example.


6 Comments

  1. Frederik Engelen

    Wow, the quality of your latest posts is amazing. I am a bit late on congratulating you on this, but thanks anyway.

    I’m wondering, will the quality stay the same once you are awarded IBM Data Champion, or will you just doze off :-)

    Posted on 08-Jan-09 at 02:33 | Permalink
  2. Frederik Engelen

    A small suggestion to create the historic tables. Just use the CREATE LIKE functionality:

    create table dba.snapdb like sysibmadm.snapdb

    Posted on 09-Jan-09 at 08:15 | Permalink
  3. Thanks Frederik, I implemented your suggestion. The advice on creating historical tables has been revised accordingly.

    Posted on 27-Feb-09 at 16:00 | Permalink
  4. Hi Jeffrey,

    Unfortunately, http://www.ibmdatabasemag.com will be going away – the articles will be preserved somewhere (TBD), but I’ve heard the blog content isn’t going to survive – however, no worries, the DB2 LUW Performance blogs have a new home at http://www.dbisoftware.com/blog/db2_performance.php – you might want to edit your hyperlink references if you can to the correct URLs.

    You did a semi-respectable job with this impressive post. I’m sure you invested at least 40 hours in writing and preparing it, and I appreciate the references, credits, and links. Your efforts to assist the DB2 LUW community are praiseworthy, especially for the organizations “without tools budget” in this present economy – certainly some performance insight is better than none.

    However, of course, I earnestly have to disagree that tools are not needed. They are. Organizations should focus on their core businesses – not writing in-house tools and scripts that have to be maintained, and maintained and often re-written when a team member leaves an organization. Furthermore, the right tools can provide performance insight that cannot be obtained from scripts, snapshots, and the methods you’ve described – particularly SQL cost aggregation analysis that is described by this blog post: “Identifying Mosquito Swarms” at http://www.dbisoftware.com/blog/db2_performance.php?id=123

    The right tools further add value by graphing performance trends over time – for database, partition, bufferpool, tablespace, table, and even individual SQL metrics, and including important change events plotted on the performance graphs. DBI’s Brother-Panther tracks and plots change events on performance graphs so that our customers can SEE the performance consequences of changes. These charts help team members collaborate, and also answer that dreaded question “Performance was great yesterday – what changed?!?!?!?”

    Just this week I’ve been working with a transportation company in Florida and we’ve been able to make huge performance improvements in just a few hours with the advanced analytical capabilities of Brother-Panther(R) for DB2 LUW. Cost aggregation across workload timeframes is critical to assuring that you will get optimum performance results for a minimal investment of DBA time. The methods you have described here do not perform statement cost aggregation and, at best, will only help DBAs “hunt elephants”. See http://www.dbisoftware.com/blog/db2_performance.php?id=121
    and http://www.dbisoftware.com/blog/db2_performance.php?id=122.

    On behalf of DBI, I hope your blog readers will visit DBI’s newly updated web site at http://www.dbisoftware.com and watch “Database Truths in Black and White”. We also just completed work with the San Mateo IBM Innovation Center and demonstrated 24-44% energy savings, 99% cuts in response times, and over 200X user scalability — learn more at http://www.dbisoftware.com/greenperformance.php

    Thanks for your service to the DB2 LUW community.

    Regards,
    Scott

    Posted on 15-Jun-09 at 19:55 | Permalink
  5. Steve Westfall

    Jeff,

    There appears to be a syntax problem in the definition of view dba.v_dmltx. On my 9.5 system it returns the following:

    SQL0104N An unexpected token “,” was found following “ba.snapdb as SNAP0 )”.
    Expected tokens may include: “JOIN”. SQLSTATE=42601

    If I find time to investigate it and if I find a solution, I’ll let you know.

    Good article. I’ve done this type of metrics collection in the past, and I’m glad you wrote this up.

    Steve

    Posted on 20-Aug-09 at 10:07 | Permalink
  6. Satya

    Is there any way I can see how long each SQL statement ran or has been running? I want to save the response time for each SQL statement.

    Posted on 07-Dec-09 at 09:56 | Permalink

