Native DB2 LUW snapshot utilities capture the performance characteristics of the database at the moment they are run, but they are not easily used to identify performance trends over days, weeks, and months. There are obvious advantages to being able to compare performance trends over time; furthermore, many snapshot data elements are not meaningful unless seen changing over time. With the advent of snapshot administrative views in DB2 version 9, automating historical performance capture has become easy for Unix database administrators.
This article shows the administrator how to use Unix shell scripts to build a home-grown database performance history system without resorting to third-party tools. Shell scripts automate performance capture and provide 24×7 monitoring support. The author also provides about forty views representing common calculated metrics over the resulting performance history data.
The target audience for this article is Unix DB2 database administrators and system support staff tasked with DB2 database monitoring in Unix environments. Solaris, AIX and Linux DB2 DBAs running DB2 version 9 and later will benefit.
The benefits of consistently capturing historical performance data using snapshot administrative views are many:
- Many levels of granularity are possible. Look at performance characteristics over an hour, day, week or month.
- Quick access to hundreds of snapshot data elements.
- Understand overnight/weekend performance problems.
- Retrospective debugging of database problems. How many times have you found out about a problem hours after it actually manifested, too late to run snapshots and left to speculate about its source? DB2 issues often manifest in the middle of the night or on weekends. By the time you’re on the scene, the evidence is long gone, leaving little besides db2diag.log for backtracking. Having historical snapshot performance data on hand will allow you to more easily track down the gremlin.
- Visualization potential using graphing software.
This longitudinal capability is what made DB2 monitoring tools such as those from Database Guys ((later DGI; purchased in 2003 by BMC Software)) so powerful. You don’t need to purchase a third-party tool to gain this capability. You just need to build your own capture system. It is not difficult.
Prior to DB2 version 8, the only way to obtain snapshot data was to generate a text file from GET SNAPSHOT output and use shell scripts, Perl, or other parsing tools to aggregate the data. IBM introduced snapshot table functions in version 8.1. These represented a huge leap forward, giving administrators access to snapshot data via SQL. In DB2 version 9 IBM introduced snapshot administrative views, which are now the preferred means of accessing snapshot data and are easier to write against than table functions. Snapshot administrative views present snapshot data much as the system catalog views do.
Both snapshot table functions and administrative views are very useful, providing snapshot data about current database performance through the full power of SQL. The behavior of the database at this moment in time might be all you need to know. But the ugliest problems seem to happen outside of the 9-to-5 window, and single-time-slice snapshots don’t permit us to perceive patterns such as trends, recurring performance spikes, or slow degradation in database performance over the span of weeks. Snapshot data becomes vastly more useful when stored over time in performance monitoring tables.
Here’s how.
Assumptions
- Version: DB2 LUW Version 9.
- Unix OS is used for scripting examples, but the system could be adapted for Windows.
- Either SYSADM authority or a combination of SYSMON and other authority [for table creation].
- Understanding of System Monitor data elements and performance metrics. (see References.)
- My performance view DDL and related queries were written assuming a single database per instance, because that was true in the shop where I developed this version of the capture system. If you have multiple databases per instance, modify my examples (primary keys, metric view SELECT columns, predicates and GROUP BY clauses of report views) to include DB_NAME. If you are a DPF shop, you will also need to make a similar change for column DBPARTITIONNUM.
Create snapshot historical tables
This is possibly the best reference for available snapshot administrative views.
This article provides CREATE DDL for only five snapshot capture tables. These are the only tables you will need for the specific performance reports provided in this series (strictly speaking you won’t need SNAP_TBS_CFG_T either, but I had it, so I decided to include it).
If you want to store the full set of columns from an administrative view, the quickest way to create your historical table is:
- Use the CREATE TABLE ... LIKE syntax to create the table, for example:
  db2 create table DBA.SNAPDB like SYSIBMADM.SNAPDB
- Use ALTER TABLE ... ALTER COLUMN ... SET NOT NULL to make not null those columns which will comprise your primary key, for example:
  db2 alter table DBA.SNAPDB alter column SNAPSHOT_TIMESTAMP set not null
- You may need to reorg your new table, for example:
  db2 reorg table dba.snapdb
- Create your unique index for the primary key, and ALTER TABLE ... ADD PRIMARY KEY. (( You will notice that my CREATE INDEX DDL omits DB_NAME, because I developed the DDL for a single-database instance. You will want to modify the DDL for a multiple-database instance (if you are a DPF shop, you will also need to make a similar change for column DBPARTITIONNUM). I chose table schema ‘DBA’ for the performance tables; you can use whatever name you want, and edit the performance views to reflect your changes. ))
If you do not want to store the full set of columns from the SYSIBMADM view, you can create custom DDL for the administrative view. Unfortunately db2look will not generate DDL directly from an administrative view, because SYSIBMADM objects are not true user views. What you can do to make quick work of editing your own DDL is to follow the first step above (CREATE TABLE ... LIKE ...) and then run db2look against the resulting object. Drop the table you created after you have produced the DDL from db2look. Massage the DDL to include only those columns you wish to keep for history. Create the table from your edited DDL, then create your primary key. (( Thank you to Frederik Engelen for help simplifying the process of creating administrative view history. ))
If you choose to capture a smaller subset of administrative view columns, change the subselect of your populate INSERT to use an explicitly named set of columns (instead of SELECT *).
The snapshot administrative views I use in the remainder of my examples are:
| Snapshot Admin View | Download DDL |
| SYSIBMADM.SNAPDB | dba.snapdb.ddl |
| SYSIBMADM.SNAPTAB | dba.snaptab.ddl |
| SYSIBMADM.SNAPAPPL | dba.snapappl.ddl |
| SYSIBMADM.SNAPSTMT | dba.snapstmt.ddl |
Download all these DDL files as a single zip.
Create these tables, and proceed to the next step: Populate snapshot historical tables
Populate snapshot historical tables
These script examples use cron and Korn shell, but you can substitute your own scheduler and scripting tool. For the performance granularity I sought, I had the populate script run every five minutes. Your executing user ID must have at least SYSMON authority and INSERT privilege on the target DBA schema tables.
Populate script: populate_snapshot_tables.ksh
Here is a sample line from crontab -l:
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /db2/support/scripts/populate_snapshot_tables.ksh -d dbname
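The full populate script is available from the link above. As a rough sketch only (table and schema names follow this article's examples; the real script adds connection handling, logging and error checking), its core step is one INSERT per history table, copying the current rows of the matching SYSIBMADM administrative view:

```shell
# Hypothetical core of populate_snapshot_tables.ksh: one INSERT per
# history table, copying the current snapshot rows from the matching
# SYSIBMADM administrative view. The statements are echoed here; in a
# live script each would be executed with:  db2 "$sql"
n=0
for tab in SNAPDB SNAPTAB SNAPAPPL SNAPSTMT; do
  sql="INSERT INTO DBA.${tab} SELECT * FROM SYSIBMADM.${tab}"
  echo "$sql"
  n=$((n + 1))
done
```

If you kept only a subset of columns in your history tables, name the columns explicitly in both the INSERT and the subselect instead of using SELECT *.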
Performance Metric Views
Included in this article are about forty views of database performance metrics. The two-column views all feature SNAPSHOT_TIMESTAMP as the first column, with the second column being the metric. These views are ideally suited for two-axis time-series charting and visualization.
We then present a few views (three or more columns) with more complex presentations of performance metrics, not suitable for an X-Y graph but useful for diagnosing specific performance issues.
Please see my References for sources and credits. All SQL in this article is original and is protected under a Creative Commons 3.0 license. (( I can also help you to adapt these database monitoring tools to your environment; see my consulting website for contact information. ))
I wrote these views for an environment in which there was only a single database per DB2 instance, so users with multiple databases per instance will have to make a small modification of some views. With multiple databases per instance, any views which interpret table DBA.SNAPDB, DBA.SNAPAPPL, or DBA.SNAPSTMT may need to be modified to add DB_NAME to select clauses, predicates and GROUP BY clauses ( If you are a DPF shop, you will also need to make a similar change for column DBPARTITIONNUM.).
To download ddl by individual view, go to the individual view entry below and click on the “[DOWNLOAD VIEW DDL]” link next to the view link. To download all DB2 view DDL at once as a ZIP archive, right-click on this link.
Performance Metrics Two-Column Views
Application and Connection Metrics
Applications Connected
Description: Number of connections (logical application sessions, application IDs) to the database, by snapshot timestamp.
Usage example:
select APPLS_CUR_CONS from dba.v_appls_cur_cons order by snapshot_timestamp desc fetch first 1 row only
Applications Executing
[DOWNLOAD VIEW DDL]
Description: Applications Executing Currently (within 5 minutes). Number of connected AND EXECUTING logical session threads, or Applications, at time of snapshot. Most connections will not be executing at any snapshot point in time.
Usage example:
select APPLS_IN_DB2 from dba.v_appls_in_db2 order by snapshot_timestamp desc fetch first 1 row only
New Connects Per Minute
[DOWNLOAD VIEW DDL]
Description: Number of new connects formed per minute.
Usage example:
select CONNECTS_PER_MINUTE from DBA.V_CONNECTS_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
Physical (tablespace I/O) Metrics
Asynchronous Write Percentage (AWP)
[DOWNLOAD VIEW DDL]
Description: Asynchronous data and index writes to tablespace (“physical”) as a percentage of total index and data physical writes. It is ALWAYS desirable to do a high percentage of Asynchronous Write I/O. Not only do Asynchronous Write I/Os avoid delaying the application connection, but they also tend to be 3 to 10 times faster depending on your disks and storage system.
It is desirable to have AWP greater than 90%. However, if your database has very little Insert, Update, or Delete activity, then achieving a high AWP will be difficult, if not impossible, to do. On the other hand, if your database has DMLTX (update/insert/delete per transaction) greater than or equal to 1, then achieving a high AWP value is very important. To achieve a high AWP, you need to have a sufficient number of I/O Cleaners (DB CFG NUM_IOCLEANERS) and not have the Changed Pages Threshold set too high (DB CFG CHNGPGS_THRESH). (( See Scott Hayes article. ))
How to calculate:
AWP = ( ( (Asynchronous pool data page writes + Asynchronous pool index page writes) * 100 ) / (Buffer pool data writes + Buffer pool index writes) )
Usage example:
select ASYNC_WRITE_PCT from DBA.V_ASYNC_WRITE_RATIO order by snapshot_timestamp desc fetch first 1 row only
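As a worked example of the AWP formula, with counter values assumed purely for illustration:

```shell
# AWP worked example with assumed counter values.
# async_* : asynchronous pool data/index page writes
# total_* : buffer pool data/index writes (all physical writes)
async_data=4500; async_index=3800
total_data=5000; total_index=4000
awp=$(( (async_data + async_index) * 100 / (total_data + total_index) ))
echo "AWP = ${awp}%"    # 8300 * 100 / 9000 -> 92 (integer arithmetic)
```

A value of 92% would satisfy the "greater than 90%" guideline above.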
Physical Synchronous Read Percentage (SRP)
[DOWNLOAD VIEW DDL]
Description: Indicates the number of data pages read in from the table space containers (physical) through synchronous processes, by database manager agents, as a percentage of total physical reads. Synchronous reads contrast with asynchronous reads performed by prefetcher engine dispatchable units (EDUs). In an OLTP workload a high synchronous read percentage is generally desirable, since it indicates random index-based access rather than large asynchronous prefetch scans.
SRP = 100 - ( (100 * (Asynchronous pool data page reads + Asynchronous pool index page reads)) / (Buffer pool data physical reads + Buffer pool index physical reads) )
Usage example:
select SYNC_READ_PCT from DBA.V_SYNC_READ_PCT order by snapshot_timestamp desc fetch first 1 row only
Logical and Physical Read Rate Per Minute
[DOWNLOAD VIEW DDL]
Description: Sum of:
- number of data pages which have been requested from the buffer pool (logical) for regular and large table spaces, and
- number of data pages read in from the table space containers (physical) for regular and large table spaces.
Logical + Physical Data Reads per minute. An indicator of overall read activity.
How to calculate:
Total read rate = ( bufferpool.data.logical.reads + bufferpool.data.physical.reads ) / time.interval
Usage example:
select PHYS_READ_IO_PER_MINUTE from DBA.V_PHYS_IO_READ_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
Bufferpool I/O Ratio Metrics
Bufferpool Overall Hit Ratio Percentage
[DOWNLOAD VIEW DDL]
Description: Percentage of data and index reads directly from bufferpool (as opposed to read from disk). Should be 98-100% most of the time.
How to calculate:
Overall hit ratio = ( 1 -
  (pool_data_p_reads + pool_xda_p_reads + pool_index_p_reads +
   pool_temp_data_p_reads + pool_temp_xda_p_reads + pool_temp_index_p_reads)
  / (pool_data_l_reads + pool_xda_l_reads + pool_index_l_reads +
     pool_temp_data_l_reads + pool_temp_xda_l_reads + pool_temp_index_l_reads)
) * 100
Usage example:
select OVERALL_HIT_RATIO from DBA.V_OVERALL_HIT_RATIO order by snapshot_timestamp desc fetch first 1 row only
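A worked example of the overall hit ratio formula, with physical and logical read totals assumed for illustration:

```shell
# Overall hit ratio worked example with assumed totals.
# p = sum of all physical reads (data + index + xda + temp counterparts)
# l = sum of all logical reads
p=1200; l=100000
ratio=$(awk -v p="$p" -v l="$l" 'BEGIN { printf "%.1f", (1 - p / l) * 100 }')
echo "Overall hit ratio = ${ratio}%"    # (1 - 0.012) * 100 = 98.8
```

A result of 98.8% would fall inside the 98-100% range suggested above.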
Bufferpool Data Hit Percentage
[DOWNLOAD VIEW DDL]
Description: Percentage of data reads that are from the bufferpool rather than from physical disk. A measure of how efficiently the bufferpool is being used.
How to calculate:
Data Hit Percentage = 100 - 100 * ( bp.data.physical.reads / bp.data.logical.reads )
Usage example:
select DHR_PCT from DBA.V_DATA_HIT_RATIO order by snapshot_timestamp desc fetch first 1 row only
Bufferpool Index Hit Percentage
[DOWNLOAD VIEW DDL]
Description: Percentage of time that index reads are from bufferpool (as opposed to physical, from disk). Like Data Hit Ratio, we want to keep this in the 98%-100% range.
How to calculate:
Index Hit Ratio Percentage = 100 - 100 * ( bp.index.physical.reads / bp.index.logical.reads )
Usage example:
select INDEX_HIT_RATIO from DBA.V_INDEX_HIT_RATIO order by snapshot_timestamp desc fetch first 1 row only
Dirty Page Threshold Cleaners Ratio Percentage (TCR)
Description:
How often a bufferpool page cleaner was invoked because a buffer pool had reached the dirty page threshold criterion for the database (dirty pages contain data that has been changed in the bufferpool, but not yet written to disk). The threshold is set by the chngpgs_thresh configuration parameter. It is a percentage applied to the buffer pool size. When the number of dirty pages in the pool exceeds this value, the cleaners are triggered.
If chngpgs_thresh is set too low, pages might be written out too early, requiring them to be read back in. If set too high, then too many pages may accumulate, requiring users to write out pages synchronously in a long burst, tying up other processing (with poor resulting performance).
If this ratio is low, it may indicate that you have defined too many page cleaners, or that the CHNGPGS_THRESH database configuration parameter is set too high. If this ratio is high, it may indicate that you have too few page cleaners defined. Either way, overall performance is impacted; bufferpool size is also a factor.
How to calculate:
TCR = 100 * ( dirty.page.threshold.cleaner.triggers / (dirty.page.steal.cleaner.triggers + lsn.gap.cleaner.triggers + dirty.page.threshold.cleaner.triggers) )
Usage example:
select THRESHOLD_CLEANS_RATIO from DBA.V_THRESHOLD_CLEANS_RATIO order by snapshot_timestamp desc fetch first 1 row only
Victim Cleans Ratio Percentage (VCR)
[DOWNLOAD VIEW DDL]
Description: Expressed as percentage of total page cleans, the number of times a page cleaner was invoked because a synchronous write was needed during the victim buffer replacement for the database. From a Quest document: “A victim page is a clean or dirty page in the buffer pool that is removed simply because DB2 needs to make room for incoming pages. If a victim page is a dirty page then the information must be written out to disk.”
How to calculate:
VCR = 100 * ( dirty.page.steal.cleaner.triggers / (dirty.page.steal.cleaner.triggers + lsn.gap.cleaner.triggers + dirty.page.threshold.cleaner.triggers) )
You don’t want this over 40%, but also not too low. If this ratio is low, it may indicate that you have defined too many page cleaners. If your chngpgs_thresh is set too low, you may be writing out pages that you will dirty later; aggressive cleaning defeats one purpose of the buffer pool, that is, to defer writing to the last possible moment. If this ratio is high, it may indicate that you have too few page cleaners defined, or that your bufferpool is too small. Too few page cleaners will also increase recovery time after failures.
Usage example:
select VICTIM_CLEANS_RATIO from DBA.V_VICTIM_CLEANS_RATIO order by snapshot_timestamp desc fetch first 1 row only
Transactional Metrics
Bufferpool Logical Index Reads Per Transaction (BPLITX)
[DOWNLOAD VIEW DDL]
Description: Can help detect the existence of Index Page Leaf Scans. (( Scott Hayes says:
Let’s say you have an index on columns A and B. If you have an SQL query that says WHERE B = ‘some value’ but does not include search criteria for column A, then it is highly likely that DB2 will asynchronously prefetch the leaf pages of the index to scan for occurrences of B = ‘some value’. If there are 1000’s of leaf pages in a 3 or 4 level index, this can be a very CPU, and possibly I/O, expensive operation.
))
Usage example:
select BPLI_PER_TRANSACTION from DBA.V_BPLITX order by snapshot_timestamp desc fetch first 1 row only
BP Logical Reads Per Transaction (BPLRTX)
[DOWNLOAD VIEW DDL]
Description: Bufferpool logical data and index reads per transaction.
How to calculate:
BPLRTX = (Buffer Pool Data Logical Reads + Buffer Pool Index Logical Reads) / (Commit statements attempted + Rollback statements attempted)
The idea is to get BPLRTX as low as possible. (( Scott Hayes suggests:
To get an idea of your ‘best case goal’ for BPLRTX, we need to look at a few data points:
- The average number of SELECTS per Transaction (SELTX)
- The average number of Insert, Update, and Deletes per Transaction (DMLTX)
- The average number of Statements per Transaction (STMTTX) = SELTX + DMLTX
- The average number of Index NLEVELS in the database: Select avg(NLEVELS) from SYSCAT.INDEXES
- Round the average NLEVELS up to a whole number, giving AVGNLEVEL
Finally, your ‘best case goal’ for BPLRTX is computed by the formula:
BPLRTX GOAL = STMTTX x (AVGNLEVEL + 1)
This goal provides the minimum possible, best-case value for BPLRTX by multiplying the average number of statements per transaction by the (rounded-up) average index NLEVELS plus 1, the extra 1 covering a logical read of the data page. By way of example, if an average transaction does 8 SELECT statements and 2 DML statements, and if the average NLEVEL is 3, then the best-case BPLRTX GOAL = (8+2) x (3+1) = 40 bufferpool logical reads per transaction.
Now, let’s be honest with each other. The world isn’t perfect, the application’s SQL is probably far from perfect, and the database’s indexes aren’t perfect either. So a realistic goal would add a 50% slop factor to the ideal goal, giving us a BPLRTX of 60 in the previous example.
))
Usage example:
select BPLR_PER_TRANSACTION from DBA.V_BPLRTX order by snapshot_timestamp desc fetch first 1 row only
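The goal formula above can be checked against the article's own figures (8 SELECTs and 2 DML statements per transaction, rounded-up average NLEVELS of 3):

```shell
# BPLRTX goal worked example using the figures from the text.
SELTX=8; DMLTX=2            # SELECTs and DML statements per transaction
AVGNLEVEL=3                 # rounded-up average index NLEVELS
STMTTX=$((SELTX + DMLTX))   # statements per transaction
goal=$((STMTTX * (AVGNLEVEL + 1)))   # best-case goal
realistic=$((goal + goal / 2))       # add the 50% slop factor
echo "best-case BPLRTX goal = $goal, realistic goal = $realistic"
```

This reproduces the best-case goal of 40 and the realistic goal of 60 quoted above.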
Sorts Per Transaction (SRTTX)
[DOWNLOAD VIEW DDL]
Description: Total sorts divided by number of transactions (units of work), i.e. the number of sorts performed before a commit is taken.
Usage example:
select SORTS_PER_TRANSACTION from DBA.V_SORTS_PER_TRANSACTION order by snapshot_timestamp desc fetch first 1 row only
Data Manipulation Language Statements per Transaction (DMLTX)
[DOWNLOAD VIEW DDL]
Description: Update/delete/insert statements per transaction (no Selects counted). This graph helps us understand patterns in application update behavior.
Usage example:
select DMLTX from DBA.V_DMLTX order by snapshot_timestamp desc fetch first 1 row only
Select/Insert/Update/Delete Statements per Transaction (SIUD_SQL_STMTS_TX)
[DOWNLOAD VIEW DDL]
Description: Select/update/delete/insert statements per transaction (i.e. all SELECT and DML statements).
Usage example:
select SIUD_SQL_STMTS_TX from DBA.V_STMTS_TX order by snapshot_timestamp desc fetch first 1 row only
Select Statements per Transaction (SELECT_STMTS_TX)
[DOWNLOAD VIEW DDL]
Description: Select statements (only) per transaction.
Usage example:
select SELECT_STMTS_TX from DBA.V_SELECT_STMTS_TX order by snapshot_timestamp desc fetch first 1 row only
Transactions Per Minute (TPM)
[DOWNLOAD VIEW DDL]
Description: Transactions per minute. An indication of whether or not this is a high-transaction-volume database.
Usage example:
select UOW_PER_MINUTE from DBA.V_UOW_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
Locking Metrics
Lock Escalations since last snapshot
[DOWNLOAD VIEW DDL]
Description: Number of lock escalations since the last snapshot (last 5 minutes). I was dealing with a database with very rare lock escalations, so I decided not to do a per-minute calculation but simply to show the number of lock escalations since the last snapshot.
Usage example:
select LOCKESCALS from DBA.V_LOCK_ESCAL order by snapshot_timestamp desc fetch first 1 row only
Exclusive (X) Lock Escalations Since Last Snapshot
[DOWNLOAD VIEW DDL]
Description: An exclusive (X) lock escalation can cause locking contention, so seeing this happen on a frequent basis helps us to tune both the application and table locking parameters.
Usage example:
select X_LOCKESCALS from DBA.V_XLOCK_ESCAL order by snapshot_timestamp desc fetch first 1 row only
Lock Timeouts since last snapshot
[DOWNLOAD VIEW DDL]
Description: Number of lock timeouts (SQLCODE -911, reason code 68) since the last snapshot (last 5 minutes). As with lock escalations, I was dealing with a database with very rare lock timeouts, so I decided not to do a per-minute calculation but simply to show any that had occurred at all; any non-zero value was of concern.
Usage example:
select LOCK_TIMEOUTS from DBA.V_LOCK_TIMEOUTS order by snapshot_timestamp desc fetch first 1 row only
Deadlocks since last snapshot
[DOWNLOAD VIEW DDL]
Description: Number of deadlocks (SQLCODE -911, reason code 2) since the last snapshot (last 5 minutes, or your monitoring interval).
Usage example:
select DEADLOCKS from dba.v_deadlocks order by snapshot_timestamp desc fetch first 1 row only
LockWait Time Seconds Per Minute
[DOWNLOAD VIEW DDL]
Description: Total lockwait time for all applications expressed in seconds per minute.
Usage example:
select LOCKWAIT_TIME_PER_MINUTE from DBA.V_LOCKWAIT_TIME_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
Lock Waits Per Minute
[DOWNLOAD VIEW DDL]
Description: Total lockwaits for all applications per minute.
Usage example:
select LOCKWAITS_PER_MINUTE from DBA.V_LOCKWAITS_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
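The per-minute views are presumably derived from the change in a cumulative counter between consecutive snapshots, divided by the interval length. A worked illustration with assumed counter values:

```shell
# Per-minute rate worked example with assumed counter values: the
# cumulative lock waits counter at the previous and current snapshots,
# taken 5 minutes apart.
prev=1000; curr=1600
interval_min=5
rate=$(( (curr - prev) / interval_min ))
echo "lock waits per minute = $rate"    # (1600 - 1000) / 5 = 120
```

The same delta-over-interval pattern applies to all of the per-minute metrics in this article.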
Sort Metrics
Sort Time Seconds Per Minute
[DOWNLOAD VIEW DDL]
Description: Total sort time, all applications, in seconds per minute. A good indicator of when sort-intensive processes are running.
Usage example:
select SORT_TIME_PER_MINUTE from DBA.V_SORT_TIME_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
Sorts Per Minute
[DOWNLOAD VIEW DDL]
Description: Total number of sorts, all applications, per minute.
Usage example:
select SORTS_PER_MINUTE from DBA.V_SORTS_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
Logging Metrics
Log Cleans Ratio Percentage (LCR)
[DOWNLOAD VIEW DDL]
Description: Number of LSN gap cleaner triggers as a percentage of total cleaner triggers. From the IBM DB2 Version 9 InfoCenter:
When the amount of log space encompassing the log record which has updated the oldest page in the buffer pool and the current log position exceeds that allowed by the softmax parameter, it is said that the database is in an ‘LSN gap’ situation. Under the default method of page cleaning, when the logger detects that an LSN gap has occurred, it will trigger the page cleaners to write out all the pages which are contributing to the LSN gap situation. That is, it will write out those pages which are older than what is allowed by the softmax parameter. Page cleaners will be idle for some period of time while no LSN gap is occurring. Then, once an LSN gap occurs, the page cleaners are activated to write a large number of pages before going back to sleep. This can result in the saturation of the I/O subsystem, which then affects other agents which are reading or writing pages.
When LSN gap cleaner triggers dominate (LCR approaches 100%), check for performance issues on the database. If you have configured very large logs, softmax may need to be lowered in steps until the LSN gap cleaner triggers more frequently. If there is a strong correlation with performance problems, it might be necessary to use the DB2 proactive page cleaning feature, the DB2_USE_ALTERNATE_PAGE_CLEANING registry setting.
How to calculate:
LCR = 100 * ( lsn.gap.cleaner.triggers / ( dirty.page.steal.cleaner.triggers + lsn.gap.cleaner.triggers + dirty.page.threshold.cleaner.triggers ) )
Usage example:
select LOG_CLEANS_RATIO from DBA.V_LOG_CLEANS_RATIO order by snapshot_timestamp desc fetch first 1 row only
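The three cleaner-trigger ratios in this article (VCR, TCR and LCR) share the same denominator, so they sum to 100%. A worked example with assumed trigger counts:

```shell
# Cleaner-trigger ratios worked example with assumed counts.
steal=30       # dirty page steal (victim) cleaner triggers
threshold=20   # dirty page threshold cleaner triggers
lsn_gap=50     # LSN gap cleaner triggers
total=$((steal + threshold + lsn_gap))
vcr=$((100 * steal / total))       # Victim Cleans Ratio
tcr=$((100 * threshold / total))   # Threshold Cleans Ratio
lcr=$((100 * lsn_gap / total))     # Log (LSN gap) Cleans Ratio
echo "VCR=${vcr}% TCR=${tcr}% LCR=${lcr}%"
```

Because the ratios partition the same total, a shift in one trigger type always shows up as a drop in the others.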
Log Write IO per Minute
[DOWNLOAD VIEW DDL]
Description: The number of I/O requests issued by the logger for writing log data to the disk, per minute. This is an indicator of logging activity, and can be used in conjunction with other graphs to determine whether logging is responsible for application performance issues. Can also point toward heavy update periods.
Usage example:
select NUM_LOG_WRITE_IO_PER_MINUTE from DBA.V_LOG_WRITE_IO_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
Total Log Used (MB)
[DOWNLOAD VIEW DDL]
Description: Total log space used, in megabytes. This can show us problematic or dangerously large units of work, and can also warn that we are about to exhaust log space. It also allows you to better visualize the growth and contraction of log usage over a period of time.
Usage example:
select TOTAL_LOG_USED_MB from DBA.V_TOTAL_LOG_USED_MB order by snapshot_timestamp desc fetch first 1 row only
Miscellaneous 2-Column Metrics
SQL Execution Time Seconds Per Minute
[DOWNLOAD VIEW DDL]
Description: SQL Execution Time in seconds per minute.
Usage example:
select ELAPSED_EXEC_TIME_S_PER_MINUTE from DBA.V_SQL_EXEC_TIME_PERMINUTE order by snapshot_timestamp desc fetch first 1 row only
Rows Read Per Minute
[DOWNLOAD VIEW DDL]
Description: Rows read from all tables, per minute. Not the same as rows selected: for example, a poorly performing SQL statement might have to read 10,000 rows of a table in order to return only one row in a SELECT statement.
Usage example:
select ROWS_READ_PER_MINUTE from DBA.v_rows_read_perminute order by snapshot_timestamp desc fetch first 1 row only
3+ Column Performance Views
RR/RS Ratio (RRRSR)
[DOWNLOAD VIEW DDL]
Description: Rows Read / Rows Selected Ratio (RRRSR) by agents’ ID. RRRSR indicates how many table rows were read, total, as a ratio to number of rows returned in all SELECT statements executed by that agent during its lifetime. This can be an indicator of poor SQL performance.
Table Describe:
describe table dba.v_rr_rs_ratio

Column name                    Type schema Type name          Length   Scale Nulls
------------------------------ ----------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM      TIMESTAMP          10       0     No
AGENT_ID                       SYSIBM      BIGINT             8        0     No
APPL_RR_RS_RATIO               SYSIBM      DECIMAL            18       2     Yes
RR/RS Ratio (RRRSR) Aggregates
[DOWNLOAD VIEW DDL]
Description: Maximum and Average value found of Rows Read to Rows Selected Ratio (RRRSR) for all agents sampled at this interval. RRRSR indicates how many table rows were read, total, as a ratio to number of rows returned in all SELECT statements executed by that agent during its lifetime. This metric is an aggregate among all agents alive at that snapshot. The maximum is a good indicator of performance spikes.
This view is dependent on view dba.v_rr_rs_ratio (above).
Table Describe:
describe table dba.v_rr_rs_ratio_aggregate

Column name                    Type schema Type name          Length   Scale Nulls
------------------------------ ----------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM      TIMESTAMP          10       0     No
APPL_RR_RS_RATIO_AVG           SYSIBM      DECIMAL            18       2     Yes
APPL_RR_RS_RATIO_MAX           SYSIBM      DECIMAL            18       2     Yes
Statements with High RR/RS Ratio (RRRSR)
[DOWNLOAD VIEW DDL]
Description: All captured statements which match applid/timestamp with at least 80% of the maximum RR_RS ratio for that snapshot interval. The view should show you some of the worst-performing SQL captured at your snapshot interval. Because snapshot microseconds might not match between administrative views SNAPSTMT and SNAPAPPL, the SQL allows a join between the two capture tables with a 2-second variance, which can be adjusted for your own implementation. The 80%-of-max value can also be easily modified for your use.
Table Describe:
describe table dba.v_stmt_high_rr_rs_ratio

Column name                    Type schema Type name          Length   Scale Nulls
------------------------------ ----------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM      TIMESTAMP          10       0     No
RR_RS_RATIO                    SYSIBM      DECIMAL            18       2     Yes
STMT_TEXT                      SYSIBM      CLOB               16777216 0     Yes
Rows Read per minute by Table
[DOWNLOAD VIEW DDL]
Description: By table, rows read per minute per snapshot interval.
Table Describe:
describe table dba.v_table_rows_read_perminute

Column name                    Type schema Type name          Length   Scale Nulls
------------------------------ ----------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM      TIMESTAMP          10       0     No
TABSCHEMA                      SYSIBM      VARCHAR            128      0     No
TABNAME                        SYSIBM      VARCHAR            128      0     No
TABLE_ROWS_READ_PER_MINUTE     SYSIBM      DECIMAL            18       2     Yes
Rows Written per minute by Table
[DOWNLOAD VIEW DDL]
Description: By table, rows written per minute per snapshot interval.
Table Describe:
describe table dba.v_table_rows_written_perminute

Column name                    Type schema Type name          Length   Scale Nulls
------------------------------ ----------- ------------------ -------- ----- ------
SNAPSHOT_TIMESTAMP             SYSIBM      TIMESTAMP          10       0     No
TABSCHEMA                      SYSIBM      VARCHAR            128      0     No
TABNAME                        SYSIBM      VARCHAR            128      0     No
TABLE_ROWS_WRITTEN_PER_MINUTE  SYSIBM      DECIMAL            21       2     Yes

4 record(s) selected.
Prune snapshot historical tables
I recommend you create the performance tables in their own DMS tablespace, so that growth in these tables does not endanger space requirements for production data. In any case, you will need to prune data, probably on a weekly basis. The following script can be used as a template for your own prune process. Remember to also REORG / RUNSTATS tables periodically.
Example crontab entry for database dbname, purging rows more than 90 days old; runs Monday morning just after midnight:
01 00 * * 1 /db2/support/scripts/purge_dba_snapshot_tables.ksh -d dbname -n 90
Download the snapshot history prune script here:
purge_dba_snapshot_tables.ksh
Or view the script inline:
#!/bin/ksh
# purge_dba_snapshot_tables.ksh
# purge DBA schema snapshot capture tables
#=========================================================================
# Parameters:
# -d : Required. The name of the database against which the SQL will
# run.
# -n : Number of days back to prune
#=========================================================================
scriptStatusLog=/db2/support/scripts/purge_dba_snapshot_tables.log
# edit prfile to your own local configuration.
prfile=/home/db2/sqllib/db2profile
db2options="-av"
function Usage {
echo 'Usage: purge_dba_snapshot_tables.ksh -d <dbname> -n <days>'
echo 'All parameters are required. '
echo 'Use this script to purge snapshot table data older than n days.'
}
if [[ "$#" -lt "4" ]]; then
Usage
exit 1
fi
while getopts d:n: OPT
do
case $OPT in
d) DbName=$OPTARG ;;
n) PurgeDays=$OPTARG ;;
*) Usage; echo "$(date) Usage error." >> $scriptStatusLog ; exit 1 ;;
esac
done
if [ -f "$prfile" ]; then
. $prfile
else
echo "$(date) Cannot locate $prfile." >> $scriptStatusLog
Usage
exit 1
fi
echo "$(date) Now executing purge_dba_snapshot_tables.ksh with following options:" >> $scriptStatusLog
echo "DbName = $DbName " >> $scriptStatusLog
echo "Purge Days = $PurgeDays " >> $scriptStatusLog
echo "$(date) Now connecting . . . " >> $scriptStatusLog
db2 connect to $DbName >> $scriptStatusLog 2>&1
echo "$(date) Now executing purge . . . " >> $scriptStatusLog
db2 $db2options "delete from dba.snapdb where snapshot_timestamp < current timestamp - $PurgeDays days" >> $scriptStatusLog 2>&1
db2 $db2options "delete from dba.snapappl where snapshot_timestamp < current timestamp - $PurgeDays days" >> $scriptStatusLog 2>&1
db2 $db2options "delete from dba.snapstmt where snapshot_timestamp < current timestamp - $PurgeDays days" >> $scriptStatusLog 2>&1
echo "$(date) purge_dba_snapshot_tables completed with RC = $? " >> $scriptStatusLog
db2 connect reset > /dev/null 2>&1
db2 terminate > /dev/null 2>&1
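The REORG/RUNSTATS maintenance mentioned above can follow the same template. A minimal sketch that generates the commands for the three capture tables (the dba.* table list mirrors the purge script's assumptions; pipe the output to db2 -v after connecting to actually run it):

```shell
#!/bin/ksh
# Generate weekly maintenance commands for the snapshot capture tables.
# The dba.* table names are the same assumptions as in the purge script;
# adjust the list to match your own capture set.
for t in snapdb snapappl snapstmt; do
  echo "reorg table dba.$t"
  echo "runstats on table dba.$t with distribution and detailed indexes all"
done
```

Keeping the script list-driven means adding a new capture table is a one-line edit.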
References
Some good articles on use of snapshot data for performance analysis:
- IBM DB2 v9 InfoCenter Key References
- All Supported Administrative SQL Routines and Views. Probably the most succinct and complete list of DB2 administrative views.
- Snapshot monitor SQL Administrative Views and Administrative Convenience Views (derived, calculated values)
- Database system monitor data organization. The most useful overview of how the data gathered by snapshots is organized.
- Administrative views versus table functions. When to still use table functions; they still have a role.
- IBM DB2 v9 InfoCenter Cool Administrative Convenience Views
- TOP_DYNAMIC_SQL administrative view. Returns the top dynamic SQL statements, sortable by number of executions, average execution time, number of sorts, or sorts per statement.
- TBSP_UTILIZATION administrative view. Returns table space configuration and utilization information; a quick peek at page/space consumption.
- Helpful Scott Hayes articles from IBMDatabaseMagazine.com
- DB2 LUW Performance: Asked and Answered (BPLRTX). More on BPLRTX, buffer pool logical reads per transaction; also NUMBLOCKPAGES. Suggests a guideline for an ideal BPLRTX.
- DB2 LUW Performance: The Most Important Cost. Buffer pool logical reads per transaction.
- DB2 LUW Performance: Progress Review plus Closing Files. Another list of performance measures, and a good index page from which to search further; more transaction-based references.
- DB2 LUW Performance: Detecting Index Leaf Page Scans. Index leaf page scans, BPLITX. “While perhaps not as costly as data page scans, index leaf page scans can quickly suck your CPUs dry of processing capacity.”
- DB2 LUW Performance: Index Read Efficiency (IREF). IREF = Rows Read / Rows Selected (Fetched).
- DB2 LUW Performance: Table Read I/O and Overflows. “TBRRTX should be computed for each table by dividing the number of Rows Read by the number of database Transactions . . . TBROVP – The percentage of Overflows over Rows Read = Overflows * 100 / Rows Read (+1 if you want to avoid division by zero).”
- DB2 LUW Performance: Catalog Cache. “Catalog Cache Hit Ratio (CATHR) = 100 – ( Catalog cache inserts * 100 / Catalog cache lookups )” A metric that didn’t interest me at my current site, but may be useful to you.
- DB2 LUW Performance: Synchronous Read Percent (SRP). “Synchronous Read Percent – should be 90% or better for OLTP; the lower the number, the more you need to re-examine your tuning decisions. For DW, 50% might be great.”
- DB2 LUW Performance: Write I/O Optimization. Asynchronous write percentage.
- DB2 LUW Performance: Tuning LOGBUFSZ.
- Useful metrics docs from Quest Software
- Understanding Buffer Pool Performance and Tuning in DB2 UDB v8.1 Multi-platform Environments (PDF) By Scott Olson and Joshua Steffan
- Understanding Buffer Pool Performance and Tuning in DB2 UDB LUW 8.2 (PDF), by Jeff Brokaw. The various buffer pool ratio equations can be found from page 28 of the PDF onward: data and index hit ratios, physical I/O read rate, prefetch ratio, log cleans ratio, threshold cleans ratio, and victim cleans ratio.
