Optimizer statistics include the following:
Table statistics
Number of rows
Number of blocks
Average row length
Column statistics
Number of distinct values (NDV) in a column
Number of nulls in a column
Data distribution (histogram)
Extended statistics
Index statistics
Number of leaf blocks
Number of levels
Index clustering factor
System statistics
I/O performance and utilization
CPU performance and utilization
Table statistics contain metadata that the optimizer uses when developing an execution plan.
In Oracle Database, table statistics include information about rows and blocks.
The optimizer uses these statistics to determine the cost of table scans and table joins. The database tracks all relevant statistics about permanent tables. For example, table statistics stored in DBA_TAB_STATISTICS
track the following:
Number of rows
The database uses the row count stored in
DBA_TAB_STATISTICS
when determining cardinality.- Average row length
Number of data blocks
The optimizer uses the number of data blocks with the
DB_FILE_MULTIBLOCK_READ_COUNT
initialization parameter to determine the base table access cost.Number of empty data blocks
DBMS_STATS.GATHER_TABLE_STATS
commits before gathering statistics on permanent tables.
EMPTY_BLOCKS, LAST_ANALYZED
FROM DBA_TAB_STATISTICS
WHERE OWNER='SH'
AND TABLE_NAME='CUSTOMERS';
---------- ----------- ---------- ------------ ---------
55500 189 1517 0 25-MAY-17
Column Statistics
Column statistics track information about column values and data distribution.
The optimizer uses column statistics to generate accurate cardinality estimates and make better decisions about index usage, join orders, join methods, and so on. For example, statistics in DBA_TAB_COL_STATISTICS
track the following:
Number of distinct values
Number of nulls
High and low values
Histogram-related information
The optimizer can use extended statistics, which are a special type of column statistics. These statistics are useful for informing the optimizer of logical relationships among columns.
Index Statistics
The index statistics include information about the number of index levels, the number of index blocks, and the relationship between the index and the data blocks. The optimizer uses these statistics to determine the cost of index scans.
The DBA_IND_STATISTICS
view tracks index statistics.
Statistics include the following:
Levels
The
BLEVEL
column shows the number of blocks required to go from the root block to a leaf block. A B-tree index has two types of blocks: branch blocks for searching and leaf blocks that store values. See Oracle Database Concepts for a conceptual overview of B-tree indexes.Distinct keys
This columns tracks the number of distinct indexed values. If a unique constraint is defined, and if no
NOT NULL
constraint is defined, then this value equals the number of non-null values.Average number of leaf blocks for each distinct indexed key
Average number of data blocks pointed to by each distinct indexed key
The clustering factor is a property of a specific index, not a table. If multiple indexes exist on a table, then the clustering factor for one index might be small while the factor for another index is large. An attempt to reorganize the table to improve the clustering factor for one index may degrade the clustering factor of the other index.
A histogram is a special type of column statistic that provides more detailed information about the data distribution in a table column. A histogram sorts values into "buckets," as you might sort coins into buckets.
Based on the NDV and the distribution of the data, the database chooses the type of histogram to create. (In some cases, when creating a histogram, the database samples an internally predetermined number of rows.) The types of histograms are as follows:
Frequency histograms and top frequency histograms
Height-Balanced histograms (legacy)
Hybrid histograms
For columns that contain data skew (a nonuniform distribution of data within the column), a histogram enables the optimizer to generate accurate cardinality estimates for filter and join predicates that involve these columns.
DBMS_XPLAN
package supplies five table functions:-
DISPLAY
- to format and display the contents of a plan table. -
DISPLAY_AWR
- to format and display the contents of the execution plan of a stored SQL statement in the AWR. -
DISPLAY_CURSOR
- to format and display the contents of the execution plan of any loaded cursor. -
DISPLAY_SQL_PLAN_BASELINE
- to display one or more execution plans for the SQL statement identified by SQL handle -
DISPLAY_SQLSET
- to format and display the contents of the execution plan of statements stored in a SQL tuning set.
Format Constants | |
ALIAS | If relevant, shows the "Query Block Name / Object Alias" section |
ALLSTATS | A shortcut for 'IOSTATS MEMSTATS' |
BYTES | If relevant, shows the number of bytes estimated by the optimizer |
COST | If relevant, shows optimizer cost information |
IOSTATS | Assuming that basic plan statistics are collected when SQL statements are executed (either by using the gather_plan_statistics hint or by setting the parameter statistics_level to ALL), this format will show IO statistics for ALL (or only for the LAST as shown below) executions of the cursor |
LAST | By default, plan statistics are shown for all executions of the cursor. The keyword LAST can be specified to see only the statistics for the last execution |
MEMSTATS | Assuming that PGA memory management is enabled (that is, pga_aggregate_target parameter is set to a non 0 value), this format allows to display memory management statistics (for example, execution mode of the operator, how much memory was used, number of bytes spilled to disk, and so on). These statistics only apply to memory intensive operations like hash-joins, sort or some bitmap operators |
NOTE | If relevant, shows the note section of the explain plan |
PARALLEL | If relevant, shows PX information (distribution method and table queue information) |
PARTITION | If relevant, shows partition pruning information |
PREDICATE | If relevant, shows the predicate section |
PROJECTION | If relevant, shows the projection section |
REMOTE | If relevant, shows the information for distributed query (for example, remote from serial distribution and remote SQL) |
ROWS | If relevant, shows the number of rows estimated by the optimizer |
RUNSTATS_LAST | Same as IOSTATS LAST: displays the runtime stat for the last execution of the cursor |
RUNSTATS_TOT | Same as IOSTATS: displays IO statistics for all executions of the specified cursor |
dbms_xplan.display function has four basic levels
BASIC TYPICAL (default) SERIAL ALL
SELECT * FROM table(dbms_xplan.display);
SELECT * FROM table(dbms_xplan.display(null,null,'ALL'));
SELECT * FROM table(dbms_xplan.display(null,null,'BASIC +COST'));
SELECT * FROM table(dbms_xplan.display(null,null,'TYPICAL -BYTES -ROWS'));
Operation | Explanation | To Learn More |
---|---|---|
Access paths
|
As for simple statements, the optimizer must choose an access path to retrieve data from each table in the join statement. For example, the optimizer might choose between a full table scan or an index scan..
|
|
Join methods
|
To join each pair of row sources, Oracle Database must decide how to do it. The "how" is the join method. The possible join methods are nested loop, sort merge, and hash joins. A Cartesian join requires one of the preceding join methods. Each join method has specific situations in which it is more suitable than the others.
|
|
Join types
|
The join condition determines the join type. For example, an inner join retrieves only rows that match the join condition. An outer join retrieves rows that do not match the join condition.
|
"Join Types" |
Join order
|
To execute a statement that joins more than two tables, Oracle Database joins two tables and then joins the resulting row source to the next table. This process continues until all tables are joined into the result. For example, the database joins two tables, and then joins the result to a third table, and then joins this result to a fourth table, and so on.
|
N/A |
---Statistics Gathering Procedures in the DBMS_STATS Package
Default Table and Index Values When Statistics are Missing
Statistic | Default Value Used by Optimizer |
---|---|
Tables | num_of_blocks * (block_size - cache_layer) / avg_row_len 100 bytes |
Indexes | 800 (8 * number of blocks) |
Verifying Table Statistics
SELECT TABLE_NAME, NUM_ROWS, BLOCKS, AVG_ROW_LEN,
TO_CHAR(LAST_ANALYZED, 'MM/DD/YYYY HH24:MI:SS')
FROM DBA_TABLES
WHERE TABLE_NAME IN ('SO_LINES_ALL','SO_HEADERS_ALL','SO_LAST_ALL');
This returns the following typical data:
TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED
------------------------ -------- ------- ----------- -------------
SO_HEADERS_ALL 1632264 207014 449 07/29/1999
00:59:51
SO_LINES_ALL 10493845 1922196 663 07/29/1999
01:16:09
SO_LAST_ALL
SELECT INDEX_NAME "NAME", NUM_ROWS, DISTINCT_KEYS "DISTINCT",
LEAF_BLOCKS, CLUSTERING_FACTOR "CF", BLEVEL "LEVEL",
AVG_LEAF_BLOCKS_PER_KEY "ALFBPKEY"
FROM DBA_INDEXES
WHERE owner = 'SH'
ORDER BY INDEX_NAME;
NAME NUM_ROWS DISTINCT LEAF_BLOCKS CF LEVEL ALFBPKEY
-------------------------- -------- -------- ----------- ------- ------- ----------
CUSTOMERS_PK 50000 50000 454 4405 2 1
PRODUCTS_PK 10000 10000 90 1552 1 1
PRODUCTS_PROD_CAT_IX 10000 4 99 4422 1 24
PRODUCTS_PROD_SUBCAT_IX 10000 37 170 6148 2 4
SALES_PROD_BIX 6287 909 1480 6287 1 1
SALES_PROMO_BIX 4727 459 570 4727 1 1
--Verifying Column StatisticsSELECT COLUMN_NAME, NUM_DISTINCT, NUM_NULLS, NUM_BUCKETS, DENSITY FROM DBA_TAB_COL_STATISTICS WHERE TABLE_NAME ="PA_EXPENDITURE_ITEMS_ALL" ORDER BY COLUMN_NAME;
This returns the following data:
COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITY ------------------------------ ------------ ---------- ----------- ---------- BURDEN_COST 4300 71957 1 .000232558 BURDEN_COST_RATE 675 7376401 1 .001481481 CONVERTED_FLAG 1 16793903 1 1 COST_BURDEN_DISTRIBUTED_FLAG 2 15796 1 .5 COST_DISTRIBUTED_FLAG 2 0 1 .5 COST_IND_COMPILED_SET_ID 87 6153143 1 .011494253 EXPENDITURE_ID 1171831 0 1 8.5337E-07 TASK_ID 8648 0 1 .000115634 TRANSFERRED_FROM_EXP_ITEM_ID 1233787 15568891 1 8.1051E-07
---Verifying Histogram Statistics
SQL> SELECT ENDPOINT_NUMBER, ENDPOINT_VALUE
FROM DBA_HISTOGRAMS
WHERE TABLE_NAME ="SO_LINES_ALL" AND COLUMN_NAME="S2"
ORDER BY ENDPOINT_NUMBER;
This query returns the following typical data:
ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- ---------------
1365 4
1370 5
2124 8
2228 18
GATHER_INDEX_STATS
procedure.The current statistics information is available from the data dictionary views for the specific objects (DBA, ALL and USER views). Some of these view were added in later releases.
- DBA_TABLES
- DBA_TAB_STATISTICS
- DBA_TAB_PARTITIONS
- DBA_TAB_SUB_PARTITIONS
- DBA_TAB_COLUMNS
- DBA_TAB_COL_STATISTICS
- DBA_PART_COL_STATISTICS
- DBA_SUBPART_COL_STATISTICS
- DBA_INDEXES
- DBA_IND_STATISTICS
- DBA_IND_PARTITIONS
- DBA_IND_SUBPARTIONS
Histogram information is available from the following views.
- DBA_TAB_HISTOGRAMS
- DBA_PART_HISTOGRAMS
- DBA_SUBPART_HISTOGRAMS
Table, column and index statistics can be deleted using the relevant delete procedures.
EXEC DBMS_STATS.delete_database_stats;
EXEC DBMS_STATS.delete_schema_stats('SCOTT');
EXEC DBMS_STATS.delete_table_stats('SCOTT', 'EMP');
EXEC DBMS_STATS.delete_column_stats('SCOTT', 'EMP', 'EMPNO');
EXEC DBMS_STATS.delete_index_stats('SCOTT', 'EMP_PK');
EXEC DBMS_STATS.delete_dictionary_stats;
--Transfering Stats
--Creating Histograms
You generate histograms by using the DBMS_STATS
package. You can generate histograms for columns of a table or partition. For example, to create a 10-bucket histogram on the SAL
column of the emp
table, issue the following statement:
EXECUTE DBMS_STATS.GATHER_TABLE_STATS('scott','emp', METHOD_OPT => 'FOR COLUMNS SIZE 10 sal');
Gather statistics during the day. Gathering ends after 720 minutes and is stored in the "mystats" table:
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS( gathering_mode => 'interval',interval => 720,stattab => 'mystats',statid => 'OLTP');
END;
/
Gather statistics during the night. Gathering ends after 720 minutes and is stored in the "mystats" table:
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'interval',interval => 720,stattab => 'mystats',statid => 'OLAP');
END;
/
Where estimates vary greatly from the actual number of rows returned, the cursor is marked IS_REOPTIMIZIBLE
and will not be used again. The IS_REOPTIMIZIBLE attribute indicates that this SQL statement should be hard
parsed on the next execution so the optimizer can use the execution statistics recorded on the initial execution to
determine a better execution plan
select sql_id,child_number,sql_text,is_reoptimizable from v$sql where sql_id=''
select sql_id,child_number,sql_text,is_reoptimizable from v$sql where sql_text like '% %'
The new plan is not marked IS_REOPTIMIZIBLE, so it will be used for all subsequent executions of this SQL statement.