I referred to some material from my favorite blog, Tim Hall's ORACLE-BASE.
A. Basics of optimizer statistics
The optimizer cost model relies on statistics collected about the objects involved in a query, and the database and host where the query runs. Statistics are critical to the optimizer’s ability to pick the best execution plan for a SQL statement.
Object | What influences statistics | Views
Table Statistics | Number of rows, number of blocks, average row length | DBA_TABLES, DBA_TAB_STATISTICS
Column Statistics | Number of distinct values, number of nulls, data distribution (histograms), extended statistics | DBA_TAB_COLS, DBA_TAB_COL_STATISTICS, USER_PART_COL_STATISTICS, USER_SUBPART_COL_STATISTICS
Index Statistics | Number of distinct values, number of leaf blocks, clustering factor | DBA_INDEXES, DBA_IND_STATISTICS
Histograms | Data distribution in columns | DBA_TAB_HISTOGRAMS
Extended Stats | Relationship between data stored in different columns of the same table | DBA_TAB_COL_STATISTICS
System Statistics | I/O performance and utilization, CPU performance and utilization | sys.aux_stats$
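As a quick illustration of where these numbers surface, the following queries (a minimal sketch, assuming a table named EMP with at least one index in the current schema) pull table, column and index statistics from the dictionary views listed above:
SQL> SELECT num_rows, blocks, avg_row_len
     FROM   user_tab_statistics
     WHERE  table_name = 'EMP';
SQL> SELECT column_name, num_distinct, num_nulls, histogram
     FROM   user_tab_col_statistics
     WHERE  table_name = 'EMP';
SQL> SELECT index_name, distinct_keys, leaf_blocks, clustering_factor
     FROM   user_ind_statistics
     WHERE  table_name = 'EMP';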
B. Configuring Automatic Optimizer Statistics Collection Using Cloud Control
Go to "Administration" -> "Oracle Scheduler" -> "Automated Maintenance Tasks".
You will see the three default automatic jobs scheduled. Choose "Optimizer Statistics Gathering".
Now click on "Optimizer Statistics Gathering".
You may click on "Global Statistics Gathering Options" to view the configured parameters.
Use the "Configure" button to enable or disable the task or to change its maintenance window.
You can also check the status of automatic optimizer statistics collection from SQL*Plus:
SQL> COL CLIENT_NAME FORMAT a31
SQL> SELECT CLIENT_NAME, STATUS
     FROM   DBA_AUTOTASK_CLIENT
     WHERE  CLIENT_NAME = 'auto optimizer stats collection';
CLIENT_NAME STATUS
——————————- ——–
auto optimizer stats collection ENABLED
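The same task can also be disabled or re-enabled from SQL*Plus with the DBMS_AUTO_TASK_ADMIN package; a minimal sketch (run as a suitably privileged user):
SQL> BEGIN
       DBMS_AUTO_TASK_ADMIN.DISABLE(
         client_name => 'auto optimizer stats collection',
         operation   => NULL,
         window_name => NULL);
     END;
     /
SQL> BEGIN
       DBMS_AUTO_TASK_ADMIN.ENABLE(
         client_name => 'auto optimizer stats collection',
         operation   => NULL,
         window_name => NULL);
     END;
     /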
C. Manual statistics gathering: standard DBMS_STATS preferences
Preference | Description | Default (11gR2) | Scope | Version
CASCADE | Determines if index stats should be gathered for the current table (TRUE, FALSE, AUTO_CASCADE). | DBMS_STATS.AUTO_CASCADE | G, D, S, T | 10gR1+
DEGREE | Degree of parallelism (integer or DEFAULT_DEGREE). | DBMS_STATS.DEFAULT_DEGREE | G, D, S, T | 10gR1+
ESTIMATE_PERCENT | Percentage of rows to sample when gathering stats (0.000001-100 or AUTO_SAMPLE_SIZE). | DBMS_STATS.AUTO_SAMPLE_SIZE | G, D, S, T | 10gR1+
METHOD_OPT | Controls column statistics collection and histogram creation. | FOR ALL COLUMNS SIZE AUTO | G, D, S, T | 10gR1+
NO_INVALIDATE | Determines if dependent cursors should be invalidated as a result of new stats on objects (TRUE, FALSE or AUTO_INVALIDATE). | DBMS_STATS.AUTO_INVALIDATE | G, D, S, T | 10gR1+
AUTOSTATS_TARGET | Determines which objects have stats gathered (ALL, ORACLE, AUTO). | AUTO | G | 10gR2+
GRANULARITY | The granularity of stats to be collected on partitioned objects (ALL, AUTO, DEFAULT, GLOBAL, 'GLOBAL AND PARTITION', PARTITION, SUBPARTITION). | AUTO | G, D, S, T | 10gR2+
PUBLISH | Determines if gathered stats should be published immediately or left in a pending state (TRUE, FALSE). | TRUE | G, D, S, T | 11gR2+
INCREMENTAL | Determines whether incremental stats will be used for global statistics on partitioned objects, rather than generated using table scans (TRUE, FALSE). | FALSE | G, D, S, T | 11gR2+
CONCURRENT | Should statistics be gathered on multiple objects at once, or one at a time (MANUAL, AUTOMATIC, ALL, OFF). | OFF | G | 12cR1+
GLOBAL_TEMP_TABLE_STATS | Should stats on global temporary tables be session-specific or shared between sessions (SHARED, SESSION). | SESSION | G, D, S | 12cR1+
INCREMENTAL_LEVEL | Which level of synopses should be collected for incremental partitioned statistics (TABLE, PARTITION). | PARTITION | G, D, S, T | 12cR1+
INCREMENTAL_STALENESS | How the staleness of partition statistics is determined (USE_STALE_PERCENT, USE_LOCKED_STATS, NULL). | NULL | G, D, S, T | 12cR1+
TABLE_CACHED_BLOCKS | The number of blocks cached in the buffer cache during calculation of index cluster factor. Jonathan Lewis recommends "16" as a sensible value. | 1 | G, D, S, T | 12cR1+
OPTIONS | Used for the OPTIONS parameter of the GATHER_TABLE_STATS procedure (GATHER, GATHER AUTO). | GATHER | G, D, S, T | 12cR1+
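These preferences can be read and changed with DBMS_STATS; a minimal sketch (the SCOTT.EMP table is only an example). The scope column refers to which procedures accept the preference: G = SET_GLOBAL_PREFS, D = SET_DATABASE_PREFS, S = SET_SCHEMA_PREFS, T = SET_TABLE_PREFS.
SQL> SELECT DBMS_STATS.GET_PREFS('ESTIMATE_PERCENT', 'SCOTT', 'EMP') FROM dual;
SQL> EXEC DBMS_STATS.SET_TABLE_PREFS('SCOTT', 'EMP', 'ESTIMATE_PERCENT', '100');
SQL> EXEC DBMS_STATS.SET_SCHEMA_PREFS('SCOTT', 'DEGREE', '4');
SQL> EXEC DBMS_STATS.SET_GLOBAL_PREFS('CONCURRENT', 'MANUAL');
SQL> EXEC DBMS_STATS.DELETE_TABLE_PREFS('SCOTT', 'EMP', 'ESTIMATE_PERCENT');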
D. Some standard statistics gathering commands
Gather statistics:
EXEC DBMS_STATS.gather_database_stats;
EXEC DBMS_STATS.gather_database_stats(estimate_percent => 15);
EXEC DBMS_STATS.gather_database_stats(estimate_percent => 15, cascade => TRUE);
EXEC DBMS_STATS.gather_schema_stats('SCOTT');
EXEC DBMS_STATS.gather_schema_stats('SCOTT', estimate_percent => 15);
EXEC DBMS_STATS.gather_schema_stats('SCOTT', estimate_percent => 15, cascade => TRUE);
EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES');
EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES', estimate_percent => 15);
EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES', estimate_percent => 15, cascade => TRUE);
EXEC DBMS_STATS.gather_table_stats(ownname => 'C##TEST', tabname => 'TT1', method_opt => 'FOR COLUMNS C1');
Delete statistics:
EXEC DBMS_STATS.delete_database_stats;
EXEC DBMS_STATS.delete_schema_stats('SCOTT');
EXEC DBMS_STATS.delete_table_stats('SCOTT', 'EMP');
EXEC DBMS_STATS.delete_column_stats('SCOTT', 'EMP', 'EMPNO');
EXEC DBMS_STATS.delete_index_stats('SCOTT', 'EMP_PK');
E. Determining When Optimizer Statistics Are Stale
A. First create a demo table and load some data.
SQL> set linesize 300;
SQL> create table tab1(c1 number(10),c2 char(100));
Table created.
SQL> insert into tab1 select rownum,’A’ from dual connect by rownum<10000;
9999 rows created.
SQL> commit;
Commit complete.
B. Statistics are not yet populated.
SQL> select NUM_ROWS,BLOCKS,STALE_STATS from dba_tab_statistics where table_name='TAB1';
NUM_ROWS BLOCKS STA
———- ———- —
C. Let me gather statistics now.
SQL> exec dbms_stats.gather_table_stats('TEST','TAB1');
PL/SQL procedure successfully completed.
SQL> select NUM_ROWS,BLOCKS,STALE_STATS from dba_tab_statistics where table_name='TAB1';
NUM_ROWS BLOCKS STA
———- ———- —
9999 244 NO
D. Now let me load more data again.
SQL> insert into tab1 select rownum,’A’ from dual connect by rownum<10000;
9999 rows created.
SQL> commit;
Commit complete.
E. DBA_TAB_STATISTICS still shows STALE_STATS = NO.
SQL> select NUM_ROWS,BLOCKS,STALE_STATS from dba_tab_statistics where table_name='TAB1';
NUM_ROWS BLOCKS STA
———- ———- —
9999 244 NO
F. Now let me flush the database monitoring information from memory to disk.
SQL> BEGIN
       DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
     END;
     /
PL/SQL procedure successfully completed.
G. Now the statistics will show as STALE.
SQL> select NUM_ROWS,BLOCKS,STALE_STATS from dba_tab_statistics where table_name='TAB1';
NUM_ROWS BLOCKS STA
———- ———- —
9999 244 YES
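The staleness flag is driven by the DML monitoring counters, which become visible in DBA_TAB_MODIFICATIONS once the monitoring information has been flushed; roughly speaking, a table is treated as stale when the number of changed rows exceeds the STALE_PERCENT preference (10% by default). A quick sketch to look at both:
SQL> SELECT table_name, inserts, updates, deletes
     FROM   dba_tab_modifications
     WHERE  table_name = 'TAB1';
SQL> SELECT DBMS_STATS.GET_PREFS('STALE_PERCENT', 'TEST', 'TAB1') FROM dual;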
F. Gathering Statistics for Fixed Objects
Fixed objects are dynamic performance tables and their indexes. These objects record current database activity. Unlike for other database tables, the database does not automatically use dynamic statistics for SQL statements referencing X$ tables when optimizer statistics are missing. Instead, the optimizer uses predefined default values. These defaults may not be representative and could potentially lead to a suboptimal execution plan, so it is important to keep fixed object statistics current.
To gather fixed object statistics using GATHER_FIXED_OBJECTS_STATS:
-
Start SQL*Plus, and connect to the database with the appropriate privileges for the procedure that you intend to run.
-
Run the DBMS_STATS.GATHER_FIXED_OBJECTS_STATS procedure, specifying the desired parameters.
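For example, a minimal sketch (fixed object statistics are usually gathered once after a representative workload, connected as SYS or as a user with the ANALYZE ANY DICTIONARY privilege):
SQL> EXEC DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;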
G. Gathering Statistics for Volatile Tables Using Dynamic Statistics
A. Let us assume my table TAB2 is extremely volatile, so we want dynamic statistics to be collected on the fly rather than having the optimizer rely on stored statistics that quickly become unrepresentative.
First let me create the demo table.
SQL> create table tab2(c1 number(10),c2 char(100));
Table created.
Let me gather statistics now.
SQL> exec dbms_stats.gather_table_stats('TEST','TAB2');
PL/SQL procedure successfully completed.
SQL> select NUM_ROWS,BLOCKS,STALE_STATS from dba_tab_statistics where table_name='TAB2';
NUM_ROWS BLOCKS STA
———- ———- —
0 0 NO
Now I need to load data again
SQL> insert into tab2 select rownum,’A’ from dual connect by rownum<10000;
9999 rows created.
SQL> commit;
Commit complete.
Now check that the execution plan does not show the proper cardinality.
SQL> set autotrace traceonly;
SQL> select * from tab2;
9999 rows selected.
Execution Plan
———————————————————-
Plan hash value: 2156729920
————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 115 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| TAB2 | 1 | 115 | 2 (0)| 00:00:01 |
Now we need to delete the statistics and lock them, so that statistics will not be gathered on this table again even when its data changes.
sqlplus test/test
SQL> exec dbms_stats.delete_table_stats('TEST','TAB2');
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.lock_table_stats('TEST','TAB2');
PL/SQL procedure successfully completed.
Now, as there are no statistics, the Oracle optimizer uses dynamic sampling.
Note the following about dynamic statistics (a short sketch of controlling the sampling level follows this list):
-
If dynamic statistics are not disabled, then the database may choose to use dynamic statistics when a SQL statement uses parallel execution.
-
If the OPTIMIZER_ADAPTIVE_FEATURES initialization parameter is true, then the optimizer uses dynamic statistics when relevant SQL plan directives exist. The database maintains the resulting statistics in the server result cache, making them available to other queries.
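The sampling level can also be influenced directly, either for the session or per statement with the DYNAMIC_SAMPLING hint; a small sketch (level 2 is the default, higher levels sample more blocks):
SQL> ALTER SESSION SET optimizer_dynamic_sampling = 4;
SQL> SELECT /*+ DYNAMIC_SAMPLING(tab2 4) */ * FROM tab2;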
SQL> set autotrace traceonly;
SQL> select * from tab2;
9999 rows selected.
Execution Plan
———————————————————-
Plan hash value: 2156729920
————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 8435 | 947K| 68 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| TAB2 | 8435 | 947K| 68 (0)| 00:00:01 |
————————————————————————–
Note
—–
– dynamic statistics used: dynamic sampling (level=2)
H. Enabling Concurrent Statistics Gathering
Oracle Database 11g Release 2 (11.2.0.2) introduced a new statistics gathering mode, 'concurrent statistics gathering'. The goal of this mode is to enable a user to gather statistics on multiple tables in a schema (or database), and on multiple (sub)partitions within a table, concurrently. Gathering statistics on multiple tables and (sub)partitions concurrently can reduce the overall time it takes to gather statistics by allowing Oracle to fully utilize a multi-processor environment.
To enable concurrent statistics gathering, use the DBMS_STATS.SET_GLOBAL_PREFS procedure to set the CONCURRENT preference. Possible values are as follows:
-
MANUAL Concurrency is enabled only for manual statistics gathering.
-
AUTOMATIC Concurrency is enabled only for automatic statistics gathering.
-
ALL Concurrency is enabled for both manual and automatic statistics gathering.
-
OFF Concurrency is disabled for both manual and automatic statistics gathering. This is the default value.
This section shows how to enable concurrent statistics gathering.
Create a partitioned table.
SQL> create table t1 (id number, sometext varchar2(50), my_date date) tablespace data partition by hash (id) partitions 16;
Please note that concurrency is OFF.
SQL> SELECT DBMS_STATS.GET_PREFS('CONCURRENT') FROM DUAL;
DBMS_STATS.GET_PREFS('CONCURRENT')
——————————————————————————–
OFF
SQL> ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'DEFAULT_PLAN';
System altered.
Check that you have sufficient job queue processes.
SQL> show parameter job;
NAME TYPE VALUE
———————————— ———– ——————————
job_queue_processes integer 1000
Now change concurrency to 'ALL'.
SQL> BEGIN
       DBMS_STATS.SET_GLOBAL_PREFS('CONCURRENT','ALL');
     END;
     /
PL/SQL procedure successfully completed.
SQL> SELECT DBMS_STATS.GET_PREFS('CONCURRENT') FROM DUAL;
DBMS_STATS.GET_PREFS('CONCURRENT')
——————————————————————————–
ALL
Create a procedure to insert data in parallel (30 sessions will insert data concurrently).
SQL> create or replace procedure manyinserts as
v_m number;
begin
for i in 1..1000000 loop
select round(dbms_random.value() * 44444444444) + 1 into v_m from dual;
insert /*+ new2 */ into t1 values (v_m, 'DOES THIS'||dbms_random.value(), sysdate);
end loop;
commit;
end;
/
Procedure created.
SQL> create or replace procedure manysessions as
v_jobno number:=0;
begin
FOR i in 1..30 LOOP
dbms_job.submit(v_jobno,'manyinserts;', sysdate);
END LOOP;
commit;
end;
/
Procedure created.
SQL> exec manysessions;
PL/SQL procedure successfully completed.
Now gather schema statistics in another session.
SQL> EXEC DBMS_STATS.GATHER_SCHEMA_STATS('TEST');
PL/SQL procedure successfully completed.
Please check the status now.
SQL> SET LINESIZE 1000
SQL> COLUMN TARGET FORMAT a8
SQL> COLUMN TARGET_TYPE FORMAT a25
SQL> COLUMN JOB_NAME FORMAT a14
SQL> COLUMN START_TIME FORMAT a40
SQL> SELECT TARGET, TARGET_TYPE, JOB_NAME,
            TO_CHAR(START_TIME, 'dd-mon-yyyy hh24:mi:ss')
     FROM   DBA_OPTSTAT_OPERATION_TASKS
     WHERE  STATUS = 'IN PROGRESS'
     AND    OPID = (SELECT MAX(ID)
                    FROM   DBA_OPTSTAT_OPERATIONS
                    WHERE  OPERATION = 'gather_schema_stats');
TARGET TARGET_TYPE JOB_NAME TO_CHAR(START_TIME,'DD-MON-YY
——– ————————- ————– —————————–
TEST.T1 TABLE 20-feb-2018 02:22:41
TEST.T1. TABLE PARTITION 20-feb-2018 02:22:41
SYS_P786
I. Incremental Statistics
In a typical data warehousing environment, huge partitioned tables are very common, and gathering statistics on such tables is a challenging task. For partitioned tables there are two types of statistics: global statistics and partition-level statistics. Gathering global statistics is a very expensive and resource-consuming operation because it scans the whole table, so people often reduce estimate_percent down to less than 1 percent. This does help reduce the time taken to gather stats, but such a small sample may not be sufficient to represent the data distribution. Gathering partition-level statistics is not as expensive, as it reads only the partitions where data has changed.
Traditionally, statistics are gathered in two phases:
-
Scan complete table to gather Global statistics
-
Scan only the partitions where data has been changed
Some global stats can obviously be derived from partition-level stats; for example, the number of rows at the table level is just the sum of the number of rows of all the partitions. But global statistics like NDV (number of distinct values), which are very important in calculating cardinality, cannot be derived that way; the only way to compute them exactly is to scan the whole table. Incremental statistics address this by maintaining a synopsis (a compact summary of the column data) for each partition, so global statistics can be aggregated from the partition-level synopses instead of rescanning the table.
These synopses are stored in the WRI$_OPTSTAT_SYNOPSIS$ and WRI$_OPTSTAT_SYNOPSIS_HEAD$ tables in the SYSAUX tablespace. WRI$_OPTSTAT_SYNOPSIS$ can grow enormously, as an individual synopsis entry is created for each hash, proportional to the number of distinct values at table, partition and column level. WRI$_OPTSTAT_SYNOPSIS_HEAD$ has one record for every table, partition and column. In the 11.1 release, gathering incremental statistics could take a long time for wide tables with many partitions, due to a delete statement working on WRI$_OPTSTAT_SYNOPSIS$; in 11.2 this issue was resolved by range-hash partitioning the WRI$_OPTSTAT_SYNOPSIS$ table.
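Before relying on incremental statistics, a few table preferences need to be in place; a minimal sketch for the T_RANGE demo table created below (the documented requirements are INCREMENTAL = TRUE, PUBLISH = TRUE and ESTIMATE_PERCENT = AUTO_SAMPLE_SIZE):
SQL> EXEC DBMS_STATS.SET_TABLE_PREFS('SYS', 'T_RANGE', 'INCREMENTAL', 'TRUE');
SQL> EXEC DBMS_STATS.SET_TABLE_PREFS('SYS', 'T_RANGE', 'PUBLISH', 'TRUE');
SQL> EXEC DBMS_STATS.SET_TABLE_PREFS('SYS', 'T_RANGE', 'ESTIMATE_PERCENT', 'DBMS_STATS.AUTO_SAMPLE_SIZE');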
Create an interval range-partitioned table.
CREATE TABLE t_range
(ID NUMBER,
CREATE_DATE DATE,
NAME CHAR(100))
PARTITION BY RANGE(CREATE_DATE)
INTERVAL(NUMTODSINTERVAL(1,'DAY')) STORE IN (tbs1,tbs2,tbs3,tbs4)
(PARTITION part_old VALUES LESS THAN (TO_DATE('04-FEB-2018', 'DD-MON-YYYY')) TABLESPACE tbs1
);
Insert data in the table
SQL> Declare
Begin
For i in 1..10 loop
For j in 1..1000 loop
Insert into t_range values(j,sysdate+i,'DEBASIS'||to_char(i));
End loop;
End loop;
end;
/
PL/SQL procedure successfully completed.
Let me gather the statistics now:
SQL> exec dbms_stats.gather_table_stats('SYS','T_RANGE');
Please check the last analyzed date.
SELECT partition_name,
to_char( last_analyzed, 'DD-MON-YYYY, HH24:MI:SS' ) last_analyze,
num_rows
FROM DBA_TAB_PARTITIONS
WHERE table_name = ‘T_RANGE’
ORDER BY partition_position;
SQL> /
PARTITION_NAME LAST_ANALYZE NUM_ROWS
—————————— —————————— ———-
PART_OLD 21-FEB-2018, 22:38:05 0
SYS_P463 21-FEB-2018, 22:38:05 1000
SYS_P464 21-FEB-2018, 22:38:05 1000
SYS_P465 21-FEB-2018, 22:38:05 1000
SYS_P466 21-FEB-2018, 22:38:05 1000
SYS_P467 21-FEB-2018, 22:38:05 1000
SYS_P468 21-FEB-2018, 22:38:05 1000
SYS_P469 21-FEB-2018, 22:38:05 1000
SYS_P470 21-FEB-2018, 22:38:05 1000
SYS_P471 21-FEB-2018, 22:38:05 1000
SYS_P472 21-FEB-2018, 22:38:05 1000
Please check the initial settings of the statistics-related preferences.
SQL> SELECT dbms_stats.get_prefs('INCREMENTAL','SYS','T_RANGE') "INCREMENTAL" FROM dual;
INCREMENTAL
——————————————————————————–
FALSE
SQL> SELECT dbms_stats.get_prefs('PUBLISH','SYS','T_RANGE') "PUBLISH" FROM dual;
PUBLISH
——————————————————————————–
TRUE
SQL> SELECT dbms_stats.get_prefs('ESTIMATE_PERCENT','SYS','T_RANGE') "ESTIMATE_PERCENT" FROM dual;
ESTIMATE_PERCENT
——————————————————————————–
DBMS_STATS.AUTO_SAMPLE_SIZE
SQL> SELECT dbms_stats.get_prefs('GRANULARITY','SYS','T_RANGE') "GRANULARITY" FROM dual;
GRANULARITY
——————————————————————————–
AUTO
Now let me set INCREMENTAL to TRUE for this table.
SQL> exec dbms_stats.set_table_prefs('SYS','T_RANGE','INCREMENTAL','TRUE');
Please check the synopsis tables now.
SELECT o.name “Table Name”,
p.subname “Part”,
c.name “Column”,
h.analyzetime “Synopsis Creation Time”
FROM WRI$_OPTSTAT_SYNOPSIS_HEAD$ h,
OBJ$ o,
USER$ u,
COL$ c,
( ( SELECT TABPART$.bo# BO#,
TABPART$.obj# OBJ#
FROM TABPART$ tabpart$ )
UNION ALL
( SELECT TABCOMPART$.bo# BO#,
TABCOMPART$.obj# OBJ#
FROM TABCOMPART$ tabcompart$ ) ) tp,
OBJ$ p
WHERE u.name = 'SYS' AND
o.name = 'T_RANGE' AND
tp.obj# = p.obj# AND
h.bo# = tp.bo# AND
h.group# = tp.obj# * 2 AND
h.bo# = c.obj#(+) AND
h.intcol# = c.intcol#(+) AND
o.owner# = u.user# AND
h.bo# = o.obj#
ORDER BY 4,1,2,3
/
Table Name Part Column Synopsis
——————– ——————– ——————– ———
T_RANGE PART_OLD CREATE_DATE 21-FEB-18
T_RANGE PART_OLD ID 21-FEB-18
T_RANGE PART_OLD NAME 21-FEB-18
T_RANGE SYS_P463 CREATE_DATE 21-FEB-18
T_RANGE SYS_P463 ID 21-FEB-18
T_RANGE SYS_P463 NAME 21-FEB-18
T_RANGE SYS_P464 CREATE_DATE 21-FEB-18
T_RANGE SYS_P464 ID 21-FEB-18
T_RANGE SYS_P464 NAME 21-FEB-18
T_RANGE SYS_P465 CREATE_DATE 21-FEB-18
T_RANGE SYS_P465 ID 21-FEB-18
Table Name Part Column Synopsis
——————– ——————– ——————– ———
T_RANGE SYS_P465 NAME 21-FEB-18
T_RANGE SYS_P466 CREATE_DATE 21-FEB-18
T_RANGE SYS_P466 ID 21-FEB-18
T_RANGE SYS_P466 NAME 21-FEB-18
T_RANGE SYS_P467 CREATE_DATE 21-FEB-18
T_RANGE SYS_P467 ID 21-FEB-18
T_RANGE SYS_P467 NAME 21-FEB-18
T_RANGE SYS_P468 CREATE_DATE 21-FEB-18
T_RANGE SYS_P468 ID 21-FEB-18
T_RANGE SYS_P468 NAME 21-FEB-18
T_RANGE SYS_P469 CREATE_DATE 21-FEB-18
Table Name Part Column Synopsis
——————– ——————– ——————– ———
T_RANGE SYS_P469 ID 21-FEB-18
T_RANGE SYS_P469 NAME 21-FEB-18
T_RANGE SYS_P470 CREATE_DATE 21-FEB-18
T_RANGE SYS_P470 ID 21-FEB-18
T_RANGE SYS_P470 NAME 21-FEB-18
T_RANGE SYS_P471 CREATE_DATE 21-FEB-18
T_RANGE SYS_P471 ID 21-FEB-18
T_RANGE SYS_P471 NAME 21-FEB-18
T_RANGE SYS_P472 CREATE_DATE 21-FEB-18
T_RANGE SYS_P472 ID 21-FEB-18
T_RANGE SYS_P472 NAME 21-FEB-18
Table Name Part Column Synopsis
——————– ——————– ——————– ———
T_RANGE SYS_P473 CREATE_DATE 21-FEB-18
T_RANGE SYS_P473 ID 21-FEB-18
T_RANGE SYS_P473 NAME 21-FEB-18
Now I will load data into only one partition.
SQL> Declare
Begin
For j in 1..1000 loop
Insert into t_range values(j,sysdate+1,'DEBASIS'||to_char(j));
End loop;
end;
/
PL/SQL procedure successfully completed.
SQL> commit;
Commit complete.
Let me gather stats again.
SQL> exec dbms_stats.gather_table_stats('SYS','T_RANGE');
PL/SQL procedure successfully completed.
Now we will notice that only the partition where we loaded data has been re-analyzed, avoiding a costly full scan of all partitions.
SQL> SELECT partition_name,
to_char( last_analyzed, 'DD-MON-YYYY, HH24:MI:SS' ) last_analyze,
num_rows
FROM DBA_TAB_PARTITIONS
WHERE table_name = 'T_RANGE'
ORDER BY partition_position;
PARTITION_NAME LAST_ANALYZE NUM_ROWS
—————————— —————————— ———-
PART_OLD 21-FEB-2018, 22:46:58 0
SYS_P473 21-FEB-2018, 22:46:58 1000
SYS_P463 21-FEB-2018, 22:50:15 2000
SYS_P464 21-FEB-2018, 22:46:58 1000
SYS_P465 21-FEB-2018, 22:46:58 1000
SYS_P466 21-FEB-2018, 22:46:58 1000
SYS_P467 21-FEB-2018, 22:46:58 1000
SYS_P468 21-FEB-2018, 22:46:58 1000
SYS_P469 21-FEB-2018, 22:46:58 1000
SYS_P470 21-FEB-2018, 22:46:58 1000
SYS_P471 21-FEB-2018, 22:46:58 1000
SYS_P472 21-FEB-2018, 22:46:58 1000
We can see that the INCREMENTAL preference is still TRUE.
SQL> SELECT dbms_stats.get_prefs('INCREMENTAL','SYS','T_RANGE') "INCREMENTAL" FROM dual;
INCREMENTAL
——————————————————————————–
TRUE
Now let me turn INCREMENTAL statistics back to FALSE.
SQL> exec dbms_stats.set_table_prefs('SYS','T_RANGE','INCREMENTAL','FALSE');
PL/SQL procedure successfully completed.
SQL> SELECT dbms_stats.get_prefs('INCREMENTAL','SYS','T_RANGE') "INCREMENTAL" FROM dual;
INCREMENTAL
——————————————————————————–
FALSE
SQL> SELECT o.name "Table Name",
            p.subname "Part",
            c.name "Column",
            h.analyzetime "Synopsis Creation Time"
     FROM   WRI$_OPTSTAT_SYNOPSIS_HEAD$ h,
            OBJ$ o,
            USER$ u,
            COL$ c,
            ( ( SELECT TABPART$.bo# BO#,
                       TABPART$.obj# OBJ#
                FROM   TABPART$ tabpart$ )
              UNION ALL
              ( SELECT TABCOMPART$.bo# BO#,
                       TABCOMPART$.obj# OBJ#
                FROM   TABCOMPART$ tabcompart$ ) ) tp,
            OBJ$ p
     WHERE  u.name = 'SYS' AND
            o.name = 'T_RANGE' AND
            tp.obj# = p.obj# AND
            h.bo# = tp.bo# AND
            h.group# = tp.obj# * 2 AND
            h.bo# = c.obj#(+) AND
            h.intcol# = c.intcol#(+) AND
            o.owner# = u.user# AND
            h.bo# = o.obj#
     ORDER BY 4,1,2,3
     /
Table Name Part Column Synopsis
——————– ——————– ——————– ———
T_RANGE PART_OLD CREATE_DATE 21-FEB-18
T_RANGE PART_OLD ID 21-FEB-18
T_RANGE PART_OLD NAME 21-FEB-18
T_RANGE SYS_P464 CREATE_DATE 21-FEB-18
T_RANGE SYS_P464 ID 21-FEB-18
T_RANGE SYS_P464 NAME 21-FEB-18
T_RANGE SYS_P465 CREATE_DATE 21-FEB-18
T_RANGE SYS_P465 ID 21-FEB-18
T_RANGE SYS_P465 NAME 21-FEB-18
T_RANGE SYS_P466 CREATE_DATE 21-FEB-18
T_RANGE SYS_P466 ID 21-FEB-18
Table Name Part Column Synopsis
——————– ——————– ——————– ———
T_RANGE SYS_P466 NAME 21-FEB-18
T_RANGE SYS_P467 CREATE_DATE 21-FEB-18
T_RANGE SYS_P467 ID 21-FEB-18
T_RANGE SYS_P467 NAME 21-FEB-18
T_RANGE SYS_P468 CREATE_DATE 21-FEB-18
T_RANGE SYS_P468 ID 21-FEB-18
T_RANGE SYS_P468 NAME 21-FEB-18
T_RANGE SYS_P469 CREATE_DATE 21-FEB-18
T_RANGE SYS_P469 ID 21-FEB-18
T_RANGE SYS_P469 NAME 21-FEB-18
T_RANGE SYS_P470 CREATE_DATE 21-FEB-18
Table Name Part Column Synopsis
——————– ——————– ——————– ———
T_RANGE SYS_P470 ID 21-FEB-18
T_RANGE SYS_P470 NAME 21-FEB-18
T_RANGE SYS_P471 CREATE_DATE 21-FEB-18
T_RANGE SYS_P471 ID 21-FEB-18
T_RANGE SYS_P471 NAME 21-FEB-18
T_RANGE SYS_P472 CREATE_DATE 21-FEB-18
T_RANGE SYS_P472 ID 21-FEB-18
T_RANGE SYS_P472 NAME 21-FEB-18
T_RANGE SYS_P473 CREATE_DATE 21-FEB-18
T_RANGE SYS_P473 ID 21-FEB-18
T_RANGE SYS_P473 NAME 21-FEB-18
Table Name Part Column Synopsis
——————– ——————– ——————– ———
T_RANGE SYS_P463 CREATE_DATE 21-FEB-18
T_RANGE SYS_P463 ID 21-FEB-18
T_RANGE SYS_P463 NAME 21-FEB-18
36 rows selected.
SQL> commit;
Commit complete.
Let me gather statistics again.
SQL> exec dbms_stats.gather_table_stats('SYS','T_RANGE');
PL/SQL procedure successfully completed.
Now note that statistics were refreshed for all partitions of the table again, which requires a costly full scan of every partition.
SQL> SELECT partition_name,
to_char( last_analyzed, 'DD-MON-YYYY, HH24:MI:SS' ) last_analyze,
num_rows
FROM DBA_TAB_PARTITIONS
WHERE table_name = 'T_RANGE'
ORDER BY partition_position;
PARTITION_NAME LAST_ANALYZE NUM_ROWS
—————————— —————————— ———-
PART_OLD 21-FEB-2018, 23:06:47 0
SYS_P473 21-FEB-2018, 23:06:47 1000
SYS_P463 21-FEB-2018, 23:06:47 3000
SYS_P464 21-FEB-2018, 23:06:47 1000
SYS_P465 21-FEB-2018, 23:06:47 1000
SYS_P466 21-FEB-2018, 23:06:47 1000
SYS_P467 21-FEB-2018, 23:06:47 1000
SYS_P468 21-FEB-2018, 23:06:47 1000
SYS_P469 21-FEB-2018, 23:06:47 1000
SYS_P470 21-FEB-2018, 23:06:47 1000
SYS_P471 21-FEB-2018, 23:06:47 1000
SYS_P472 21-FEB-2018, 23:06:47 1000
12 rows selected.
J. SQL Plan Directives
SQL Plan Directives are one of the functionalities that make up Adaptive Query Optimization in Oracle Database 12c. SQL plan directives are like "extra notes" for the optimizer, to remind it that it previously selected a suboptimal plan, typically because of incorrect cardinality estimates. Incorrect cardinality estimates are often caused by issues like missing statistics, stale statistics, complex predicates or complex operators. Unlike SQL profiles, which are statement specific, SQL plan directives are linked to query expressions, so they can be used by several statements containing matching query expressions. Situations like missing histograms or missing extended statistics may result in SQL plan directives being generated.
The database manages SQL plan directives internally. Situations like automatic reoptimization may result in SQL plan directives being written to the SGA and later persisted to the SYSAUX tablespace, at which point they can be displayed using the DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS views. Alternatively, existing SQL plan directives can be persisted manually using the DBMS_SPD package.
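Existing directives can also be managed manually with the DBMS_SPD package; a small sketch (the directive id is only an example value of the kind found in DBA_SQL_PLAN_DIRECTIVES):
SQL> EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;
SQL> EXEC DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE(796134444163473829);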
Let me create the test setup first.
sqlplus test/test
SQL> DROP TABLE tab1 PURGE;
Table dropped.
SQL> CREATE TABLE tab1 (
id NUMBER,
gender VARCHAR2(1),
has_y_chromosome VARCHAR2(1),
CONSTRAINT tab1_pk PRIMARY KEY (id),
CONSTRAINT tab1_gender_chk CHECK (gender IN ('M', 'F')),
CONSTRAINT tab1_has_y_chromosome_chk CHECK (has_y_chromosome IN ('Y', 'N'))
);
Table created.
SQL> INSERT /*+ APPEND */ INTO tab1
     SELECT level, 'M', 'Y'
     FROM   dual
     CONNECT BY level <= 10;
10 rows created.
SQL> COMMIT;
Commit complete.
SQL> INSERT /*+ APPEND */ INTO tab1
     SELECT 10+level, 'F', 'N'
     FROM   dual
     CONNECT BY level <= 90;
90 rows created.
SQL> COMMIT;
Commit complete.
SQL> CREATE INDEX tab1_gender_idx ON tab1(gender);
CREATE INDEX tab1_has_y_chromosome_idx ON tab1(has_y_chromosome);
Index created.
Index created.
SQL> EXEC DBMS_STATS.gather_table_stats(USER, 'TAB1');
PL/SQL procedure successfully completed.
Now check that no histograms have been collected.
SQL> COLUMN column_name FORMAT A20
SELECT column_id,
column_name,
histogram
FROM user_tab_columns
WHERE table_name = 'TAB1'
ORDER BY column_id;
COLUMN_ID COLUMN_NAME HISTOGRAM
———- ——————– —————
1 ID NONE
2 GENDER NONE
3 HAS_Y_CHROMOSOME NONE
Note the cardinality miscalculation in the execution plan below (compare E-Rows with A-Rows). A cardinality mismatch often leads to a poor execution plan and hence poor performance.
SQL> SELECT /*+ GATHER_PLAN_STATISTICS */
            *
     FROM   tab1
     WHERE  gender = 'M'
     AND    has_y_chromosome = 'Y';
SQL> SET LINESIZE 200 PAGESIZE 100
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last'));
ID G H
———- – –
1 M Y
2 M Y
3 M Y
4 M Y
5 M Y
6 M Y
7 M Y
8 M Y
9 M Y
10 M Y
10 rows selected.
PLAN_TABLE_OUTPUT
—————————————————————————————————————————————–
SQL_ID 5t8y8p5mpb99j, child number 1
————————————-
SELECT /*+ GATHER_PLAN_STATISTICS */ * FROM tab1 WHERE gender
= ‘M’ AND has_y_chromosome = ‘Y’
Plan hash value: 1552452781
—————————————————————————————————————–
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
—————————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 4 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TAB1 | 1 | 25 | 10 |00:00:00.01 | 4 |
|* 2 | INDEX RANGE SCAN | TAB1_GENDER_IDX | 1 | 50 | 10 |00:00:00.01 | 2 |
—————————————————————————————————————–
Predicate Information (identified by operation id):
—————————————————
1 – filter("HAS_Y_CHROMOSOME"='Y')
2 – access("GENDER"='M')
21 rows selected.
We can also observe that Oracle has marked this SQL statement as re-optimizable.
SQL> COLUMN sql_text FORMAT A40
SQL> COLUMN is_reoptimizable FORMAT A16
SQL> SELECT sql_text, is_reoptimizable
     FROM   v$sql
     WHERE  sql_id = '5t8y8p5mpb99j';
SQL_TEXT IS_REOPTIMIZABLE
—————————————- —————-
SELECT /*+ GATHER_PLAN_STATISTICS */ Y
* FROM tab1 WHERE gender = ‘M’ AN
D has_y_chromosome = ‘Y’
SELECT /*+ GATHER_PLAN_STATISTICS */ Y
* FROM tab1 WHERE gender = ‘M’ AN
D has_y_chromosome = ‘Y’
At this point, the optimizer has written SQL plan directives into the SGA, but they are not visible to us as they have not been persisted to the SYSAUX tablespace.
SQL> SET LINESIZE 200
SQL> COLUMN dir_id FORMAT A20
SQL> COLUMN owner FORMAT A10
SQL> COLUMN object_name FORMAT A10
SQL> COLUMN col_name FORMAT A10
SQL> SELECT TO_CHAR(d.directive_id) dir_id, o.owner, o.object_name,
            o.subobject_name col_name, o.object_type, d.type, d.state, d.reason
     FROM   dba_sql_plan_directives d, dba_sql_plan_dir_objects o
     WHERE  d.directive_id = o.directive_id
     AND    o.owner = 'TEST'
     ORDER BY 1,2,3,4,5;
no rows selected
We can wait for the database to persist the SQL plan directives, or manually persist them using the DBMS_SPD package.
SQL> EXEC DBMS_SPD.flush_sql_plan_directive;
PL/SQL procedure successfully completed.
If we repeat the previous query, the SQL plan directives will be visible.
SQL> SET LINESIZE 200
COLUMN dir_id FORMAT A20
COLUMN owner FORMAT A10
COLUMN object_name FORMAT A10
COLUMN col_name FORMAT A10
SQL> SELECT TO_CHAR(d.directive_id) dir_id, o.owner, o.object_name,
o.subobject_name col_name, o.object_type, d.type, d.state, d.reason
FROM dba_sql_plan_directives d, dba_sql_plan_dir_objects o
WHERE d.directive_id=o.directive_id
AND o.owner = 'TEST'
ORDER BY 1,2,3,4,5;
DIR_ID OWNER OBJECT_NAM COL_NAME OBJECT TYPE STATE REASON
——————– ———- ———- ———- —— —————- ———- ————————————
796134444163473829 TEST TAB1 GENDER COLUMN DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE
796134444163473829 TEST TAB1 TABLE DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE
Now let me rerun the query; this time the execution plan shows the proper cardinality with the help of the SQL plan directive.
SQL> SELECT /*+ GATHER_PLAN_STATISTICS */
            *
     FROM   tab1
     WHERE  gender = 'M'
     AND    has_y_chromosome = 'Y';
SQL> SET LINESIZE 200 PAGESIZE 100
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last'));
ID G H
———- – –
1 M Y
2 M Y
3 M Y
4 M Y
5 M Y
6 M Y
7 M Y
8 M Y
9 M Y
10 M Y
10 rows selected.
PLAN_TABLE_OUTPUT
—————————————————————————————————————————————–
SQL_ID 5t8y8p5mpb99j, child number 2
————————————-
SELECT /*+ GATHER_PLAN_STATISTICS */ * FROM tab1 WHERE gender
= ‘M’ AND has_y_chromosome = ‘Y’
Plan hash value: 1552452781
—————————————————————————————————————–
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
—————————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 4 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TAB1 | 1 | 10 | 10 |00:00:00.01 | 4 |
|* 2 | INDEX RANGE SCAN | TAB1_GENDER_IDX | 1 | 10 | 10 |00:00:00.01 | 2 |
—————————————————————————————————————–
Predicate Information (identified by operation id):
—————————————————
1 – filter(“HAS_Y_CHROMOSOME”=’Y’)
2 – access(“GENDER”=’M’)
Note
—–
– dynamic statistics used: dynamic sampling (level=2)
– statistics feedback used for this statement
– 1 Sql Plan Directive used for this statement
27 rows selected.
K. Extended Statistics
In real-world data, there is often a relationship or correlation between the data stored in different columns of the same table. For example, consider a customers table where the values in a cust_state_province column are influenced by the values in a country_id column, because the state of California is only going to be found in the United States. If the Oracle Optimizer is not aware of these real-world relationships, it could potentially miscalculate the cardinality estimate if multiple columns from the same table are used in the where clause of a statement. With extended statistics you have an opportunity to tell the optimizer about these real-world relationships between the columns.
By creating extended statistics on a group of columns, the optimizer can determine a more accurate cardinality estimate when the columns are used together in the where clause of a SQL statement. You can use DBMS_STATS.CREATE_EXTENDED_STATS to define the column group you want to have statistics gathered on as a whole. Once the group has been established, Oracle will automatically maintain the statistics on that column group whenever statistics are gathered on the table.
If we continue with the initial example of the customers table, when the value of cust_state_province is 'CA' we know the value of country_id will be 52790, the USA. There is also a skew in the data in these two columns: because the company used in the SH sample schema is based in San Francisco, the majority of rows in the table have the values 'CA' and 52790. Both the relationship between the columns and the skew in the data can make it difficult for the optimizer to calculate the cardinality of these columns correctly when they are used together in a query.
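As an alternative to calling CREATE_EXTENDED_STATS explicitly (as done below), the column group can also be declared directly in METHOD_OPT when gathering statistics, which creates the extension and collects its statistics in one step; a sketch:
SQL> EXEC DBMS_STATS.gather_table_stats('SH', 'CUSTOMERS', method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');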
Let us connect to the SH schema:
SQL> select count(1) from customers where cust_state_province='CA' and country_id=52790;
COUNT(1)
———-
3341
It is evident from the execution plan that the cardinality estimate is incorrect (128 estimated versus 3341 actual rows).
SQL> select * from table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
——————————————————————————–
SQL_ID f3rfbwhagqcss, child number 0
————————————-
select count(1) from customers where cust_state_province=’CA’ and
country_id=52790
Plan hash value: 296924608
——————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————–
| 0 | SELECT STATEMENT | | | | 423 (100)| |
PLAN_TABLE_OUTPUT
——————————————————————————–
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| CUSTOMERS | 128 | 2048 | 423 (1)| 00:00:01 |
——————————————————————————–
Predicate Information (identified by operation id):
—————————————————
2 – filter((“CUST_STATE_PROVINCE”=’CA’ AND “COUNTRY_ID”=52790))
20 rows selected.
Prior to Oracle Database 11g the only option open to us would be to make the optimizer aware of the data skew in both the country_id column (most rows have 52790 as the value) and the cust_state_province column (most rows have ‘CA’ as the value). We can do this by gathering histograms on the skewed columns.
SQL> exec dbms_stats.gather_table_stats(user,'CUSTOMERS',method_opt=>'for all columns size skewonly');
PL/SQL procedure successfully completed.
Let me create extended statistics on the two columns used in the where clause.
SQL> select dbms_stats.create_extended_stats(null,'customers','(country_id,cust_state_province)') from dual;
SQL> col COLUMN_NAME format a30;
SQL> col HISTOGRAM format a30;
SQL> select column_name,histogram from user_tab_col_statistics where table_name='CUSTOMERS';
COLUMN_NAME HISTOGRAM
—————————— ——————————
SYS_STUJGVLRVH5USVDU$XNV4_IR#4 FREQUENCY
CUST_STATE_PROVINCE_ID FREQUENCY
COUNTRY_ID FREQUENCY
CUST_MAIN_PHONE_NUMBER HYBRID
CUST_INCOME_LEVEL FREQUENCY
CUST_CREDIT_LIMIT FREQUENCY
CUST_EMAIL HYBRID
CUST_TOTAL FREQUENCY
CUST_TOTAL_ID FREQUENCY
CUST_SRC_ID NONE
CUST_EFF_FROM FREQUENCY
CUST_EFF_TO NONE
CUST_VALID FREQUENCY
CUST_ID HYBRID
CUST_FIRST_NAME HYBRID
CUST_LAST_NAME HYBRID
CUST_GENDER FREQUENCY
CUST_YEAR_OF_BIRTH FREQUENCY
CUST_MARITAL_STATUS FREQUENCY
CUST_STREET_ADDRESS HYBRID
CUST_POSTAL_CODE HYBRID
CUST_CITY HYBRID
CUST_CITY_ID HYBRID
CUST_STATE_PROVINCE FREQUENCY
SQL> select count(1) from customers where cust_state_province='CA' and country_id=52790;
COUNT(1)
———-
3341
We can now see that the query uses dynamic sampling and the adaptive optimizer features (a SQL plan directive).
SQL> select * from table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
—————————————————————————————————————————————–
SQL_ID f3rfbwhagqcss, child number 0
————————————-
select count(1) from customers where cust_state_province=’CA’ and
country_id=52790
Plan hash value: 296924608
——————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————–
| 0 | SELECT STATEMENT | | | | 423 (100)| |
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| CUSTOMERS | 3424 | 54784 | 423 (1)| 00:00:01 |
——————————————————————————–
Predicate Information (identified by operation id):
—————————————————
2 – filter((“CUST_STATE_PROVINCE”=’CA’ AND “COUNTRY_ID”=52790))
Note
—–
PLAN_TABLE_OUTPUT
—————————————————————————————————————————————–
– dynamic statistics used: dynamic sampling (level=2)
– 1 Sql Plan Directive used for this statement
25 rows selected.
Optionally, we can see a report of column usage.
SQL> set long 100000
SQL> set lines 120
SQL> set pages 0
SQL> select dbms_stats.report_col_usage(user,'CUSTOMERS') from dual;
LEGEND:
…….
EQ : Used in single table EQuality predicate
RANGE : Used in single table RANGE predicate
LIKE : Used in single table LIKE predicate
NULL : Used in single table is (not) NULL predicate
EQ_JOIN : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER : Used in single table FILTER predicate
JOIN : Used in JOIN predicate
GROUP_BY : Used in GROUP BY expression
…………………………………………………………………….
###############################################################################
COLUMN USAGE REPORT FOR SH.CUSTOMERS
………………………………
- COUNTRY_ID : EQ EQ_JOIN
- CUST_CITY_ID : EQ_JOIN
- CUST_ID : EQ_JOIN
- CUST_STATE_PROVINCE : EQ
- CUST_STATE_PROVINCE_ID : EQ_JOIN
- CUST_TOTAL_ID : EQ_JOIN
- SYS_STUJGVLRVH5USVDU$XNV4_IR#4 : EQ
- (CUST_STATE_PROVINCE, COUNTRY_ID) : FILTER
###############################################################################
To see the effect of the extended statistics on their own, let me also turn off optimizer dynamic sampling and the optimizer adaptive features.
SQL> alter system set optimizer_dynamic_sampling=0;
SQL> alter system set optimizer_adaptive_features=false;
SQL> select count(1) from customers where cust_state_province='CA' and country_id=52790;
COUNT(1)
———-
3341
SQL> select * from table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
——————————————————————————–
SQL_ID f3rfbwhagqcss, child number 2
————————————-
select count(1) from customers where cust_state_province=’CA’ and
country_id=52790
Plan hash value: 296924608
——————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————–
| 0 | SELECT STATEMENT | | | | 424 (100)| |
PLAN_TABLE_OUTPUT
——————————————————————————–
| 1 | SORT AGGREGATE | | 1 | 35 | | |
|* 2 | TABLE ACCESS FULL| CUSTOMERS | 13 | 455 | 424 (1)| 00:00:01 |
——————————————————————————–
Predicate Information (identified by operation id):
—————————————————
2 – filter((“CUST_STATE_PROVINCE”=’CA’ AND “COUNTRY_ID”=52790))
Let me drop the extended statistics.
SQL> exec dbms_stats.drop_extended_stats(null,'customers','(country_id,cust_state_province)');
PL/SQL procedure successfully completed.
I will recreate the extended statistics again.
SQL> select dbms_stats.create_extended_stats(null,'customers','(country_id,cust_state_province)') from dual;
DBMS_STATS.CREATE_EXTENDED_STATS(NULL,'CUSTOMERS','(COUNTRY_ID,CUST_STATE_PROVINCE)')
—————————————————————————————————————————————–
SYS_STUJGVLRVH5USVDU$XNV4_IR#4
Let me gather statistics again.
SQL> exec dbms_stats.gather_table_stats(user,'CUSTOMERS');
PL/SQL procedure successfully completed.
It will still show a bad cardinality estimate.
SQL> select count(1) from customers where cust_state_province='CA' and country_id=52790;
COUNT(1)
———-
3341
SQL> select * from table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
—————————————————————————————————————————————–
SQL_ID f3rfbwhagqcss, child number 2
————————————-
select count(1) from customers where cust_state_province=’CA’ and
country_id=52790
Plan hash value: 296924608
——————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————–
| 0 | SELECT STATEMENT | | | | 423 (100)| |
PLAN_TABLE_OUTPUT
————————————————————————————————————————————————————————————————————————————————————————————————————
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| CUSTOMERS | 1115 | 17840 | 423 (1)| 00:00:01 |
——————————————————————————–
Predicate Information (identified by operation id):
—————————————————
2 – filter((“CUST_STATE_PROVINCE”=’CA’ AND “COUNTRY_ID”=52790))
20 rows selected.
Now let me gather statistics again with the skewonly histogram option. It will then show the correct estimate.
SQL> exec dbms_stats.gather_table_stats(user,'CUSTOMERS',method_opt=>'for all columns size skewonly');
PL/SQL procedure successfully completed.
SQL> select count(1) from customers where cust_state_province='CA' and country_id=52790;
COUNT(1)
———-
3341
SQL> select * from table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
—————————————————————————————————————————————–
SQL_ID 74w1jdh78t4w8, child number 0
————————————-
select count(1) from customers where cust_state_province=’CA’ and
country_id=52790
Plan hash value: 296924608
——————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————–
| 0 | SELECT STATEMENT | | | | 423 (100)| |
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| CUSTOMERS | 3341 | 53456 | 423 (1)| 00:00:01 |
——————————————————————————–
Predicate Information (identified by operation id):
—————————————————
2 – filter((“CUST_STATE_PROVINCE”=’CA’ AND “COUNTRY_ID”=52790))
L. How to restore old statistics from history
Let me check what old statistics were collected during the previous test case.
SQL> COL TABLE_NAME FORMAT a10
SQL> SELECT TABLE_NAME,
            TO_CHAR(STATS_UPDATE_TIME,'YYYY-MM-DD:HH24:MI:SS') AS STATS_MOD_TIME
     FROM   DBA_TAB_STATS_HISTORY
     WHERE  TABLE_NAME='CUSTOMERS'
     AND    OWNER='SH'
     ORDER BY STATS_UPDATE_TIME DESC;
TABLE_NAME STATS_MOD_TIME
———- ——————-
CUSTOMERS 2018-02-22:03:30:57
CUSTOMERS 2018-02-22:03:24:07
CUSTOMERS 2018-02-22:03:20:27
CUSTOMERS 2018-02-22:03:00:37
CUSTOMERS 2018-02-22:02:16:48
CUSTOMERS 2018-02-22:02:01:26
CUSTOMERS 2018-02-22:01:42:21
CUSTOMERS 2018-02-07:12:15:38
8 rows selected.
Let me restore the older statistics now.
SQL> BEGIN
       DBMS_STATS.RESTORE_TABLE_STATS('SH','CUSTOMERS',
         TO_TIMESTAMP('2018-02-22:03:24:07','YYYY-MM-DD:HH24:MI:SS'));
     END;
     /
PL/SQL procedure successfully completed.
Now if we check the execution plan, it has gone back to the previous plan and cardinality estimate.
SQL> select count(1) from customers where cust_state_province='CA' and country_id=52790;
COUNT(1)
———-
3341
SQL> select * from table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
—————————————————————————————————————————————–
SQL_ID f3rfbwhagqcss, child number 2
————————————-
select count(1) from customers where cust_state_province=’CA’ and
country_id=52790
Plan hash value: 296924608
——————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————–
| 0 | SELECT STATEMENT | | | | 423 (100)| |
PLAN_TABLE_OUTPUT
————————————————————————————————————————————————————————————————————————————————————————————————————
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| CUSTOMERS | 1115 | 17840 | 423 (1)| 00:00:01 |
——————————————————————————–
Predicate Information (identified by operation id):
—————————————————
2 – filter((“CUST_STATE_PROVINCE”=’CA’ AND “COUNTRY_ID”=52790))
20 rows selected.
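How far back statistics can be restored is controlled by the statistics history retention, which defaults to 31 days; a quick sketch to check and change it:
SQL> SELECT DBMS_STATS.GET_STATS_HISTORY_RETENTION FROM dual;
SQL> SELECT DBMS_STATS.GET_STATS_HISTORY_AVAILABILITY FROM dual;
SQL> EXEC DBMS_STATS.ALTER_STATS_HISTORY_RETENTION(60);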
M. Online Statistics Gathering
In Oracle Database 12c, online statistics gathering "piggybacks" statistics gathering onto direct-path data loading operations such as create table as select (CTAS) and insert as select (IAS) into an empty table. Gathering statistics as part of the data loading operation means no additional full data scan is required to have statistics available immediately after the data is loaded.
Online statistics gathering does not gather histograms or index statistics, as these types of statistics require additional data scans, which could have a large impact on the performance of the data load. To gather the necessary histogram and index statistics without re-gathering the base column statistics, use the DBMS_STATS.GATHER_TABLE_STATS procedure with the options parameter set to GATHER AUTO. Note that, for performance reasons, GATHER AUTO builds histograms using a sample of rows rather than all rows in the table.
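A minimal sketch of online statistics gathering during a CTAS load (SALES_SRC is just a placeholder source table; in 12c the NOTES column of USER_TAB_STATISTICS should show STATS_ON_LOAD for statistics gathered this way):
SQL> CREATE TABLE sales_copy AS SELECT * FROM sales_src;
SQL> SELECT num_rows, notes
     FROM   user_tab_statistics
     WHERE  table_name = 'SALES_COPY';
SQL> EXEC DBMS_STATS.gather_table_stats(USER, 'SALES_COPY', options => 'GATHER AUTO');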