Monday, March 15, 2010

How do I complete a massive DELETE operation in my Oracle database without having to pay the heavy performance overhead?

Perform the massive DELETE operation as a direct-path (direct-load) INSERT (an INSERT with the APPEND hint) of the rows you want to keep into a table whose LOGGING parameter is set to NOLOGGING, then swap the tables. This completes significantly faster than a DELETE and with minimal logging, but you have to take a backup afterwards to establish a new baseline, because the unlogged changes cannot be recovered from the redo stream.
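A minimal sketch of the idea, before we get into the measurements: instead of deleting the unwanted rows, direct-path insert the rows you want to keep into a NOLOGGING copy and swap the tables. The names BigTable, BigTable_Keep, and createDate below are hypothetical, and in a real system you would also recreate indexes, constraints, triggers, and grants on the new table before the swap:

```sql
-- 1. Create an empty NOLOGGING copy of the table.
create table BigTable_Keep nologging
as select * from BigTable where 1 = 0;

-- 2. Direct-path insert only the rows we want to KEEP
--    (i.e., everything the massive DELETE would have left behind).
insert /*+ APPEND */ into BigTable_Keep
select * from BigTable
where  createDate >= date '2010-01-01';

commit;

-- 3. Swap the tables. Recreate indexes, constraints, and grants
--    on BigTable_Keep first if the application depends on them.
drop table BigTable;
rename BigTable_Keep to BigTable;
```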

Direct-path INSERT is a special database operation. Like SQL*Loader, it writes data directly to the database files, bypassing the buffer cache. It does this with minimal logging, recording only data dictionary changes. The rationale behind this method is that because the data files are already up to date when an instance failure occurs, redo for the data blocks isn't necessary.


The two distinct cases in which direct-path INSERT matters are:
1. The database is in NOARCHIVELOG mode. Media recovery is not possible, so redo data for the inserted rows is not needed.
2. The database is in ARCHIVELOG mode. By default, it logs redo blocks for media recovery. However, if you explicitly set the table to NOLOGGING mode, the database won't log redo blocks for the direct-path insert.

Therefore, a direct-path INSERT performs only minimal redo logging (just enough to protect the data dictionary) when the database is in NOARCHIVELOG mode, or when it is in ARCHIVELOG mode and the table is in NOLOGGING mode.
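For an existing table, the logging attribute can be toggled around the bulk operation. This is a sketch only, with myTable as a placeholder name; remember that anything loaded while NOLOGGING is in effect is unrecoverable until the next backup:

```sql
-- Disable redo generation for subsequent direct-path operations.
alter table myTable nologging;

-- ... perform the direct-path INSERT here ...

-- Restore normal logging once the bulk load is done.
alter table myTable logging;
```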

Proof:
A practical test to show that a direct-path INSERT is faster than conventional DML:

Step 1: Create a table named TestData with a primary key, one non-unique index, and a check constraint, and load it with roughly 2.4 million rows. We will use this table in all of the examples and performance tests. It uses the all_objects view as its data source.



create table TestData
(
   objectId    integer      not null,
   objectName  varchar2(30) not null,
   objectType  varchar2(30) not null,
   runId       integer      not null,
   createDate  date,
   status      varchar2(7),
   description varchar2(255)
)
nologging;

Table created.

Step 2: Check the database version, then populate the table.

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

Next, populate the table using the following block. Since the all_objects view (Oracle 10.2.0.1.0) holds 49K+ rows, the insert is repeated 49 times, giving roughly 2.4 million rows in TestData:

declare
   vCount pls_integer := 49;
begin
   for nIndex in 1..vCount
   loop
      insert /*+ APPEND */
      into TestData
      (
         objectId,
         objectName,
         objectType,
         runId,
         createDate,
         status,
         description
      )
      select object_id,
             object_name,
             object_type,
             nIndex,
             created,
             status,
             object_name
      from all_objects;

      commit;
   end loop;
end;
/

PL/SQL procedure successfully completed.

Step 3:
Check the row count:

select count(*) from TestData;

  COUNT(*)
----------
   2442209

Add the primary key on the objectId and runId columns, one non-unique index on the objectName and objectType columns, and one check constraint on the runId:

alter table TestData add constraint TestData_PK
primary key(objectId,runId);

create index TestData_NameType on TestData
(
objectName,objectType
);

alter table TestData add constraint TestData_CK
check (runId > 0);

Table altered…

Step 4: Gather the statistics using the following block:

begin
   dbms_stats.gather_table_stats
   (
      ownname    => user,
      tabname    => 'TestData',
      method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO',
      degree     => 4,
      cascade    => true
   );
end;
/

PL/SQL procedure successfully completed.

Step 5:
Now, create two tables, TestData_Logging and TestData_Nologging, with structures identical to TestData but with no indexes or constraints:

create table TestData_Logging as select * from TestData
where 1 = 0;

create table TestData_Nologging nologging as select * from TestData where 1 = 0;

TestData_Logging is in LOGGING mode, while TestData_Nologging is in NOLOGGING mode:

select table_name,logging from user_tables
where table_name like 'TESTDATA\_%' escape '\';

TABLE_NAME LOGGING
------------------------------ -------
TESTDATA_LOGGING YES
TESTDATA_NOLOGGING NO

Take a snapshot of the redo size before doing a regular INSERT:

select a.name, b.value from v$statname a,v$mystat b
where a.statistic# = b.statistic#
and a.name = 'redo size';

NAME VALUE
----------------- -----------
redo size 175103192

Perform the INSERT into the TestData_Logging table (Scenario 1):

set timing on
insert into TestData_Logging
select * from TestData;
2442209 rows created.

Elapsed:00:00:57.00

Take a snapshot of redo and undo after the INSERT:

select a.name,
b.value from v$statname a,
v$mystat b
where a.statistic# = b.statistic#
and a.name = 'redo size';

NAME VALUE
----------------- -----------
redo size 514476836

select used_ublk
from v$transaction;

USED_UBLK
----------
1032

Populating the TestData_Logging table with 2,442,209 rows in LOGGING mode took 57 seconds and forced the database to generate roughly 324 MB (514,476,836 - 175,103,192 = 339,373,644 bytes) of redo and 1032 undo blocks.

Now, repeat the same steps for the table in NOLOGGING mode (Scenario 2):

Obtain performance results before load:
select a.name,b.value
from v$statname a,v$mystat b
where a.statistic# = b.statistic# and a.name = 'redo size';

NAME VALUE
----------------- -----------

redo size 514477004

insert /*+ APPEND */
into TestData_Nologging
select * from TestData;

2442209 rows created.

Elapsed:00:00:41.87

Obtain performance results after load:

select a.name,b.value
from v$statname a,v$mystat b
where a.statistic# = b.statistic# and a.name = 'redo size';

NAME VALUE
----------------- -----------
redo size 514971616

select used_ublk from v$transaction;

 USED_UBLK
----------
         1

Populating the TestData_Nologging table with the same 2,442,209 rows in NOLOGGING mode took 41.87 seconds and generated only about 0.5 MB (514,971,616 - 514,477,004 = 494,612 bytes) of redo and 1 undo block.




The table below summarizes the performance numbers for both scenarios.

Scenario                                         Elapsed Time (sec.)   Redo Size (MB)   Undo Blocks
Scenario 1: Regular INSERT, LOGGING mode         57                    ~324             1032
Scenario 2: Direct-path INSERT, NOLOGGING mode   41.87                 ~0.5             1

Hence, a direct-path insert is the faster way to load data, and it can also be used to perform a massive delete indirectly, by inserting the rows you want to keep and swapping the tables.
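As noted at the start, a NOLOGGING load leaves the affected datafiles unrecoverable from redo, so finish by taking a fresh backup. A sketch of the follow-up, where USERS is a placeholder for the tablespace holding the table:

```sql
-- Identify datafiles that now contain unrecoverable changes:
select file#, unrecoverable_change#, unrecoverable_time
from   v$datafile
where  unrecoverable_change# > 0;

-- Then, from RMAN, back up the affected tablespace to set a new baseline:
-- RMAN> backup tablespace users;
```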

Best regards,

Rafi.
