分类: Oracle
2008-05-08 21:53:23
Archiving historical database data is an important, complex and in most cases forgotten task of database projects. Historical data has to be extracted and deleted from the database as a part of the archiving procedure. Extracted data is kept for several years as archived data, while the database retains only the actual, active set of data. If we do not archive, the data volume will increase, and so will the project costs. Just buying a new disk is not a permanent solution for handling expanding archive data. An Oracle DBA must understand the application model before stepping into implementing an archive procedure. A simple archive plan has the following basic steps:
In this article, we will examine the problem of deleting historical data and present some practical tips on how to execute and check the deleting performance.
Highwater Mark (HWM)
For each object, Oracle maintains a record of the highest relative block used to hold data.
HWM is the last data block formatted to receive data, not necessarily carrying data. The table highwater mark (HWM) is the number stored in the table segment header. Adding new data to the table moves HWM to the next block, or to the last used block. Oracle optimizer relies on the HWM during calculation for the full table scan, and on full table scans, Oracle optimizer will always scan all database blocks
until the HWM pointer. This highwater mark is maintained in multiples of five blocks. Oracle has provided a way to find HWM for the table using the DBMS_SPACE package, which is not dependent on the analyzed status of the table. Deleting data from the table leaves the HWM unchanged. The next table insert will use the first free block on the free block list, which is beyond the highwater mark causing the table to continue growing.
Delete Operation
Delete command example:
SQL > delete * from artist.tb1 where INSERT_DATE > TO_CHAR('12-05.2001');
SQL > commit;
The delete operation will scan the table until it reaches the HWM (Highwater Mark) position and afterwards will remove records which match the desired where condition. The deletion operation will not change the HWM point and the next full table scan will scan again until the HWM, regardless of whether data exists in the table or not. All the previous extent remains allocated for the table. When the matching column has an index on it, optimizer will most likely scan only the index for the matching records.
All deleted entries from the index tree will be marked as deleted but still retain allocated space in the index tree.
Oracle's internal mechanism will generate the rollback content to provide backup information for deleted table data in case the operation is cancelled. On commit, deletions are marked as permanent. The database changes will be logged in the redo log file and transferred in the rollback segments.
For making performance measurements I choose Oracle database 9.0.1 Enterprise Edition on the Sun Solaris operating system. Let's assume in database schema ARTIST we have one big table ARTIST_TEST with 19856 records. From this table we are going to delete 6768 records using following syntax:
delete from artist_test where SECOND_COLL='JAVA CLASS';
The table has only one index that is never dropped or rebuilt. Before running every test we will make regular compute-analysis of the table.
SQL code to rebuild table and prepare system for each measurement:
SQL> create table artist_test ( 2 no number, first_col number, second_coll varchar2(1000), rest varchar2(1000), 3 constraint artist_pk primary key (no)) tablespace artist 4 / Table created. SQL> insert into artist_test select rownum,OBJECT_ID,OBJECT_TYPE,rpad('H',100) from all_objects; 19856 rows created. SQL> commit; Commit complete. SQL> analyze table artist_test compute statistics; Table analyzed. SQL> alter system flush shared_pool; System altered.
The Command ANALYZE TABLE will provide fresh information for the optimizer and the command ALTER SYSTEM FLUSH SHARED_POOL will erase all previously cached statements from the shared pool. During a test, there is regular activity on the database that simulates the real production system. To obtain as much precise information for own session activity, a special view has been created named "V$ROGER" under sys account. This view is just a link to the regular database dynamic performance view "V$MYSTAT" provided by Oracle for collecting local session information.
SQL> create or replace view v$roger as select sn.name,ms.value from v$mystat ms,v$statname sn where ms.statistic# = sn.statistic#; View created. SQL> create public synonym v$roger for v$roger Synonym created. SQL> grant select on v$roger to public; Grant succeeded.
The delete statement statistics are generated using special event 10046. Setting this event will start tracing with level 8, generating in the trace file standard SQL_TRACE information, bind values and wait events. Short description for that event:
SQL> alter session set events '10046 trace name context forever, level 8'; Session altered. 10046 EVENT levels: 1 - Enable standard SQL_TRACE functionality (Default) 4 - As Level 1 PLUS trace bind values 8 - As Level 1 PLUS trace waits This is especially useful for spotting latch wait etc. but can also be used to spot full table scans and index scans. 12 - As Level 1 PLUS both trace bind values and waits
A trace file will be processed with regular Oracle tool TKPROF using following syntax:
# tkprof artist_ora_23641.trc sql_cost3.prf sort=exeela,fchela explain=artist/artist
Timing information is generated using the database procedure GET_TIME from the DBMS_UTILITY package provided by Oracle.
time_before:=DBMS_UTILITY.GET_TIME; time_after:=DBMS_UTILITY.GET_TIME; time_elapsed:=time_after-time_before;
Test Case 1 – Basic Delete
delete from artist_test where SECOND_COLL='JAVA CLASS';
Test Case 2– Delete with PLSQL and forall loop, committing only once
DECLARE TYPE artist_value IS TABLE OF VARCHAR2(1000) INDEX BY BINARY_INTEGER; del_value artist_value; rowkey PLS_INTEGER; BEGIN SELECT rowid BULK COLLECT INTO del_value FROM artist_test WHERE SECOND_COLL='JAVA CLASS'; IF del_value IS NOT NULL THEN FORALL rowkey IN del_value.FIRST..del_value.LAST DELETE artist_test WHERE rowid = del_value(rowkey); commit; END IF; END; /
Test Case 3 – Delete using anonymous PL/SQL block, looping through the cursor, and deleting one tow at a time. Loop has counter, programmed to trigger commit after each 1000 records.
SQL> declare rec_read number(6) := 0; rec_rest number(6) := 0; vc_var_out varchar2(41) := 'Delete Archive Data'; cursor rec_sql is select SECOND_COLL,rowid from artist_test where SECOND_COLL='JAVA CLASS' order by SECOND_COLL,rowid; begin for rec_sql_run in rec_sql loop rec_read := rec_read + 1 ; rec_rest := rec_rest + 1 ; delete from artist_test A where A.rowid = rec_sql_run.rowid ; if rec_rest > 1000 THEN COMMIT; rec_rest := 1; dbms_output.put_line('COMMIT AT..'|| rec_read); END IF; end loop; end; /
Test Case 4
a.) Deleting using PL/SQL stored procedure, commit each 1000 records.
b.) Deleting using PL/SQL stored procedure, commit only once.
CREATE OR REPLACE PROCEDURE delete_table (commit_size in number default 1000) is sel_id INTEGER; del_id INTEGER; exec_sel INTEGER; exec_del INTEGER; cur_stmt VARCHAR2(2000); del_rowid ROWID; BEGIN cur_stmt := 'SELECT rowid FROM artist_test where SECOND_COLL=''JAVA CLASS'''; sel_id := DBMS_SQL.OPEN_CURSOR; DBMS_SQL.PARSE(sel_id,cur_stmt,dbms_sql.v7); DBMS_SQL.DEFINE_COLUMN_ROWID(sel_id,1,del_rowid); exec_sel := DBMS_SQL.EXECUTE(sel_id); del_id := DBMS_SQL.OPEN_CURSOR; DBMS_SQL.PARSE(del_id,'delete from artist_test where rowid = :del_rowid',dbms_sql.v7); LOOP IF DBMS_SQL.FETCH_ROWS(sel_id) >0 THEN DBMS_SQL.COLUMN_VALUE(sel_id,1,del_rowid); DBMS_SQL.BIND_VARIABLE(del_id,'del_rowid',del_rowid); exec_del := DBMS_SQL.EXECUTE(del_id); if mod(dbms_sql.last_row_count,commit_size) = 0 then COMMIT; end if; ELSE exit; END IF; END LOOP; COMMIT; DBMS_SQL.CLOSE_CURSOR(sel_id); DBMS_SQL.CLOSE_CURSOR(del_id); END; /
Test Case 5 – Delete using standard anonymous PL/SQL block, using execute immediate function and committing only once.
SQL > declare proc_rows number := 0; begin execute immediate 'begin delete from artist_test where SECOND_COLL=''JAVA CLASS''; :x :=sql%rowcount; commit; end;' using OUT proc_rows; dbms_output.put_line('Records processed........'|| proc_rows); end; /
The result from tracing files, the v$roger session performance view and from the timing information obtained using DBMS_UTILITY.GET_TIME procedure was collected and sorted in the following table:
Delete Test |
Elapsed Time (sec) |
Execute Count |
Parse Count |
Parse Count Hard |
Recursive Calls |
Enqueue requests |
Sorts (memory) |
Basic Delete |
35,75 |
156 |
87 |
29 |
2839 |
33 |
61 |
PL/SQL Loop forall |
26,94 |
296 |
168 |
36 |
5008 |
46 |
109 |
PL/SQL Loop |
21,25 |
7219 |
342 |
36 |
18873 |
302 |
67 |
Stored Procedure |
15,52 |
7105 |
197 |
42 |
25830 |
54 |
118 |
Stored Procedure |
12,03 |
7100 |
197 |
38 |
25700 |
50 |
114 |
Execute Immediate |
9,87 |
298 |
162 |
42 |
5084 |
50 |
113 |
Elapsed Time –real time needed to finish test
Execute Count – number of execution recorded for the session
Parse Count – total number of parsing calls for the session (soft and hard)
Parse Count Hard – number of hard parsing calls for the session
Recursive Calls – number of recursive calls for the session
Enqueue Requests – number of locking requests for he session
Sorts (memory) – number of sorts in memory for the session
The best result was using EXECUTE_IMMEDIATE calls inside an anonymous PL/SQL block, during test case 5. This is not surprising, since Oracle suggests this method for mass data deletion. The EXECUTE IMMEDIATE call came first time in version Oracle8i. It is supposed to be a replacement for DBMS_SQL package. EXECUTE_IMMEDIATE calls parses and immediately executes a dynamic SQL statement or a PL/SQL block, which is preferred and faster than other coding tests.
I did not look in the tests for other potential problems, rising CPU usage and disk I/O utilization due to mass delete action. The assumption was that we were going to delete from the big table via a regular batch job, on system offloaded from regular users. In our test case, ARTIST_TEST table was not partitioned and distributed
over different disk devices, which is an additional point to consider for performance
improvements.
Every DBA has to test all existing methods and find one that best suite.
Some last recommendations to speed a delete operation:
Oracle 9i has several new and improved features aimed at making the availability, manageability and serviceability of the database easier. Beside Oracle 9i's strong partition orientation, at least 1/5 of the running Oracle databases have no implemented partitions.
Deleting historical data at the end of month becomes a question of existence:
"To delete old data and continue working or close the business due to bad performance and high costs."