1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82
|
Data Compression
The common goal of compressing data is to save disk space. Since we are dealing with performance,
in this section I will speak about another advantage of data compression that is
frequently forgotten: improving response time.
■Note Data compression is a feature available in Enterprise Edition only.
The idea is quite simple. If a SQL statement has to process a lot of data through a full table
(or partition) scan, it is likely that the main contributor to its resource usage profile is related to
I/O operations. In such a situation, decreasing the amount of data to be read from the disk will
increase performance. Actually, the performance should increase almost proportionally to the
compression factor. The following example, based on the script data_compression.sql, illustrates
this:
CHAPTER 12 ■ O PTIMIZ ING T HE PHYSICAL D ES IGN 547
SQL> CREATE TABLE t NOCOMPRESS AS
2 SELECT rownum AS n, rpad(' ',500,mod(rownum,15)) AS pad
3 FROM dual
4 CONNECT BY level <= 2000000;
SQL> execute dbms_stats.gather_table_stats(ownname=>user, tabname=>'t')
SQL> SELECT table_name, blocks FROM user_tables WHERE table_name = 'T';
TABLE_NAME BLOCKS
------------------------------ ----------
T 143486
SQL> SELECT count(n) FROM t;
COUNT(*)
----------
2000000
Elapsed: 00:00:12.68
SQL> ALTER TABLE t MOVE COMPRESS;
SQL> execute dbms_stats.gather_table_stats(ownname=>user, tabname=>'t')
SQL> SELECT table_name, blocks FROM user_tables WHERE table_name = 'T';
TABLE_NAME BLOCKS
------------------------------ ----------
T 27274
SQL> SELECT count(n) FROM t;
COUNT(*)
----------
2000000
Elapsed: 00:00:02.76
SQL> SELECT 143486/27274, 12.68/2.76 FROM dual;
143486/27274 12.68/2.76
------------ ----------
5.26090782 4.5942029
To take advantage of data compression for full-scan operations as shown in the example,
it essential to have spare CPU resources. This is not because of the CPU overhead of uncompressing
the blocks (which is very small; they are not compressed using a zip-like algorithm)
548 CHAPTER 12 ■ OPTIMI ZING THE PHYSICAL DESIGN
but simply because the operations performed by the SQL engine (in the previous example,
accessing the blocks and doing the count) are executed in a shorter period of time. For
example, on my test system, the CPU utilization during the execution of the test query was
about 7 percent without compression and 11 percent with compression.
To use data compression, the following requirements must be fulfilled:
The table cannot have more than 255 columns.
The table cannot be created with the parameter rowdependencies.
It is important to note that although data compression does have some advantages, it also
has several drawbacks. The most significant is that the database engine, up to Oracle Database
10g, compresses data blocks only when data is inserted through the direct path interface.
Hence, it will compress data blocks only with the following operations:
CREATE TABLE ... COMPRESS ... AS SELECT ...
ALTER TABLE ... MOVE COMPRESS
INSERT /*+ append */ INTO ... SELECT ...
INSERT /*+ parallel(...) */ INTO ... SELECT ...
Loads performed by applications using the OCI direct-path interface (for example, the
SQL*Loader utility)
The result is that data inserted through regular INSERT statements is inserted in noncompressed
blocks. Another disadvantage is that not only do UPDATE statements commonly lead
to migrated rows stored in noncompressed blocks but also that the free space in compressed
blocks caused by DELETE statements is never reused. For these reasons, I suggest using data
compression only on segments that are (mostly) read-only. For example, in a partitioned table
storing a long history where only the last few partitions are modified, it could be useful to
compress the partitions that are (mostly) read-only. Data marts and completely refreshed
materialized views are also good candidates for data compression.
As of Oracle Database 11g, the following two compression methods are available:
compress for direct_load operations: This method is equivalent to the compression
method used up to Oracle Database 10g (that is, it is equivalent to specifying compress).
compress for all operations: This method partially overcomes some of the limitations of
the previous one. However, the Advanced Compression option is required.
Since the new compression method is dynamic (data might be compressed not when it is
inserted but later when the block storing it is modified), it is difficult to give advice about its
utilization. Actually, there are still several situations in which the new compression method is
not better than the old one. To figure out whether the new compression method is able to
correctly handle data that is not (mostly) read-only, I strongly advise you to carefully test it with
the expected load. |