How Much Space is Required to Store a Whole Lot of Nothing?

29 04 2010

April 29, 2010

While looking for something completely unrelated I encountered one of the test cases that I posted to an OTN forum thread about a year ago.  There are a couple of interesting items in the test case, so I thought that I would reproduce it here, verifying the test case on Oracle 11.2.0.1.  The OTN forum thread asked if NULL values in table columns take up space, and if so does it depend on the data type of the column.  I answered the question like this:

It is my understanding that NULL values in the last columns (non-null values appear only in the first set of columns defined for the table) in of a row do not consume space, while NULL values in the first set of columns, if they are followed by columns containing non-NULL values will consume space. At this time, I am having trouble locating a reference which backs up my understanding of how NULLs affect storage space.

The test case was originally constructed to be executed against Oracle Database 10.2.0.4, and the output showed what was achieved with an ASSM autoallocate tablespace with an 8KB block size.  So, what happens when the test case is executed on Oracle Database 11.2.0.1 with an ASSM autoallocate tablespace with an 8KB block size?  The results may surprise you.

First, we need to create a table with a couple of different column data types, and then collect statistics for the table:

CREATE TABLE T1 (
  C1 VARCHAR2(300),
  C2 NUMBER(22,4),
  C3 DATE,
  C4 BLOB,
  C5 CLOB);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1')

Surprised by the results yet?

SELECT
  NUM_ROWS,
  BLOCKS,
  EMPTY_BLOCKS,
  AVG_ROW_LEN,
  NUM_FREELIST_BLOCKS
FROM
  USER_TABLES
WHERE
  TABLE_NAME='T1';

  NUM_ROWS     BLOCKS EMPTY_BLOCKS AVG_ROW_LEN NUM_FREELIST_BLOCKS
---------- ---------- ------------ ----------- -------------------
         0          0            0           0                   0

SELECT
  EXTENT_ID,
  BYTES,
  BLOCKS
FROM
  USER_EXTENTS
WHERE
  SEGMENT_NAME='T1'
ORDER BY
  EXTENT_ID;

no rows selected

Probably not if you read about Oracle Database 11.2.0.1’s deferred segment creation feature.  Yes, the above results differed from that of Oracle Database 10.2.0.4, which showed the following for the second SQL statement:

 EXTENT_ID      BYTES     BLOCKS
---------- ---------- ----------
         0      65536          8

Next, insert 1,000,000 rows containing NULL values in all columns and re-gather the statistics:

INSERT INTO T1 (C1)
SELECT
  NULL
FROM
  DUAL
CONNECT BY
  LEVEL<=500000;

INSERT INTO T1 (C1)
SELECT
  NULL
FROM
  DUAL
CONNECT BY
  LEVEL<=500000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1')

Now let’s check the table’s statistics:

SELECT
  NUM_ROWS,
  BLOCKS,
  EMPTY_BLOCKS,
  AVG_ROW_LEN,
  NUM_FREELIST_BLOCKS
FROM
  USER_TABLES
WHERE
  TABLE_NAME='T1';

  NUM_ROWS     BLOCKS EMPTY_BLOCKS AVG_ROW_LEN NUM_FREELIST_BLOCKS
---------- ---------- ------------ ----------- -------------------
   1000000       1630            0           1                   0

Oracle Database 10.2.0.4 was not quite as accurate when estimating the number of rows in the table:

  NUM_ROWS     BLOCKS EMPTY_BLOCKS AVG_ROW_LEN NUM_FREELIST_BLOCKS
---------- ---------- ------------ ----------- -------------------
    984619       1630            0           1                   0

Next let’s take a look at the extent allocation, which was the same for both releases of Oracle Database:

SELECT
  EXTENT_ID,
  BYTES,
  BLOCKS
FROM
  USER_EXTENTS
WHERE
  SEGMENT_NAME='T1'
ORDER BY
  EXTENT_ID;

 EXTENT_ID      BYTES     BLOCKS
---------- ---------- ----------
         0      65536          8
         1      65536          8
         2      65536          8
         3      65536          8
         4      65536          8
         5      65536          8
         6      65536          8
         7      65536          8
         8      65536          8
         9      65536          8
        10      65536          8
        11      65536          8
        12      65536          8
        13      65536          8
        14      65536          8
        15      65536          8
        16    1048576        128
        17    1048576        128
        18    1048576        128
        19    1048576        128
        20    1048576        128
        21    1048576        128
        22    1048576        128
        23    1048576        128
        24    1048576        128
        25    1048576        128
        26    1048576        128
        27    1048576        128

SELECT
  SUM(BYTES) BYTES,
  SUM(BLOCKS) BLOCKS
FROM
  USER_EXTENTS
WHERE
  SEGMENT_NAME='T1'
ORDER BY
  EXTENT_ID;

     BYTES     BLOCKS
---------- ----------
  13631488       1664

The above shows that the table is using 1630 blocks to store the rows with the completely NULL values, there are 1664 blocks allocated to extents used by the table, and 13,631,488 bytes used by the extents.  NULLs consuming space?

A little fun with the ROWID pseudocolumn, dissecting the components of that pseudocolumn (side note, I do not recall where I originally determined how to break apart the 18 byte long displayed ROWID, but it is found in the documentation):

SELECT
  SUBSTR(ROWID,1,6) OBJECT_ID,
  SUBSTR(ROWID,7,3) FILE_ID,
  SUBSTR(ROWID,10,6) BLOCK_ID,
  SUBSTR(ROWID,16,3) ROW_ID,
  LENGTHB(ROWID) LEN,
  COUNT(*) OVER (PARTITION BY SUBSTR(ROWID,10,6) ORDER BY ROWNUM) ROWN,
  SUM(LENGTH(ROWID)) OVER (PARTITION BY SUBSTR(ROWID,10,6) ORDER BY ROWNUM) BYTES
FROM
  T1
WHERE
  ROWNUM<=1400;

OBJECT FIL BLOCK_ ROW        LEN       ROWN      BYTES
------ --- ------ --- ---------- ---------- ----------
AAAULA AAH AAGLHT AAA         18          1         18
AAAULA AAH AAGLHT AAB         18          2         36
AAAULA AAH AAGLHT AAC         18          3         54
AAAULA AAH AAGLHT AAD         18          4         72
AAAULA AAH AAGLHT AAE         18          5         90
AAAULA AAH AAGLHT AAF         18          6        108
AAAULA AAH AAGLHT AAG         18          7        126
AAAULA AAH AAGLHT AAH         18          8        144
AAAULA AAH AAGLHT AAI         18          9        162
AAAULA AAH AAGLHT AAJ         18         10        180
AAAULA AAH AAGLHT AAK         18         11        198
AAAULA AAH AAGLHT AAL         18         12        216
AAAULA AAH AAGLHT AAM         18         13        234
AAAULA AAH AAGLHT AAN         18         14        252
AAAULA AAH AAGLHT AAO         18         15        270
AAAULA AAH AAGLHT AAP         18         16        288
AAAULA AAH AAGLHT AAQ         18         17        306
AAAULA AAH AAGLHT AAR         18         18        324
AAAULA AAH AAGLHT AAS         18         19        342
AAAULA AAH AAGLHT AAT         18         20        360
...
AAAULA AAH AAGLHT AKP         18        656      11808
AAAULA AAH AAGLHT AKQ         18        657      11826
AAAULA AAH AAGLHT AKR         18        658      11844
AAAULA AAH AAGLHT AKS         18        659      11862
AAAULA AAH AAGLHT AKT         18        660      11880
AAAULA AAH AAGLHU AAA         18          1         18
AAAULA AAH AAGLHU AAB         18          2         36
AAAULA AAH AAGLHU AAC         18          3         54
AAAULA AAH AAGLHU AAD         18          4         72
AAAULA AAH AAGLHU AAE         18          5         90
AAAULA AAH AAGLHU AAF         18          6        108
AAAULA AAH AAGLHU AAG         18          7        126
AAAULA AAH AAGLHU AAH         18          8        144
AAAULA AAH AAGLHU AAI         18          9        162
AAAULA AAH AAGLHU AAJ         18         10        180
AAAULA AAH AAGLHU AAK         18         11        198
AAAULA AAH AAGLHU AAL         18         12        216
AAAULA AAH AAGLHU AAM         18         13        234
AAAULA AAH AAGLHU AAN         18         14        252
...
AAAULA AAH AAGLHU AKP         18        656      11808
AAAULA AAH AAGLHU AKQ         18        657      11826
AAAULA AAH AAGLHU AKR         18        658      11844
AAAULA AAH AAGLHU AKS         18        659      11862
AAAULA AAH AAGLHU AKT         18        660      11880
AAAULA AAH AAGLHV AAA         18          1         18
AAAULA AAH AAGLHV AAB         18          2         36
AAAULA AAH AAGLHV AAC         18          3         54
AAAULA AAH AAGLHV AAD         18          4         72
AAAULA AAH AAGLHV AAE         18          5         90
AAAULA AAH AAGLHV AAF         18          6        108
AAAULA AAH AAGLHV AAG         18          7        126
AAAULA AAH AAGLHV AAH         18          8        144
AAAULA AAH AAGLHV AAI         18          9        162
AAAULA AAH AAGLHV AAJ         18         10        180
AAAULA AAH AAGLHV AAK         18         11        198
AAAULA AAH AAGLHV AAL         18         12        216
AAAULA AAH AAGLHV AAM         18         13        234
AAAULA AAH AAGLHV AAN         18         14        252
...
AAAULA AAH AAGLHV ABM         18         77       1386
AAAULA AAH AAGLHV ABN         18         78       1404
AAAULA AAH AAGLHV ABO         18         79       1422
AAAULA AAH AAGLHV ABP         18         80       1440

The above shows that Oracle was packing 660 of these completely NULL rows into each 8KB block with a default PCT_FREE of 10%, leaving about 7370 bytes of space for storing the 660 rows per block, indicating that each row was consuming about 11 bytes.  Obviously from the above, the actual ROWID for a row does not occupy a full 18 bytes of data storage per row.  It is actually 10 bytes (reference) that is externalized as a 18 byte (reference) base 64 encoded character string.  In retrospect, I probably should have done the following instead to decode the ROWID using the DBMS_ROWID package, and calculate the per row overhead plus the space required (1 byte) to store the data:

SELECT
  DBMS_ROWID.ROWID_OBJECT(ROWID) OBJECT_ID,
  DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) FILE_ID,
  DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) BLOCK_NUMBER,
  DBMS_ROWID.ROWID_ROW_NUMBER(ROWID) ROW_ID,
  COUNT(*) OVER (PARTITION BY SUBSTR(ROWID,10,6) ORDER BY ROWNUM) ROWN,
  SUM(11) OVER (PARTITION BY SUBSTR(ROWID,10,6) ORDER BY ROWNUM) BYTES
FROM
  T1
WHERE
  ROWNUM<=1400;

 OBJECT_ID    FILE_ID BLOCK_NUMBER     ROW_ID       ROWN      BYTES

---------- ---------- ------------ ---------- ---------- ----------
     82624          7      1618387          0          1         11
     82624          7      1618387          1          2         22
     82624          7      1618387          2          3         33
     82624          7      1618387          3          4         44
     82624          7      1618387          4          5         55
     82624          7      1618387          5          6         66
     82624          7      1618387          6          7         77
     82624          7      1618387          7          8         88
     82624          7      1618387          8          9         99
     82624          7      1618387          9         10        110
     82624          7      1618387         10         11        121
     82624          7      1618387         11         12        132
     82624          7      1618387         12         13        143
     82624          7      1618387         13         14        154
     82624          7      1618387         14         15        165
...
     82624          7      1618387        655        656       7216
     82624          7      1618387        656        657       7227
     82624          7      1618387        657        658       7238
     82624          7      1618387        658        659       7249
     82624          7      1618387        659        660       7260
     82624          7      1618388          0          1         11
     82624          7      1618388          1          2         22
     82624          7      1618388          2          3         33
     82624          7      1618388          3          4         44
     82624          7      1618388          4          5         55
     82624          7      1618388          5          6         66
     82624          7      1618388          6          7         77
     82624          7      1618388          7          8         88
     82624          7      1618388          8          9         99
     82624          7      1618388          9         10        110
...
     82624          7      1618388        656        657       7227
     82624          7      1618388        657        658       7238
     82624          7      1618388        658        659       7249
     82624          7      1618388        659        660       7260
     82624          7      1618389          0          1         11
     82624          7      1618389          1          2         22
     82624          7      1618389          2          3         33
     82624          7      1618389          3          4         44
     82624          7      1618389          4          5         55
     82624          7      1618389          5          6         66
     82624          7      1618389          6          7         77
     82624          7      1618389          7          8         88
     82624          7      1618389          8          9         99
     82624          7      1618389          9         10        110
...
     82624          7      1618389         77         78        858
     82624          7      1618389         78         79        869
     82624          7      1618389         79         80        880

A comparison test with a second table containing a single column:

CREATE TABLE T2 (
  C1 VARCHAR2(1));

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2')

INSERT INTO T2 (C1)
SELECT
  NULL
FROM
  DUAL
CONNECT BY
  LEVEL<=500000;

INSERT INTO T2 (C1)
SELECT
  NULL
FROM
  DUAL
CONNECT BY
  LEVEL<=500000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2')

If the columns containing NULL values required storage space in table T1, then we should expect that table T2 should require less space than that for table T1.

SELECT
  SUM(BYTES) BYTES,
  SUM(BLOCKS) BLOCKS
FROM
  USER_EXTENTS
WHERE
  SEGMENT_NAME='T2'
ORDER BY
  EXTENT_ID;

     BYTES     BLOCKS
---------- ----------
  13631488       1664

The 13,631,488 bytes and 1,664 blocks appears to be the same values as displayed for table T1.

For extra credit we might try something like this:

UPDATE
  T1
SET
  C2=1;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1')

SELECT
  NUM_ROWS,
  BLOCKS,
  EMPTY_BLOCKS,
  AVG_ROW_LEN,
  NUM_FREELIST_BLOCKS
FROM
  USER_TABLES
WHERE
  TABLE_NAME='T1';

  NUM_ROWS     BLOCKS EMPTY_BLOCKS AVG_ROW_LEN NUM_FREELIST_BLOCKS
---------- ---------- ------------ ----------- -------------------
   1000000       1630            0           4                   0

The average row length increased from 1 to 4 when the second column of the table was set to a value of 1 for all rows in the table – now the NULL values in column 1 require a single byte of storage space.

Feel free to experiment with this example.








Follow

Get every new post delivered to your Inbox.

Join 144 other followers