Idle Thoughts – SSD, Redo Logs, and Sector Size

18 12 2011

December 18, 2011

It seems that I have been quite busy lately with computer related tasks that are not associated with Oracle Database, making it difficult to slow down and focus on items that have a foundation in logic.  Today I have had some spare time to dig further into the recently released “Oracle Core” book.  A Note on page 123 (obviously a logical page number) gave me a moment to pause.  Quoting a small portion of the Note:

“Recently disk manufacturers have started to produce disks with a 4KB sector size… From 11.2 Oracle lets the DBA choose which is the lesser of two evils by allowing you to specify the block size for the log file as 512 bytes, 1KB, or 4KB.” 

Do you see it yet, the reason to pause?

-

-

-

I am typing this blog article on a laptop that is a bit over a year old with two 256GB Crucial SSD drives in a RAID 0 arrangement.  Even though the laptop supports 3Gb/s transfer speeds from the internal drives rather than the more recent 6Gb/s transfer speeds, the drive arrangement is very quick (quick and potentially subject to a very bad data loss if one drive in the RAID 0 array fails), hitting a speed of 539 MB per second in a parallel full table scan.

I believe that the Crucial SSD drives have a 4 KB sector size, much like many other SSD drives.  It seems that some of Intel’s higher-end single layer drives use a 16 KB sector size (4000h = 16,384, see page 17 for the X25 series and page 22 for the 510 series).  I recall from reading the Oracle Database documentation the following statement:

“Unlike the database block size, which can be between 2K and 32K, redo log files always default to a block size that is equal to the physical sector size of the disk. Historically, this has typically been 512 bytes (512B).

Some newer high-capacity disk drives offer 4K byte (4K) sector sizes for both increased ECC capability and improved format efficiency. Most Oracle Database platforms are able to detect this larger sector size. The database then automatically creates redo log files with a 4K block size on those disks.” 

Do you see it yet, the reason to pause?

-

-

-

I recently read an interesting article titled De-Confusing SSD (for Oracle Databases) that inspired several interesting comments.  I think that the following two statements in the article contributed to the comment count:

SSD’s base memory unit is a cell, which holds 1 bit in SLC and 2 bits in MLC. Cells are organized in pages (usually 4k) and pages are organized in blocks (512K). Data can be read and written in pages, but is always deleted in blocks. This will become really important in a moment.

* Placing redo logs on SSD is not recommended. Exadata now has a “Smart Flash Logging” feature that uses redo logs on SSD. Note that it uses redo logs *also* on SSD. This feature allows Oracle to write redo in parallel to a file on SSD and a file on the magnetic disk, and finish the operation when one of the calls is successful.

Do you see it yet, the reason to pause?

-

-

-

So, the “Oracle Core” book states that as of Oracle Database 11.2 the DBA is able to select a 512 byte, 1024 byte, or 4096 byte (4 KB) block size for the redo log files.  Idle thinking… I have Oracle Database 11.2.0.2 installed on this laptop for educational purposes, let’s try executing a simple SQL statement that retrieves the block size for the redo log files:

SELECT
  GROUP#,
  BLOCKSIZE
FROM
  V$LOG
ORDER BY
  GROUP#;

GROUP#  BLOCKSIZE
------ ----------
     1        512
     2        512
     3        512 

So, if the sector size of my Crucial SSD drive is 4 KB, I have a (default) block size for the redo log files that potentially conflicts with the documentation, and if this is not an isolated issue, it might explain (in part) why the De-Confusing SSD (for Oracle Databases) article states that placing redo logs on SSD is not recommended.

Do you see it yet, the reason to pause?

-

-

-

I wonder if in testing redo log performance on SSD drives, if those drives were provided a “fair” test environment, where the BLOCKSIZE of the redo log files was sized appropriately for the write characteristics of SSD drives?  Any thoughts?


Actions

Information

4 responses

18 12 2011
Guy Harrison

I tried placing redo logs on SSD in an Exadata system using default (512) and 4K blocksizes. The results are at http://guyharrison.squarespace.com/blog/2011/12/6/using-ssd-for-redo-on-exadata-pt-2.html. It looked to me like setting the redo log block size incorrectly causes definite dramatic degradation, but setting it correctly (or at least in this case to 4K) doesn’t make SSD faster than SAS for sequential writes.

19 12 2011
Charles Hooper

Guy,

Thanks for sharing your performance results regarding 512 byte and 4096 byte blocksizes for the redo logs – the results are interesting. I wonder if the native sector size of the SSD drives in an Exadata system are 4 KB or if they are actually larger than that size (for example, the apparent 16 KB sector size of some Intel SSD drives).

In your article, you state that, “elapsed time reduced by 70% [when using the 4096 byte redo log blocksize]“. A marketing person could probably spin your results a couple of different ways:

309.68 / 1009.67 = 30.67% of the time required for 512 byte

1009.67 / 309.68 = 326.04% faster for 4096 byte

(1009.67 – 309.68) / 309.68 = 2.2604 = 226.04% time decrease compared to ending point

(1009.67 – 309.68) / 1009.67 = 0.6933 = 69.33% time decrease compared to starting point

Regardless of the above number spinning, based on your test results, SSD in the system is still slower than the spinning rust (serial attached SCSI hard drives). Now if only Oracle Database supported a 16 KB blocksize for the redo logs…

Thanks again for sharing the results.

18 12 2011
Noons

Oh boy! I can see the next ROT (rule of thumb) already: “Use 4KB for file system size for redo logs on SSD!”. Regardless of release level, setup, storage architecture, etc.
Agree entirely: I see heaps of reasons to pause. And T-H-I-N-K!
As usual: ECC applies – Experiment! Check! Compare!

Before we’re all saddled with another 10 years of ROT setting in…

I’m curious on how long it’ll be before someone will claim SSD on SAN Fast Provisioning needs a 4K file system block size…
Mark my words: it’ll happen!

19 12 2011
Charles Hooper

Nuno,

Thanks for sharing your comment.

You are probably right that someone will take the results out of context and then formulate a general rule. Time to super-size your redo log block size. Let’s see, the table states that I need a 16 KB block size for this Intel SSD… I just type these commands and …

I want to make certain that people understand that I am not recommending to use or not use a 4096 byte block size for redo on SSD. It was but an idle thought while I awaited a LOG FILE SYNC wait to complete. ;-)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 143 other followers

%d bloggers like this: