<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: DB_BLOCK_SIZE and DB_FILE_MULTIBLOCK_READ_COUNT 3 &#8211; What is Wrong with this Quote?</title>
	<atom:link href="http://hoopercharles.wordpress.com/2010/12/04/db_block_size-and-db_file_multiblock_read_count-3-what-is-wrong-with-this-quote/feed/" rel="self" type="application/rss+xml" />
	<link>http://hoopercharles.wordpress.com/2010/12/04/db_block_size-and-db_file_multiblock_read_count-3-what-is-wrong-with-this-quote/</link>
	<description>Miscellaneous Random Oracle Topics: Stop, Think, ... Understand</description>
	<lastBuildDate>Mon, 13 May 2013 14:10:06 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Charles Hooper</title>
		<link>http://hoopercharles.wordpress.com/2010/12/04/db_block_size-and-db_file_multiblock_read_count-3-what-is-wrong-with-this-quote/#comment-2283</link>
		<dc:creator><![CDATA[Charles Hooper]]></dc:creator>
		<pubDate>Tue, 07 Dec 2010 10:32:30 +0000</pubDate>
		<guid isPermaLink="false">http://hoopercharles.wordpress.com/?p=3804#comment-2283</guid>
		<description><![CDATA[Centinul,

Thank you for the links - those will be very helpful for the sections of the book that offer advice about index rebuilds.  It might be interesting to read Richard Foote&#039;s review of the book, if he bought a copy.]]></description>
		<content:encoded><![CDATA[<p>Centinul,</p>
<p>Thank you for the links &#8211; those will be very helpful for the sections of the book that offer advice about index rebuilds.  It might be interesting to read Richard Foote&#8217;s review of the book, if he bought a copy.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Centinul</title>
		<link>http://hoopercharles.wordpress.com/2010/12/04/db_block_size-and-db_file_multiblock_read_count-3-what-is-wrong-with-this-quote/#comment-2265</link>
		<dc:creator><![CDATA[Centinul]]></dc:creator>
		<pubDate>Mon, 06 Dec 2010 12:25:47 +0000</pubDate>
		<guid isPermaLink="false">http://hoopercharles.wordpress.com/?p=3804#comment-2265</guid>
		<description><![CDATA[Richard Foote has a three part series on indexes in larger block size tablespaces and the myths associated with them. I thought posting the links may be relevant to the discussion:

http://richardfoote.wordpress.com/2009/02/18/larger-block-tablespace-for-indexes-revisited-part-i-the-tourist/
http://richardfoote.wordpress.com/2009/02/23/larger-block-tablespace-for-indexes-revisted-part-ii-money/
http://richardfoote.wordpress.com/2009/03/02/larger-block-tablespace-for-indexes-revisited-part-iii-prove-yourself/]]></description>
		<content:encoded><![CDATA[<p>Richard Foote has a three part series on indexes in larger block size tablespaces and the myths associated with them. I thought posting the links may be relevant to the discussion:</p>
<p><a href="http://richardfoote.wordpress.com/2009/02/18/larger-block-tablespace-for-indexes-revisited-part-i-the-tourist/" rel="nofollow">http://richardfoote.wordpress.com/2009/02/18/larger-block-tablespace-for-indexes-revisited-part-i-the-tourist/</a><br />
<a href="http://richardfoote.wordpress.com/2009/02/23/larger-block-tablespace-for-indexes-revisted-part-ii-money/" rel="nofollow">http://richardfoote.wordpress.com/2009/02/23/larger-block-tablespace-for-indexes-revisted-part-ii-money/</a><br />
<a href="http://richardfoote.wordpress.com/2009/03/02/larger-block-tablespace-for-indexes-revisited-part-iii-prove-yourself/" rel="nofollow">http://richardfoote.wordpress.com/2009/03/02/larger-block-tablespace-for-indexes-revisited-part-iii-prove-yourself/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Charles Hooper</title>
		<link>http://hoopercharles.wordpress.com/2010/12/04/db_block_size-and-db_file_multiblock_read_count-3-what-is-wrong-with-this-quote/#comment-2254</link>
		<dc:creator><![CDATA[Charles Hooper]]></dc:creator>
		<pubDate>Sun, 05 Dec 2010 11:42:17 +0000</pubDate>
		<guid isPermaLink="false">http://hoopercharles.wordpress.com/?p=3804#comment-2254</guid>
		<description><![CDATA[Jonathan,

Thank you for the extensive explanation, it is very helpful.  This quote from the book might be just one example that shows why it is a bad idea to recycle material that was published in 2001 - the errors and problems caused by the missing details are amplified nine years later due to the new features added to Oracle Database over that time.

In your comment, you stated &quot;There is no direct relationship between cost and db_block_size, there are only side effects, and it is far from trivial to understand these side effects.&quot; and &quot;the author is, I believe, a strong proponent of putting indexes into tablespaces with the largest possible block size.&quot;  The book is also a strong proponent of replacing most full table scans with some sort of index access path.  There is an interesting tie-in with Randolf Geist&#039;s System Statistics presentation, which includes a section (starting at slide 59) that discusses the effects of non-uniform block sizes on the Oracle optimizer&#039;s costing of access paths.  Randolf&#039;s test case seems to show a significant drop in the optimizer&#039;s costing calculations for the objects in the larger block size - dropping the cost number without improving the execution time.  Randolf&#039;s test case used a 10,000 block example with 1 row per block in a locally managed tablespace using manual segment space management and a default 8KB block size (DB_BLOCK_SIZE).  His test demonstrated what happens to the execution plan costs under each of the cost models:
&lt;strong&gt;Traditional I/O Costing:&lt;/strong&gt;
2KB block tablespace: 2441
8KB block tablespace: 1517
16KB block tablespace: 1199

&lt;strong&gt;NOWORKLOAD System Statistics:&lt;/strong&gt;
2KB block tablespace: 7708
8KB block tablespace: 2708
16KB block tablespace: 1875

&lt;strong&gt;WORKLOAD System Statistics:&lt;/strong&gt;
2KB block tablespace: 10833
8KB block tablespace: 2708
16KB block tablespace: 1354

Quite clearly from the above, if someone is convinced to tune by cost (I do not recall the book suggesting that), one would create a database with a 2KB default tablespace, create a 32KB tablespace, and then create all objects in the 32KB tablespace.  It might be worth experimentation to see what happens to the costing of index access paths when the indexes are recreated in a 32KB or 16KB tablespace with the table data still residing in the 8KB (or 2KB) default tablespace - would this change automatically tune the whole database to use indexes rather than full tablescans?  I see the start of a silver bullet.]]></description>
		<content:encoded><![CDATA[<p>Jonathan,</p>
<p>Thank you for the extensive explanation, it is very helpful.  This quote from the book might be just one example that shows why it is a bad idea to recycle material that was published in 2001 &#8211; the errors and problems caused by the missing details are amplified nine years later due to the new features added to Oracle Database over that time.</p>
<p>In your comment, you stated &#8220;There is no direct relationship between cost and db_block_size, there are only side effects, and it is far from trivial to understand these side effects.&#8221; and &#8220;the author is, I believe, a strong proponent of putting indexes into tablespaces with the largest possible block size.&#8221;  The book is also a strong proponent of replacing most full table scans with some sort of index access path.  There is an interesting tie-in with Randolf Geist&#8217;s System Statistics presentation, which includes a section (starting at slide 59) that discusses the effects of non-uniform block sizes on the Oracle optimizer&#8217;s costing of access paths.  Randolf&#8217;s test case seems to show a significant drop in the optimizer&#8217;s costing calculations for the objects in the larger block size &#8211; dropping the cost number without improving the execution time.  Randolf&#8217;s test case used a 10,000 block example with 1 row per block in a locally managed tablespace using manual segment space management and a default 8KB block size (DB_BLOCK_SIZE).  His test demonstrated what happens to the execution plan costs under each of the cost models:<br />
<strong>Traditional I/O Costing:</strong><br />
2KB block tablespace: 2441<br />
8KB block tablespace: 1517<br />
16KB block tablespace: 1199</p>
<p><strong>NOWORKLOAD System Statistics:</strong><br />
2KB block tablespace: 7708<br />
8KB block tablespace: 2708<br />
16KB block tablespace: 1875</p>
<p><strong>WORKLOAD System Statistics:</strong><br />
2KB block tablespace: 10833<br />
8KB block tablespace: 2708<br />
16KB block tablespace: 1354</p>
<p>Quite clearly from the above, if someone is convinced to tune by cost (I do not recall the book suggesting that), one would create a database with a 2KB default tablespace, create a 32KB tablespace, and then create all objects in the 32KB tablespace.  It might be worth experimentation to see what happens to the costing of index access paths when the indexes are recreated in a 32KB or 16KB tablespace with the table data still residing in the 8KB (or 2KB) default tablespace &#8211; would this change automatically tune the whole database to use indexes rather than full tablescans?  I see the start of a silver bullet.</p>
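<p>As a rough check on the figures quoted above (plain arithmetic on those numbers, not a model of the optimizer, and the variable and function names are my own), dividing each cost by the 8KB cost for the same cost model shows how differently the three models react to block size.  With WORKLOAD system statistics the ratio tracks 8KB divided by the tablespace block size almost exactly:</p>

```python
# Costs as quoted above, keyed by tablespace block size in KB.
quoted_costs = {
    "Traditional I/O Costing": {2: 2441, 8: 1517, 16: 1199},
    "NOWORKLOAD System Statistics": {2: 7708, 8: 2708, 16: 1875},
    "WORKLOAD System Statistics": {2: 10833, 8: 2708, 16: 1354},
}


def ratios_to_8kb(costs):
    """Return each cost divided by the 8KB cost of the same cost model."""
    base = costs[8]
    return {kb: cost / base for kb, cost in costs.items()}


for model, costs in quoted_costs.items():
    ratios = ratios_to_8kb(costs)
    # Show each ratio next to the simple block-size ratio 8/kb for comparison.
    summary = ", ".join(
        f"{kb}KB: {ratio:.2f} (8/{kb} = {8 / kb:.2f})"
        for kb, ratio in sorted(ratios.items())
    )
    print(f"{model}: {summary}")
```

<p>The WORKLOAD ratios come out at 4.00, 1.00 and 0.50, while the other two models change less steeply (about 1.61 and 0.79 for traditional costing, about 2.85 and 0.69 for NOWORKLOAD).</p>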
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://hoopercharles.wordpress.com/2010/12/04/db_block_size-and-db_file_multiblock_read_count-3-what-is-wrong-with-this-quote/#comment-2243</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Sat, 04 Dec 2010 10:33:06 +0000</pubDate>
		<guid isPermaLink="false">http://hoopercharles.wordpress.com/?p=3804#comment-2243</guid>
		<description><![CDATA[&lt;I&gt;
Just the place for a Snark ! I have said it twice:
            That alone should encourage the crew.
Just the place for a Snark ! I have said it thrice
            What I tell you three times is true.

            Lewis Carroll – The hunting of the Snark
&lt;/i&gt;

The first thought that springs to mind is that when listing 4 points you don&#039;t usually start at number 7.

Second is that the clustering_factor doesn&#039;t make it to the list (although it is mentioned immediately afterwards - which also makes me wonder whether the author read the list before inserting it).

If the author thinks the &quot;db_block_size&quot; is one of &quot;the four factors&quot;, you have to wonder why the db_file_multiblock_read_count is not, why the values for the MBRC, mreadtim, sreadtim system statistics are not, and why the author does not believe that the block sizes of the tablespaces holding the table and its indexes are factors (the author is, I believe, a strong proponent of putting indexes into tablespaces with the largest possible block size).

I hope the author includes a detailed discussion of why the db_block_size affects the optimizer&#039;s choice. There is no direct relationship between cost and db_block_size, there are only side effects, and it is far from trivial to understand these side effects.

Like db_block_size, the avg_row_len is not a factor that the optimizer uses when making the choice between a tablescan and an indexed access path. There are, of course, (subtle) side effects (again) - e.g. longer rows means fewer rows per block which affects the probability of two adjacent key values pointing to two different blocks (hence an increase in the clustering_factor, hence the cost of using the index); then again, longer rows means fewer rows per block, which means more table blocks which increases the cost of the tablescan !  (We might also note that the average index key length affects the choice - which also means the level of prefix compression affects the choice, since both will affect the number of leaf blocks in the index, and the leaf block count really IS a factor influencing the cost of using the index.)

The &quot;selectivity of a column&quot; - what about multi-column indexes ?

The &quot;cardinality&quot; is derived from the selectivity as an estimate of the number of output rows - it isn&#039;t an input to the cost, so doesn&#039;t affect the choice of tablescan or index.  If you wanted an explanation for a novice you might couch it in terms of cardinality - but that&#039;s like talking about &quot;the Sun rising&quot; rather than &quot;the Earth rotating&quot;.


&quot;High&quot; selectivity - I think we could do with the author&#039;s definition of selectivity here. For the purposes of the optimizer, &quot;selectivity&quot; is a number between 0 and 1 and, all other things being equal, the higher the selectivity for an indexed access path, the less likely the optimizer is to choose it.

The text switches from the cost to the run time without making the point that the cost is an estimate of the run time and the run time could be completely different.  It is (for example) easy to have a very high clustering_factor and still have a very fast run time because of the way that Oracle generates the clustering_factor. 


The trouble with getting things right is that it takes so much longer, and so much more effort than just putting a few vaguely relevant words in roughly the right order.]]></description>
		<content:encoded><![CDATA[<p><i><br />
Just the place for a Snark ! I have said it twice:<br />
            That alone should encourage the crew.<br />
Just the place for a Snark ! I have said it thrice<br />
            What I tell you three times is true.</p>
<p>            Lewis Carroll – The hunting of the Snark<br />
</i></p>
<p>The first thought that springs to mind is that when listing 4 points you don&#8217;t usually start at number 7.</p>
<p>Second is that the clustering_factor doesn&#8217;t make it to the list (although it is mentioned immediately afterwards &#8211; which also makes me wonder whether the author read the list before inserting it).</p>
<p>If the author thinks the &#8220;db_block_size&#8221; is one of &#8220;the four factors&#8221;, you have to wonder why the db_file_multiblock_read_count is not, why the values for the MBRC, mreadtim, sreadtim system statistics are not, and why the author does not believe that the block sizes of the tablespaces holding the table and its indexes are factors (the author is, I believe, a strong proponent of putting indexes into tablespaces with the largest possible block size).</p>
<p>I hope the author includes a detailed discussion of why the db_block_size affects the optimizer&#8217;s choice. There is no direct relationship between cost and db_block_size, there are only side effects, and it is far from trivial to understand these side effects.</p>
<p>Like db_block_size, the avg_row_len is not a factor that the optimizer uses when making the choice between a tablescan and an indexed access path. There are, of course, (subtle) side effects (again) &#8211; e.g. longer rows means fewer rows per block which affects the probability of two adjacent key values pointing to two different blocks (hence an increase in the clustering_factor, hence the cost of using the index); then again, longer rows means fewer rows per block, which means more table blocks which increases the cost of the tablescan !  (We might also note that the average index key length affects the choice &#8211; which also means the level of prefix compression affects the choice, since both will affect the number of leaf blocks in the index, and the leaf block count really IS a factor influencing the cost of using the index.)</p>
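<p>To make the leaf block and clustering_factor effects concrete, here is a minimal sketch of the widely published approximation for the I/O cost of an index range scan (blevel, plus leaf blocks visited, plus table blocks visited).  It ignores CPU costing and system statistics, and the function name and sample figures are invented for illustration:</p>

```python
import math


def index_range_scan_cost(blevel, leaf_blocks, clustering_factor,
                          index_selectivity, table_selectivity=None):
    """Approximate I/O cost of an index range scan plus table access.

    Hypothetical helper: a simplified model, not the optimizer's full
    calculation (no CPU costing, no system statistics).
    """
    if table_selectivity is None:
        # Assume no extra filter predicates beyond the index access predicates.
        table_selectivity = index_selectivity
    return (blevel
            + math.ceil(leaf_blocks * index_selectivity)
            + math.ceil(clustering_factor * table_selectivity))


# Same index and selectivity (1/64), two different clustering_factor values.
well_clustered = index_range_scan_cost(2, 1280, 6400, 1 / 64)
badly_clustered = index_range_scan_cost(2, 1280, 64000, 1 / 64)
print(well_clustered, badly_clustered)
```

<p>With these made-up figures the costs are 122 and 1022: the tenfold increase in clustering_factor dominates the result, while the leaf block term is unchanged.</p>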
<p>The &#8220;selectivity of a column&#8221; &#8211; what about multi-column indexes ?</p>
<p>The &#8220;cardinality&#8221; is derived from the selectivity as an estimate of the number of output rows &#8211; it isn&#8217;t an input to the cost, so doesn&#8217;t affect the choice of tablescan or index.  If you wanted an explanation for a novice you might couch it in terms of cardinality &#8211; but that&#8217;s like talking about &#8220;the Sun rising&#8221; rather than &#8220;the Earth rotating&#8221;.</p>
<p>&#8220;High&#8221; selectivity &#8211; I think we could do with the author&#8217;s definition of selectivity here. For the purposes of the optimizer, &#8220;selectivity&#8221; is a number between 0 and 1 and, all other things being equal, the higher the selectivity for an indexed access path, the less likely the optimizer is to choose it.</p>
<p>The text switches from the cost to the run time without making the point that the cost is an estimate of the run time and the run time could be completely different.  It is (for example) easy to have a very high clustering_factor and still have a very fast run time because of the way that Oracle generates the clustering_factor. </p>
<p>The trouble with getting things right is that it takes so much longer, and so much more effort than just putting a few vaguely relevant words in roughly the right order.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
