True or False – Wait Events

March 31, 2010

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

A continuation of this week's series of true or false quizzes – maybe this one is too easy.  Today's quiz is on the topic of wait events.  Assuming that the questions apply to a release of Oracle that has been available for up to five years, state why you believe that the answer to each question is true, or why you believe that it is false.  Any answers that rely on logical fallacies will be scored as incorrect.

1. Assume that a database instance is being prepared for 1,500 to 2,000 sessions.  Given an unlimited budget, it is possible to configure the database instance to completely eliminate wait events.

2. Consider two identical Oracle databases with a single session connected to each database instance.  Each of the two sessions submits exactly the same SQL statement, with the first session experiencing no wait events while executing the query, and the second session experiencing nearly continuous I/O wait events.  The lack of wait events implies that the query execution for the first session is optimal, and should be the performance target for the second session.

3. Idle wait events are insignificant in performance tuning exercises, and should be ignored.

4. For every one minute of elapsed time, each CPU in the server is capable of accumulating 60 seconds of CPU wait time.

Did I mention that I dislike true/false type questions?  But then these are not simple true/false questions.


5 responses

31 03 2010
Charles Hooper

#4 False. Technically, a single CPU is able to produce 60 seconds of usable CPU time for every one minute of elapsed time. While the time spent processing on the CPU may be viewed from the end-user’s perspective as a CPU wait (and may be presented as such on a Statspack/AWR report), from an Oracle wait event perspective “running” on the CPU does not qualify as an Oracle wait event. While a session is in a wait event, that session may still consume some CPU time (probably in kernel mode). From an operating system perspective, there could be many processes sitting in the CPU run queue – is this the meaning of the invented term “CPU wait time”? Even if it is, there would not be a hard upper limit of 60 accumulated seconds per minute for all processes waiting in the run queue.
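To see that CPU time is tracked separately from wait event time, a rough sketch (assuming Oracle 10g or later, where the time model statistics and wait classes are available) is to compare the instance’s accumulated DB CPU time with its accumulated non-idle wait time:

-- accumulated CPU time consumed by user calls, from the time model
-- (the VALUE column is reported in microseconds)
SELECT value / 1000000 AS db_cpu_seconds
  FROM V$SYS_TIME_MODEL
 WHERE stat_name = 'DB CPU';

-- accumulated wait time by wait class, excluding the Idle class
SELECT wait_class,
       SUM(time_waited_micro) / 1000000 AS seconds_waited
  FROM V$SYSTEM_EVENT
 WHERE wait_class <> 'Idle'
 GROUP BY wait_class
 ORDER BY 2 DESC;

Note that CPU time never appears in V$SYSTEM_EVENT at all – which is the point of the answer above.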

2 04 2010
Log Buffer #184, a Carnival of the Vanities for DBAs | The Pythian Blog

[...] It is always good to start the day with a pop quiz to get the brain into gear: Charles Hooper posted a 3-part series with seemingly innocent True/False questions. He covers sorting, SQL tuning and wait events. [...]

5 04 2010
Charles Hooper

#1 False. Think about this for a moment – the lack of wait events for a single session indicates that the session is doing nothing except consuming CPU time (or sitting in the CPU run queue, or somehow sleeping/suspended). Thus, there are only two ways to completely eliminate wait events when the instance is configured for 1,500 to 2,000 sessions: a) SHUTDOWN ABORT; b) put all 1,500 to 2,000 sessions into a tight PL/SQL loop that does not query the database and never returns a result to the client (a minimal sketch of such a loop appears below); returning a row to the client causes SQL*Net type wait events while the server waits for the next request from the client computer. Translation: assuming that none of the sessions are executing using parallel query, for each second when the sessions are not burning the server’s CPU cycles, the sessions will accumulate time in one of the 1,118 (as of Oracle 11.2.0.1) wait events, whether that is a server-side wait such as ‘db file sequential read’ or a client-side wait such as ‘SQL*Net message from client’. So, with 2,000 sessions connected for 1 hour, that is 2,000 * 3,600 = 7,200,000 session-seconds, and if the server had 100 CPUs, only 360,000 seconds of that 7,200,000 session-seconds could truly be on the CPUs – the rest of the time will be spent in wait events (or “lost” due to the session processes sitting in the CPU run queues).
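As a minimal sketch of option (b) – the loop bound here is arbitrary and purely illustrative – a pure CPU spin such as the following makes no database calls and no round trips to the client, so the session accumulates CPU time rather than wait event time:

DECLARE
  n NUMBER := 0;
BEGIN
  -- no SQL, no fetches, no client round trips: just CPU consumption
  FOR i IN 1 .. 1000000000 LOOP
    n := n + 1;
  END LOOP;
END;
/

-- the number of defined wait events varies by release;
-- on Oracle 11.2.0.1 this query returns 1,118
SELECT COUNT(*) FROM V$EVENT_NAME;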

5 04 2010
Charles Hooper

#3 False. At the system-wide level, the usefulness of the “idle” wait events is limited, at best. Through the life cycle of a session (between its start and termination), the accumulated time for the idle wait events for that session is added to the system-wide totals for those wait events; when the session terminates, that accumulated time is not subtracted from the system-wide totals. However, you could look at the delta values for the ‘SQL*Net message from client’ wait event, for instance, to determine roughly how many idle sessions were connected to the database – but that calculation could be grossly distorted if many of the sessions were connected over a high-latency network connection and fetched only one row at a time. The calculation could also be grossly overstated if a bug in a middleware product caused a sudden burst of hundreds of connection requests that appeared and was fixed between the times when the beginning and ending statistics were captured.

However, at the session level, particularly in a 10046 extended SQL trace file, the idle wait events are a key to unlocking the real reason for slow performance. In such a trace file, the SQL*Net idle wait events indicate, between two time stamps, how much of the elapsed time was spent either in the network or on the client, and by analyzing the elapsed time and position of each of the individual waits it should be possible to determine whether the delay is caused by end-user “think time”, client-side computer bottlenecks, or potentially network performance issues.
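For reference, one documented way to produce such a trace for the current session (level 8 captures wait events; level 12 adds bind variable values) looks like this:

ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';
-- ... execute the SQL statements of interest ...
ALTER SESSION SET EVENTS '10046 trace name context off';

-- from Oracle 10g onward, DBMS_MONITOR offers a supported alternative
-- for the current session:
EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(waits => TRUE, binds => TRUE);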

5 04 2010
Charles Hooper

#2 False. Tricky question – just because the two Oracle databases are identical, the Oracle instances are not necessarily configured the same (the instances could have very different initialization parameters), nor does the server hardware need to be identical. It very well could be that the first server picked an unfortunate execution plan that makes extensive use of nested loop joins for row sources containing a large number of rows, while the second server had a cold buffer cache and/or used full table scans, hash joins, or parallel query to retrieve the results significantly faster than the first server, even though the second server experienced wait events.

Not enough information is supplied (there are too many unknowns) to be able to answer this question as True.
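To investigate what the second session is actually waiting on, a rough sketch (the bind variable :sid is a placeholder for the SID of the session under examination) is to examine its wait breakdown and the actual execution plan:

-- per-session wait event totals, most expensive first
SELECT event,
       total_waits,
       time_waited_micro / 1000000 AS seconds_waited
  FROM V$SESSION_EVENT
 WHERE sid = :sid
 ORDER BY time_waited_micro DESC;

-- in the session that ran the query (10g and later), display the last
-- execution plan with actual row source statistics, assuming
-- STATISTICS_LEVEL = ALL or the GATHER_PLAN_STATISTICS hint was used
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));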
