Network Monitoring Experimentations 1

December 15, 2009 

Wireshark, formerly known as Ethereal, is a free network packet capture program available for several operating system platforms:
http://www.wireshark.org/download.html

Wireshark is able to reveal a wide range of network communication problems.

The problems might include self-inflicted problems caused by changes that were intended to fix a different performance problem, such as the changes suggested here:
  http://www.dslreports.com/faq/tweaks 
or here:
  http://download.microsoft.com/download/2/8/0/2800a518-7ac6-4aac-bd85-74d2c52e1ec6/tuning.doc

The problems might include client computers with 100Mb network cards connected directly to gigabit switches, which can result in unexpected packet retransmits until the client computer is moved to a 100Mb switch that in turn connects to the gigabit switch (this is a rare problem; somewhere I have a Wireshark capture that shows this behavior).

The problems might include failing network equipment, bad network wiring, or excessive electromagnetic interference (EMI) in the environment that distorts traffic on CAT 5e, CAT 6, and wireless connections.

The problems might include a client application that unexpectedly takes 10 to 20 seconds to “log in” when it should take 1 second or less.

The problems might include inappropriate fetch array sizes, poor choices for the SDU size, or forcing jumbo Ethernet frames through intermediate network hardware that does not support frame sizes larger than roughly 1500 bytes.

The problems might include high-latency network or WAN connections.

First, let’s look at a Wireshark capture of a successful connection attempt from a client computer connected to the network by an 802.11G (54Mb) wireless connection:

There is nothing terribly out of the ordinary with the above.  In packet 1 the client computer sent an ARP broadcast packet asking for the MAC address of the network card that is associated with IP address 192.185.10.52, with the response to be returned to the client computer at IP address 192.185.10.51.  Roughly 0.002 seconds later the client computer attempted to connect to the database server using the TNS protocol.  Roughly 0.1 seconds later the connection completed.  Roughly 0.06 seconds after the connection attempt finished, the client computer started sending queries to the database server.  There were a couple of delays between the submission of a SQL statement and the response from the server, such as the 0.12 second delay between packets 26 and 27, but nothing significant.

It is quite possible that network problems will occur, as in the following:

In the above, notice that the server (IP address 192.185.10.52) is resending packets that it assumed were lost in transit, because a long time passed without an ACK arriving from the client computer (the ACK packet itself may have been lost).  Notice also the long delays between packets, which might be a symptom of network problems, or of server-side CPU usage or wait events that could be captured in a 10046 extended SQL trace.
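Wireshark marks these conditions itself (for example, the display filter tcp.analysis.retransmission), but the same two checks are easy to sketch by hand. The following Python is a minimal illustration only: the Packet fields, the 0.2 second threshold, and the four sample records are made-up assumptions for demonstration, not values taken from the capture discussed above.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    number: int    # packet number in the capture
    time: float    # seconds since the start of the capture
    src: str       # source IP address
    seq: int       # TCP sequence number
    length: int    # TCP payload length in bytes (0 for a bare ACK)

def find_retransmissions(packets):
    """Return packet numbers whose (source, sequence number) carried data before."""
    seen = set()
    repeats = []
    for p in packets:
        if p.length == 0:
            continue  # bare ACKs legitimately repeat sequence numbers
        key = (p.src, p.seq)
        if key in seen:
            repeats.append(p.number)
        seen.add(key)
    return repeats

def find_gaps(packets, threshold=0.2):
    """Return (previous packet, next packet) pairs separated by >= threshold seconds."""
    return [(a.number, b.number)
            for a, b in zip(packets, packets[1:])
            if b.time - a.time >= threshold]

# Hypothetical sample records, invented purely to exercise the functions.
sample = [
    Packet(1, 0.000, "192.185.10.52", 1, 1460),
    Packet(2, 0.001, "192.185.10.52", 1461, 1460),
    Packet(3, 0.310, "192.185.10.52", 1, 1460),   # same seq as packet 1: a resend
    Packet(4, 0.311, "192.185.10.51", 1, 0),      # bare ACK from the client
]

print(find_retransmissions(sample))  # [3]
print(find_gaps(sample))             # [(2, 3)]
```

A long gap immediately followed by a resent sequence number is the pattern described above; a long gap with no resend points more toward the server being busy than toward packet loss.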

Next, let’s take a look at the effects of adjusting the fetch array size (number of rows retrieved in each fetch call – ARRAYSIZE setting in SQL*Plus) when executing a SQL statement in SQL*Plus that selects from a table having an average row length of 245 bytes, with the client on a wired 100Mb connection, and with the standard Oracle SDU size.  The server is still at IP address 192.185.10.52, the client computer (same as used above) is now at IP address 192.185.10.53.
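Before looking at the captures, a rough back-of-the-envelope estimate shows why the fetch array size matters. This is only a sketch under stated assumptions: the 245 byte average row length comes from the test table described above, while the 10,000 row result set and the 1460 byte TCP payload per packet (typical for a 1500 byte MTU) are assumptions, and the estimate ignores TNS/SDU packaging overhead entirely.

```python
import math

AVG_ROW_BYTES = 245   # average row length of the test table (from the post)
MSS_BYTES = 1460      # assumption: TCP payload per packet with a 1500-byte MTU

def fetch_estimate(total_rows, array_size):
    """Return (fetch calls, row bytes per fetch, minimum packets per fetch)."""
    fetch_calls = math.ceil(total_rows / array_size)
    bytes_per_fetch = array_size * AVG_ROW_BYTES
    min_packets = math.ceil(bytes_per_fetch / MSS_BYTES)
    return fetch_calls, bytes_per_fetch, min_packets

# 10,000 rows is a hypothetical result set size chosen for illustration.
for array_size in (1, 15, 100, 1000):
    calls, nbytes, packets = fetch_estimate(10_000, array_size)
    print(f"ARRAYSIZE {array_size:>4}: {calls:>5} fetch calls, "
          f"~{nbytes:>6} bytes per fetch, at least {packets} packet(s) per fetch")
```

Under these assumptions an array size of 1 needs 10,000 client/server round trips with a mostly empty packet each time, while an array size of 100 needs 100 round trips of about 17 full packets each.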

Fetch Array Size 1:

Fetch Array Size 15:
 

Fetch Array Size 100:

Fetch Array Size 1000:

In the above, you might notice that after every two packets that are sent by the server, the client computer sends back an ACK packet – this is typical behavior.  So, what happens when someone “optimizes” the network card parameters?

Fetch Array Size 15 with “Optimized” ACK frequency (Same Data):

Fetch Array Size 100 with “Optimized” ACK frequency (Same Data):

In the above, notice the number of packets transmitted before the client sends an ACK packet, and typically just before the client sends the ACK, there is a delay of roughly 0.2 seconds.  OK, a little slower.  So, what happens when we switch from the table with the average row length of 245 bytes to a table containing roughly 1MB to 2MB JPEG pictures?  Compare how long it takes to reach the 35th packet in the following two screenshots:

Fetch Array Size 100:

Fetch Array Size 100 with “Optimized” ACK frequency (Same Data):

(Late Additions to the Post)

Fetch Array Size 100 – Table with Pictures (802.11G):

Fetch Array Size 100 – Table with Pictures, Optimized ACK (802.11G):

SQL*Plus SELECT from the table with the average row length of 245 bytes using the 802.11G (54Mb) connection:

Fetch Array Size 1 with “Optimized” ACK frequency:

Fetch Array Size 15 with “Optimized” ACK frequency (Same Data):

Fetch Array Size 100 with “Optimized” ACK frequency (Same Data):

Fetch Array Size 1000 with “Optimized” ACK frequency (Same Data):


7 responses

2 01 2010
Blogroll Report 11/12/2009-18/12/2009 « Coskan’s Approach to Oracle

[...] to monitor networking of database operations with Wireshark Charles Hooper-Network Monitoring Experimentations 1 Charles Hooper-Network Monitoring Experimentations [...]

27 09 2010
OOW10: All Over « ORAganism

[...] – blog post (first of a 7 part series) from Charles [...]

9 03 2011
ghassem koolivand

Dear Charles
Thank you for this article. It’s very useful.
I have a question about TCP ACK frequency. I tried everything that I understood from the article, but when I increase the TCP ACK frequency from the default value (2) to 4 or 13, my response time increases, although the packet count decreases. What is my problem? Why does this happen?
If needed, I can send the Wireshark dump to you (if I have your email).

Thank you so much
Ghassem

9 03 2011
Charles Hooper

Ghassem,

Excellent question, and I had hoped that this article addressed the issue that you mentioned. I think that my article might be a little confusing because I used the term “optimized” to describe a situation where a person *thought* that they were improving performance by adjusting the ACK frequency from the default value of 2 to something like 4 or 13 – because that is what some articles suggest changing. The ACK problem can be seen in a couple of the screen captures in this article. For example, this screen capture shows the default ACK value of 2:

Compare the above screen capture with this screen capture that shows what happened when the ACK value was set to 13:

The screen captures show the first 35 packets of the Wireshark capture. In the picture that shows the capture with the ACK value set at 2, the first packet is seen at 34.097254 and the 35th packet is seen at 34.105656, so the elapsed time is roughly 0.008402 seconds. In the picture that shows the ACK value set at 13, the first packet is seen at 45.184532 and the 35th packet is seen at 45.591271, so the elapsed time is roughly 0.406739 seconds, which is 0.398337 seconds longer than with the default ACK value. Why did this happen? Go back through the picture that shows the ACK value set at 13 and see where the 0.406739 seconds were lost. We see a roughly 0.2 second delay between packets 11 and 12, and a roughly 0.2 second delay between packets 22 and 23.
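The arithmetic above can be double-checked with a few lines of Python. This is just a sketch; the four timestamps are the ones quoted from the two screen captures, and Decimal is used so that the subtraction is exact rather than subject to binary floating-point rounding.

```python
from decimal import Decimal

def elapsed(first_packet, last_packet):
    """Elapsed seconds between two Wireshark timestamps given as strings."""
    return Decimal(last_packet) - Decimal(first_packet)

ack_2 = elapsed("34.097254", "34.105656")    # default ACK value of 2
ack_13 = elapsed("45.184532", "45.591271")   # ACK value set at 13

print(ack_2)            # 0.008402
print(ack_13)           # 0.406739
print(ack_13 - ack_2)   # 0.398337
```

Two delays of roughly 0.2 seconds each account for nearly all of the 0.398337 second difference.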

0.2 seconds is essentially a magic number of sorts – there is a 0.2 second maximum timeout before a receiver must send an ACK. The receiving computer was expecting to receive a total of 13 packets before it had to send back an ACK, while the sending computer halted the transmission of packets before the 13th packet was sent because it had not received confirmation that the previous packets were received correctly. The receiver therefore sat idle until the 0.2 second timeout expired and only then sent its ACK. This problem will happen when the ACK frequency is not set the same on the receiving and sending computers, or in some cases if fewer than the expected number of packets were transmitted (search for the term Nagle http://www.google.com/search?q=nagle ).

10 03 2011
ghassem koolivand

Hi Charles
Thank you so much for your exact and useful reply.
The deadlock between Nagle’s algorithm and delayed ACK is interesting, and I think it is my most important problem, but I did not understand what you meant by “This problem will happen when the ACK frequency is not set the same on the receiving and sending computers”. Why is it important that the ACK frequency be the same on the sender and the receiver?
If possible, could you explain it?
Thanks
GHASSEM

10 03 2011
Charles Hooper

Ghassem,

Think about it this way:
You are playing a game of catch with your friend who is on the other side of a tall fence. You have a basket full of baseballs, each of which is numbered. You know that after every two baseballs you throw, your friend will shout back: “I caught both of the baseballs”. If he only receives one of the baseballs, eventually (0.2 seconds for the computer network communication) he will shout over the fence “I only received the second baseball, the first one was lost” – at this point you will write a #1 on a baseball and throw that baseball over the fence; when this happens, your friend will immediately shout back “I received the first baseball, please continue throwing”.

Now think about this small change to the situation:
You think that your friend will shout back after every two baseballs are received, while he thinks that he should only tell you when he receives 13 baseballs. You throw the first two baseballs and wait for an acknowledgement from your friend that the first two baseballs were received. Your friend is still waiting for baseballs 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, and 13 before telling you that he received the first two. After a while (0.2 seconds for the computer network communication), your friend gives up and shouts over the fence “I received baseballs #1 and #2, but baseballs 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, and 13 were lost, please send those again”. At this point you might shout back at your friend that you never threw baseballs 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, and 13.

The computer network communication is a bit more complicated than the above example. In computer network communication, you could have been in the process of throwing baseballs #3 & #4, and #5 & #6 while waiting for your friend to acknowledge the receipt of baseballs #1 & #2, but you would also expect to receive ACKs for baseballs #3 & #4, and #5 & #6.
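The baseball game can be turned into a toy simulation. This is a deliberately simplified sketch, not a model of real TCP: it ignores sliding windows, congestion control, and Nagle's algorithm. The 0.2 second timeout comes from the discussion above; the 250 microseconds per packet and the 24 packet transfer are arbitrary assumptions, and the function name is hypothetical. Times are kept as integer microseconds so that the totals are exact.

```python
DELAYED_ACK_TIMEOUT_US = 200_000   # 0.2 seconds, as discussed above
SEGMENT_TIME_US = 250              # assumption: time to deliver one packet

def simulate_transfer(total_segments, sender_burst, receiver_ack_every):
    """Return (elapsed microseconds, stalls) for a stop-and-wait style transfer.

    The sender throws `sender_burst` packets, then waits for an ACK.
    The receiver ACKs at once if the burst reached `receiver_ack_every`
    packets; otherwise it waits for the delayed-ACK timeout first.
    """
    clock_us = 0
    stalls = 0
    sent = 0
    while sent < total_segments:
        burst = min(sender_burst, total_segments - sent)
        clock_us += burst * SEGMENT_TIME_US
        sent += burst
        if burst < receiver_ack_every:
            clock_us += DELAYED_ACK_TIMEOUT_US   # receiver waits out the timer
            stalls += 1
    return clock_us, stalls

# Both sides agree on "ACK every 2 packets":
print(simulate_transfer(24, sender_burst=2, receiver_ack_every=2))    # (6000, 0)
# Sender expects an ACK after 2 packets, receiver waits for 13:
print(simulate_transfer(24, sender_burst=2, receiver_ack_every=13))   # (2406000, 12)
```

In the mismatched case every pair of packets costs an extra 0.2 seconds, so the same 24 packets take about 2.4 seconds instead of 0.006 seconds.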

12 03 2011
ghassem koolivand

Hi Charles
Thank you
I understand now, but I have a problem setting it on Linux and HP-UX. I can set it on Windows, but unfortunately I have not found a parameter to set the TCP ACK frequency on those platforms. Could you help me set it?

Ghassem
