Network Monitoring Experimentations 3

January 5, 2010

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

Previously, I showed a couple of examples of network communication working well, as well as a couple of counter-examples where network problems interfered with Oracle Database communication.  Problems like those shown in the previous articles are not limited to database traffic; they can affect all network communication between two devices attached to the network.

Here is a question that I asked job candidates who were interviewing for a position at the company.  I was not so much looking for a correct answer as for the thought process demonstrated by the candidate:

—————————————————————————————————–

Situation: All servers are connected to an HP 4160GL 60-port gigabit managed switch using CAT 5e cables.  A new Dell PowerEdge 2850 server running Windows 2003 Standard Edition, with all security updates applied, is installed.  The PowerEdge 2850 indicated that its network card was connected at gigabit speeds, and the HP 4160GL switch also reported that the server was connected at gigabit speeds.  Copying files to the new server from the Dell PowerEdge 1750 server, which runs Red Hat Enterprise Linux ES 3 with SAMBA, was extremely slow.  The PowerEdge 1750 indicated that its network card was connected at gigabit speeds, and the HP 4160GL switch also reported that the server was connected at gigabit speeds.

Initial Analysis: Pull copying the Windows 2000 Service Pack 4 file, which is roughly 132MB in size, to the PowerEdge 2850 (Windows 2003) from the PowerEdge 1750 (RH Linux) using Windows Explorer required more than 45 minutes to complete.  Pull copying the same file from the PowerEdge 1750 to a five-year-old laptop running Windows 95 with a 100Mbps connection, again using Windows Explorer, required roughly 22 seconds to complete (and 17 seconds on a desktop computer with a 100Mbps connection).  Pull copying the same file from the PowerEdge 1750 to a year-old desktop computer running Windows XP with a gigabit connection, using Windows Explorer, required roughly 4 seconds to complete.
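
To put the copy times on a common scale, the short Python sketch below converts each one into an effective throughput.  The file size and times come from the scenario above; the function name and the assumption that 1 MB equals 1,000,000 bytes are mine, and since the slow copy took "more than 45 minutes", its result is an upper bound on the real rate.

```python
# Effective throughput for each copy described in the scenario.
# Assumption: 1 MB = 1,000,000 bytes, so 1 MB = 8 megabits.
FILE_SIZE_MB = 132  # "roughly 132MB" per the scenario

# Elapsed times in seconds, taken from the scenario text.
copy_times_seconds = {
    "PowerEdge 2850, gigabit, Windows 2003": 45 * 60,  # lower bound on the time
    "Laptop, 100Mbps, Windows 95": 22,
    "Desktop, 100Mbps": 17,
    "Desktop, gigabit, Windows XP": 4,
}


def throughput_mbps(size_mb, seconds):
    """Return the average transfer rate in megabits per second."""
    return size_mb * 8 / seconds


for client, seconds in copy_times_seconds.items():
    rate = throughput_mbps(FILE_SIZE_MB, seconds)
    print(f"{client}: {rate:.2f} Mbps")
```

Under those assumptions the new server pulls the file at roughly 0.4 Mbps, while the two 100Mbps clients manage about 48 and 62 Mbps and the gigabit Windows XP desktop about 264 Mbps.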

What would you do to troubleshoot this problem?  What is the cause of the problem?  What is the solution?

This is the start of the problematic transfer, about 1.3 seconds after logging was initiated with Wireshark (actually Ethereal at the time) – IP address .47 is the Windows 2003 Server with Service Pack 1, IP address .46 is the Red Hat Enterprise Linux ES 3 Server w/ SAMBA:

Roughly 4 seconds into the copy, only about 200 packets were logged in the packet capture program:

The problem was corrected and then the transfer was captured again.  This is the start of the fixed run (about 1 second after logging was initiated) – IP address .47 is the Windows 2003 Server with Service Pack 1, IP address .46 is the Red Hat Enterprise Linux ES 3 Server w/ SAMBA:

Roughly 3.4 seconds after the file copy started, the file copy completed.  Roughly 80,745 packets were transmitted across the network during the copy operation:
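
The packet counts quoted for the two captures can also be compared as rates.  The Python sketch below is only a rough calculation from the approximate figures in the text (about 200 packets in the first 4 seconds of the problematic copy, and 80,745 packets in 3.4 seconds for the fixed copy); the function name is my own.

```python
# Approximate packet rates for the two captures, using the figures quoted in the text.
def packets_per_second(packet_count, elapsed_seconds):
    """Return the average number of packets logged per second."""
    return packet_count / elapsed_seconds


# Problematic run: about 200 packets in roughly the first 4 seconds.
slow_rate = packets_per_second(200, 4)

# Fixed run: roughly 80,745 packets over the 3.4 second copy.
fast_rate = packets_per_second(80745, 3.4)

print(f"Problematic run: {slow_rate:.0f} packets/second")
print(f"Fixed run: {fast_rate:.0f} packets/second")
print(f"Ratio: about {fast_rate / slow_rate:.0f}x")
```

With these numbers the problematic capture averages about 50 packets per second, against well over 20,000 packets per second once the problem was corrected.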

—————————————————————————————————–

What would you do to troubleshoot this problem?  What is the cause of the problem?  What is the solution?