Guest Intermittent Network Pauses Posted May 6, 2008

Hi there,

We have an HP c-Class chassis with four blade servers, all running Windows Server 2003. Two of these servers are clustered in active/passive mode. The cluster is connected to an HP EVA3000 SAN array, and its network interface connects to a Cisco 3750 stack.

For the past several months, clients running Windows XP have been losing their network drives. After a pause of approximately 10-15 seconds they reconnect, but in many cases the workstation has to be rebooted.

We have performed a number of tests to isolate the root cause. The network was fully checked and ruled out as the cause. Packet captures did not reveal anything unusual, except traffic from the cluster stopping during the pause.

One of the tests we performed on the cluster nodes was a copy test. We ran perfmon and copied files between local drives and SAN drives:

Test 1: Copy a one-gigabyte file from local C: to D:
Test 2: Copy the same file from local C: to the SAN
Test 3: Copy the same file from the SAN to local C:

During all of the above tests, perfmon showed CPU utilisation and network interface utilisation dropping to zero at exactly the same time. While CPU utilisation recovered almost immediately, network utilisation stayed at 0% for the duration of the copy. These tests caused exactly the same outages that our users experience. While this was happening I could still ping the server at all times.

The servers are running Windows Server 2003 SP2. RSS is disabled on the NICs. TOE is disabled. Teaming is disabled as well.

Has anyone seen this or a similar issue? Please help.

Thank you
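The copy test described above can also be scripted, so that each stall is recorded with its position in the file instead of being read off a perfmon graph. What follows is only a minimal Python sketch, not what the original poster ran: the function name `timed_copy`, the 1 MiB chunk size, and the one-second stall threshold are assumptions chosen for illustration. It copies a file in chunks and notes every chunk that takes longer than the threshold. Operating system write caching may hide short stalls, so treat the result as a rough indicator.

```python
import time


def timed_copy(src_path, dst_path, chunk_size=1024 * 1024, stall_seconds=1.0):
    """Copy src_path to dst_path in chunks and report slow chunks.

    Returns (total_bytes, stalls), where stalls is a list of
    (byte_offset, seconds) for every chunk whose read+write took at
    least stall_seconds. All defaults are illustrative assumptions.
    """
    total = 0
    stalls = []
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            start = time.perf_counter()
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            elapsed = time.perf_counter() - start
            if elapsed >= stall_seconds:
                stalls.append((total, elapsed))
            total += len(chunk)
    return total, stalls
```

Pointing `src_path` or `dst_path` at a SAN-backed drive while watching the returned stall list would roughly mirror Tests 2 and 3.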
Guest Squidi Posted May 8, 2008
Re: Intermittent Network Pauses

I have this exact same problem. My setup is Windows 2003 R2 SP2 x64 in an active/passive configuration, attached to a Xiotech array. The systems are Dell 2950 servers, quad-processor with 8 GB of memory. Every so often, network activity to the servers pauses for 10-15 seconds. We had this problem pre-SP2 and upgraded to SP2 to try to mitigate it.

Things I have discovered:

1. It seems to correlate with periods when many connections are timing out. If you watch tcpmon ( sysinternals tool ) and see a bunch of TIME_WAIT connections, then when the system pauses the number of TIME_WAIT connections drops drastically. But correlation isn't causation. I think this is a side effect, not the problem.

2. I get the feeling that it has something to do with RPC getting hung up doing reverse lookups. But I can't prove it.

Have you disabled the TCP chimney stuff?

--
Squidi
------------------------------------------------------------------------
Squidi's Profile: http://forums.techarena.in/member.php?userid=48647
View this thread: http://forums.techarena.in/showthread.php?t=962570
http://forums.techarena.in
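The TIME_WAIT observation above is easier to check if the counts are captured as numbers over time. The following is a rough Python sketch under stated assumptions: it expects text in the layout printed by `netstat -an` on Windows (protocol, local address, foreign address, state), and the function name `count_tcp_states` is made up for illustration. It only parses text that has already been captured; how and how often the output is collected is left open.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connection states in the text printed by `netstat -an`.

    Assumes the Windows column layout: Proto, Local Address,
    Foreign Address, State. Non-TCP rows and headers are ignored.
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # A TCP row has at least four columns, the last being the state.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states
```

Logging `count_tcp_states(...)["TIME_WAIT"]` every few seconds alongside a timestamp would show whether the drop really lines up with the pauses.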
Guest $hawn Posted May 9, 2008
Re: Intermittent Network Pauses

We've seen this exact problem as well, since around February or March this year. Our hardware is Dell PowerEdge 2650s (a two-node active/passive cluster on W2k3 SP2 32-bit Enterprise) with a Dell/EMC SAN. It seems to be somehow connected to increased network activity or large file transfers, but there are never any useful events in the logs or illuminating activity on any performance counters. Despite many, many hours spent on the phone, so far neither MS nor Dell has been able to isolate the root cause. :(
Guest Squidi Posted May 12, 2008
Re: Intermittent Network Pauses

$hawn,

Have you looked into your storport driver versions? I figure you have, but I have a very similar setup to yours at a different facility. There I overlooked the Microsoft KBs that update storport, and we had a nagging performance issue that was caused by older storport drivers.

The other guy,

Do you have VSS in use in any form? We don't use it for snapshots, but we have a backup program that uses it to back up the SQL database on one of the nodes. Do you ever see VSS messages in your event viewer?
Guest $hawn Posted May 12, 2008
Re: Intermittent Network Pauses

Yes. Actually, today we replaced an entire server with a new Dell: all brand new hardware and the latest drivers for everything. The dropouts are still happening. Dell swears there's nothing wrong with the SAN.

I'm starting to think that it could be a Windows 2003 OS update. I might try removing all updates since January to see if it goes away...
Guest Squidi Posted May 13, 2008
Re: Intermittent Network Pauses

If it works, let me know. The problem started in October (ish) of 2007 for me, and I stopped updating the machines shortly thereafter because I didn't want to throw in extra variables. Then I did all of the updates ( including SP2 ) because I was out of ideas.
Guest Intermittent Network Pauses Posted May 15, 2008
RE: Intermittent Network Pauses

We logged a call with MS and they asked us to upgrade a couple of drivers. We will be doing the upgrade in the next day or so and will post the outcome.

Drivers to be upgraded:

1) Update the elxstor driver to the latest version.
   ELXSTOR.SYS | Emulex | 5.1:20.7 | Aug 04 2006 | Storport Miniport Driver for LightPulse HBAs

2) Update hpcisss2.sys, or contact HP to get the latest ProLiant Support Pack.
   HPCISSS2.SYS | Hewlett-Packard Company | 6.8:0.32 | Jun 21 2007 | Smart Array SAS/SATA Controller Storport Driver
Guest Squidi Posted May 16, 2008
Re: Intermittent Network Pauses

So, did the new elxstor driver do the trick?
Guest Intermittent Network Pauses Posted May 18, 2008
Re: Intermittent Network Pauses

Hi Squidi,

Nah, no difference at all. Still pausing. The data has been rearranged on the SAN, but that did not help either.

Any more ideas, Squidi?
Guest Squidi Posted May 20, 2008
Re: Intermittent Network Pauses

Below is a little Perl script that keeps track of when it happens ( roughly ). Part of the problem with figuring this out is that there is no record of when or how often it happens. If you point it at a text file on a share ( smbstat \\server\share\file.txt ), it opens the file, shows it to you, closes it, and records the elapsed time in the log file.

I then look through the log file with something like:

awk '{if ( $7 > 1 ) {print $0}}' smbstat4.log

which spits out everything that took longer than a second.

No making fun of my perlfu. This is one of the ways to do it!

I also posted on Microsoft's forums. The response was basically, "Call PSS". Both of your clusters are on the supported list, so maybe you would have better luck.

---perl-------------------------------------------------
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

die "Usage: smbstat <uncpathnametofile>\n" unless @ARGV;
my $filename = $ARGV[0];

while (1) {
    open my $log, '>>', 'smbstat4.log' or die $!;
    my $before = [gettimeofday];

    # Open the file on the share, print it, and stat it to force SMB traffic.
    open my $net, '<', $filename or die $!;
    print while <$net>;
    my $size = (stat $filename)[7];
    print $size;
    close $net;

    # Log a timestamp and the elapsed seconds; the interval is field 7.
    my $elapsed = tv_interval($before, [gettimeofday]);
    print {$log} scalar(localtime), " Interval: $elapsed \n";
    close $log;

    sleep 5;
}
--------------------------------------
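For anyone without awk on a Windows box, the same log filter can be written in Python. This is a sketch that assumes the log line layout the Perl script above produces (a five-field timestamp, the word `Interval:`, then the seconds); the function name `slow_probes` and the default threshold are illustrative choices, not part of the original script.

```python
def slow_probes(log_lines, threshold_seconds=1.0):
    """Return (timestamp, seconds) for probes slower than the threshold.

    Assumes each line looks like:
        <weekday> <month> <day> <time> <year> Interval: <seconds>
    Lines that do not match this layout are skipped.
    """
    slow = []
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7 or fields[5] != "Interval:":
            continue
        try:
            seconds = float(fields[6])
        except ValueError:
            continue
        if seconds > threshold_seconds:
            slow.append((" ".join(fields[:5]), seconds))
    return slow
```

Passing an open file handle on `smbstat4.log` as `log_lines` works, since iterating a text file yields its lines.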
Guest $hawn Posted May 27, 2008
Re: Intermittent Network Pauses

We rebuilt one of our file servers, leaving off all Windows Updates released since October 2007. So far this seems to have done the trick! Now if we can only figure out which update caused the problem...
Guest Squidi Posted May 27, 2008
Re: Intermittent Network Pauses

That's great! Are you at SP2 or not? If you are in a good state, could you generate a list of updates ( with something like http://www.nirsoft.net/utils/wul.html, or whatever ) that would be a "safe list"? Maybe I could work backwards.
Guest $hawn Posted May 28, 2008
Re: Intermittent Network Pauses

Yes, we are at SP2. I used the utility you mentioned to generate this list of patches that are installed ("Safe List"), pasted below (sorry it's so ugly); it's every applicable Windows Update for Server 2k3 through Sept 2007. If you are able to isolate which update that was released after this list is causing the problem, I think a lot of us would be very grateful. :)

NAME/DESCRIPTION/INSTALL DATE/DISPLAY VERSION/UPDATE TYPE/APPLICATION/WEB LINK/UNINSTALL COMMAND/LAST MODIFIED TIME

KB914961 Windows Server 2003 Service Pack 2 5/13/2008 Service Pack Windows Server 2003 http://support.microsoft.com/?kbid=914961 C:\WINDOWS\$NtServicePackUninstall$\spuninst\spuninst.exe 5/13/2008 1:42:55 PM
KB921503 Security Update for Windows Server 2003 (KB921503) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=921503 C:\WINDOWS\$NtUninstallKB921503$\spuninst\spuninst.exe 5/13/2008 2:14:02 PM
KB924667-v2 Security Update for Windows Server 2003 (KB924667-v2) 5/13/2008 2 Update Windows Server 2003 http://support.microsoft.com/?kbid=924667-v2 C:\WINDOWS\$NtUninstallKB924667-v2$\spuninst\spuninst.exe 5/13/2008 2:16:57 PM
KB925398_WMP64 Security Update for Windows Media Player 6.4 (KB925398) N/A Windows Media Player 6.4 http://support.microsoft.com/?kbid=925398_WMP64 5/13/2008 2:14:27 PM
KB925398_WMP64 5/13/2008 Update Windows Media Player 6.4 http://support.microsoft.com/?kbid=925398_WMP64 C:\WINDOWS\$NtUninstallKB925398_WMP64$\spuninst\spuninst.exe 5/13/2008 2:14:27 PM
KB925902 Security Update for Windows Server 2003 (KB925902) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=925902 C:\WINDOWS\$NtUninstallKB925902$\spuninst\spuninst.exe 5/13/2008 2:17:06 PM
KB926122 Security Update for Windows Server 2003 (KB926122) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=926122 C:\WINDOWS\$NtUninstallKB926122$\spuninst\spuninst.exe 5/13/2008 2:13:55 PM
KB927891 Update for Windows Server 2003 (KB927891) 5/13/2008 5 Update Windows Server 2003 http://support.microsoft.com/?kbid=927891 C:\WINDOWS\$NtUninstallKB927891$\spuninst\spuninst.exe 5/13/2008 2:14:09 PM
KB929123 Security Update for Windows Server 2003 (KB929123) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=929123 C:\WINDOWS\$NtUninstallKB929123$\spuninst\spuninst.exe 5/13/2008 2:17:38 PM
KB930178 Security Update for Windows Server 2003 (KB930178) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=930178 C:\WINDOWS\$NtUninstallKB930178$\spuninst\spuninst.exe 5/13/2008 2:17:31 PM
KB931784 Security Update for Windows Server 2003 (KB931784) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=931784 C:\WINDOWS\$NtUninstallKB931784$\spuninst\spuninst.exe 5/13/2008 2:14:41 PM
KB932168 Security Update for Windows Server 2003 (KB932168) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=932168 C:\WINDOWS\$NtUninstallKB932168$\spuninst\spuninst.exe 5/13/2008 2:13:42 PM
KB933360 Update for Windows Server 2003 (KB933360) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=933360 C:\WINDOWS\$NtUninstallKB933360$\spuninst\spuninst.exe 5/13/2008 2:14:15 PM
KB935839 Security Update for Windows Server 2003 (KB935839) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=935839 C:\WINDOWS\$NtUninstallKB935839$\spuninst\spuninst.exe 5/13/2008 2:13:34 PM
KB935840 Security Update for Windows Server 2003 (KB935840) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=935840 C:\WINDOWS\$NtUninstallKB935840$\spuninst\spuninst.exe 5/13/2008 2:17:44 PM
KB936021 Security Update for Windows Server 2003 (KB936021) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=936021 C:\WINDOWS\$NtUninstallKB936021$\spuninst\spuninst.exe 5/13/2008 2:17:23 PM
KB936782 Security Update for Windows Server 2003 (KB936782) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=936782 C:\WINDOWS\$NtUninstallKB936782$\spuninst\spuninst.exe 5/13/2008 2:13:49 PM
KB937143 Security Update for Windows Server 2003 (KB937143) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=937143 C:\WINDOWS\$NtUninstallKB937143$\spuninst\spuninst.exe 5/13/2008 2:17:16 PM
KB938127 Security Update for Windows Server 2003 (KB938127) 5/13/2008 1 Update Windows Server 2003 http://support.microsoft.com/?kbid=938127 C:\WINDOWS\$NtUninstallKB938127$\spuninst\spuninst.exe 5/13/2008 2:14:33 PM
MMC30Core 5/13/2008 Update Windows Server 2003 http://support.microsoft.com/kb/MC30Core C:\WINDOWS\$NtUninstallMMC30Core$\spuninst\spuninst.exe 5/13/2008 12:57:33 PM
R2-In-band 5/13/2008 Update Windows Server 2003 http://support.microsoft.com/?kbid=R2-In-band C:\WINDOWS\$NtUninstallR2-In-band$\spuninst\spuninst.exe 5/13/2008 12:57:15 PM
R2-New-files 5/13/2008 Update Windows Server 2003 http://support.microsoft.com/?kbid=R2-New-files C:\WINDOWS\$NtUninstallR2-New-files$\spuninst\spuninst.exe 5/13/2008 12:57:44 PM
SP1 Microsoft .NET Framework 1.1 Service Pack 1 N/A SP .NETFramework http://support.microsoft.com/?kbid=SP1 5/13/2008 1:42:10 PM
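Working backwards from a safe list like the one above amounts to a set difference: collect the KB identifiers on the affected server and subtract those on the safe list. The sketch below is one possible way to do that in Python, assuming both lists are available as plain text exported from the same utility. The names `kb_ids` and `suspect_updates` are made up for illustration, and the pattern only picks up identifiers of the form KBnnnnnn (optionally with a -vN suffix), so entries such as MMC30Core are ignored.

```python
import re

# Matches identifiers such as KB921503 or KB924667-v2 (assumed format).
KB_PATTERN = re.compile(r"\bKB\d{6,7}(?:-v\d+)?\b")


def kb_ids(text):
    """Return the set of KB identifiers found anywhere in the text."""
    return set(KB_PATTERN.findall(text))


def suspect_updates(safe_list_text, current_list_text):
    """Return KB identifiers installed now but absent from the safe list."""
    return sorted(kb_ids(current_list_text) - kb_ids(safe_list_text))
```

The result is only a candidate list. Each remaining update would still have to be tested, for example by removing them one at a time.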
Guest Squidi Posted July 15, 2008
Re: Intermittent Network Pauses

Hello,

After many weeks of fiddling with everything and anything, the storage group turned on Write Cache for our LUNs on the Xiotech array and the problem disappeared.

It appears that we were getting very poor performance from our setup when writing to and reading from a single LUN. I have not figured out where the bottleneck is; however, we are seeing speeds of 8-11 MB/s from the volumes, which is terrible. I've got USB disks that work better than that.

For anyone else with similar problems: Perfmon->Physical Disk->% Idle Time shouldn't be 0 for long stretches.
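The "% Idle Time shouldn't be 0 for long stretches" check can be automated once the counter has been logged. The following Python sketch assumes the samples have already been read into a list of numbers, one per sampling interval. The function name `saturated_stretches`, the 1% threshold, and the minimum run length of three samples are illustrative assumptions, not values from the thread.

```python
def saturated_stretches(idle_percent_samples, min_samples=3, threshold=1.0):
    """Find runs where the disk's % Idle Time stays below the threshold.

    Returns a list of (start_index, length) for every run of at least
    min_samples consecutive samples under the threshold.
    """
    runs = []
    start = None
    for i, value in enumerate(idle_percent_samples):
        if value < threshold:
            if start is None:
                start = i  # a new low-idle run begins here
        else:
            if start is not None and i - start >= min_samples:
                runs.append((start, i - start))
            start = None
    # Close a run that extends to the end of the samples.
    if start is not None and len(idle_percent_samples) - start >= min_samples:
        runs.append((start, len(idle_percent_samples) - start))
    return runs
```

With a known sampling interval, multiplying each run length by that interval gives the approximate duration of each stretch where the disk was fully busy.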