Guest bisi
Posted February 21, 2008

Hello,

We have three Windows 2003 servers in our company which, once or twice a week, simply stop working during the night. We monitor them with Nagios pings, and from one moment to the next they stop responding, even though the machine is still running. After a reboot, everything works fine again. Since it always happens at night, I suspect the backup (we use TSM), but I am not sure about this.

Here is the extract from the event log when the problem appears.

The first message to appear is:

"The browser service was unable to retrieve a list of servers from the browser master \\<Domain Controller> on the network \Device\NetBT_Tcpip_{1F73F5B1-3468-4FF2-8CF5-D16230ED4EB5}.

Browser master: \\<Domain Controller>
Network: \Device\NetBT_Tcpip_{1F73F5B1-3468-4FF2-8CF5-D16230ED4EB5}"

One minute later:

"The browser service has failed to retrieve the backup list too many times on transport \Device\NetBT_Tcpip_{1F73F5B1-3468-4FF2-8CF5-D16230ED4EB5}. The backup browser is stopping."

A few minutes later:

"This computer was not able to set up a secure session with a domain controller in domain <domain name> due to the following:
Not enough storage is available to process this command.
This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator."

About one hour passes between the previous event and the next one. In the meantime, the server has stopped responding to ping and is unreachable from the network, but it is still running.

This is the event that appears after an hour:

"The server was unable to allocate from the system nonpaged pool because the pool was empty."

Later:

"The server {73E709EA-5D93-4B2E-BBB0-99B7938DA9E4} did not register with DCOM within the required timeout."

"The server was unable to allocate from the system nonpaged pool because the pool was empty."
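The sequence of messages in this post can be checked for mechanically once the event log has been exported to text. What follows is only a minimal sketch in Python: the function names and labels are hypothetical, and the marker phrases are simply copied from the event texts quoted above. Treating "Not enough storage is available" as a hint of kernel memory shortage is an assumption, not something the log itself states.

```python
# Illustrative only: marker phrases come from the event texts quoted in the post;
# the labels and function names are hypothetical.
MARKERS = [
    ("nonpaged_pool_empty", "unable to allocate from the system nonpaged pool"),
    ("not_enough_storage", "not enough storage is available"),
    ("browser_failure", "the browser service"),
    ("dcom_timeout", "did not register with dcom"),
]


def classify(message):
    """Return the label of the first marker found in the message, else None."""
    # Lower-case the text and collapse whitespace so wrapped lines still match.
    text = " ".join(message.lower().split())
    for label, phrase in MARKERS:
        if phrase in text:
            return label
    return None


def suggests_memory_exhaustion(messages):
    """True if any message reports an empty nonpaged pool or 'not enough storage'."""
    labels = {classify(m) for m in messages}
    return bool(labels & {"nonpaged_pool_empty", "not_enough_storage"})
```

Run against the messages above, a check like this would flag the log well before the server drops off the network, since the "Not enough storage" message appears about an hour earlier than the first nonpaged pool error.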
Guest Meinolf Weber
Posted February 21, 2008

Re: Windows servers stops responding

Hello bisi,

Did you run dcdiag /v and netdiag /v on all machines to check for errors? I assume they are domain controllers. If you get errors, please post the complete output here.

Best regards
Meinolf Weber

Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights.
** Please do NOT email, only reply to Newsgroups
** HELP us help YOU!!! http://www.blakjak.demon.co.uk/mul_crss.htm
Guest paulreims@gmail.com
Posted February 21, 2008

Re: Windows servers stops responding

Hello,

The machines are not domain controllers. One mainly runs SQL Server for a few applications, the second hosts XAP, and the third also hosts some applications.

netdiag did not show any errors; everything looked OK to me. I could try running netdiag the next time a server has the problem, but since the network connection does not work once the problem appears, I don't know whether I would get any useful information.

Any other ideas?

Best regards
CB
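Because the network is unusable once the problem starts, one option is to log locally and read the data after the reboot. The sketch below assumes the nonpaged pool counter has been recorded to a CSV file on the server itself, for example with typeperf sampling "\Memory\Pool Nonpaged Bytes"; the file layout (one header row, then timestamp and value per row) and the function names are assumptions for illustration, not something described in this thread.

```python
import csv
import io


def nonpaged_pool_samples(csv_text):
    """Parse (timestamp, bytes) pairs from typeperf-style CSV text."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the header row that names the counters
    samples = []
    for row in rows:
        if len(row) < 2:
            continue
        try:
            samples.append((row[0], float(row[1])))
        except ValueError:
            continue  # blank or malformed sample value
    return samples


def pool_growth(csv_text):
    """Return last-minus-first sample in bytes, or None with fewer than two samples."""
    samples = nonpaged_pool_samples(csv_text)
    if len(samples) < 2:
        return None
    return samples[-1][1] - samples[0][1]
```

A steadily positive result over the night would support a leak in nonpaged pool; a flat curve would point elsewhere.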
Guest Danny Sanders
Posted February 21, 2008

Re: Windows servers stops responding

What model of server are we talking about? We have about six HP servers (ML 380s, I think) that display the same symptoms. We've more or less narrowed it down to the NIC.

hth
DDS
Guest paulreims@gmail.com
Posted February 25, 2008

Re: Windows servers stops responding

Hello,

The servers are all HP DL servers with the "HP NC7781 Gigabit Server Adapter":
- The first one has driver version 7.103.0.0 and does NOT do network teaming.
- The second one uses driver version 6.64.0.0 on both cards and does network teaming. The teaming driver is version 7.41.
- I do not have access to the third server at the moment.

Does this confirm your theory about the network card?

Best regards
CB
Guest Danny Sanders
Posted February 26, 2008

Re: Windows servers stops responding

HP is now owning up to the problem and says a new driver is coming out to fix it. For now they say to disable the TCP Offload Engine (TOE). It turns out that having it enabled causes a memory leak.

hth
DDS
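The thread does not say how HP recommends disabling TOE. As a rough sketch only, assuming Windows Server 2003 SP2 with the Scalable Networking Pack, where TCP Chimney offload can be switched off through netsh, the step could be scripted as below. The function name is hypothetical, and the NIC's own driver properties may expose a separate offload setting that this does not touch.

```python
import subprocess

# Assumption: Windows Server 2003 SP2 (Scalable Networking Pack), where TCP Chimney
# offload is toggled through netsh. The thread itself does not name a command.
DISABLE_CHIMNEY = ["netsh", "int", "ip", "set", "chimney", "DISABLED"]


def disable_tcp_offload(dry_run=True):
    """Return the command; run it only when dry_run is False."""
    if not dry_run:
        # Requires administrative rights on the affected server.
        subprocess.run(DISABLE_CHIMNEY, check=True)
    return list(DISABLE_CHIMNEY)
```

The default is a dry run that only returns the command, so it can be reviewed before anything is changed on a production server.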