Windows servers stops responding


Hello,

We have three Windows 2003 servers in our company which, once or twice a week, simply stop working during the night. We ping them with Nagios, and from one moment to the next they stop responding, even though the machine is still running. After a reboot, everything works fine again. Since it always happens at night, I suspect the backup (we are using TSM), but I am not sure about this.

Here is the extract from the event log when the problem appears:

 

The first message to appear is:

"The browser service was unable to retrieve a list of servers from the browser master \\<Domain Controller> on the network \Device\NetBT_Tcpip_{1F73F5B1-3468-4FF2-8CF5-D16230ED4EB5}.

Browser master: \\<Domain Controller>
Network: \Device\NetBT_Tcpip_{1F73F5B1-3468-4FF2-8CF5-D16230ED4EB5}"

One minute later:

"The browser service has failed to retrieve the backup list too many times on transport \Device\NetBT_Tcpip_{1F73F5B1-3468-4FF2-8CF5-D16230ED4EB5}. The backup browser is stopping."

 

 

A few minutes later:

"This computer was not able to set up a secure session with a domain controller in domain <domain name> due to the following:
Not enough storage is available to process this command.
This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator."

About one hour passes between the previous event and the next one. In the meantime, the server stops responding to ping and is unreachable from the network, but it is still running.

This is the event that appears after an hour:

"The server was unable to allocate from the system nonpaged pool because the pool was empty."

Later:

"The server {73E709EA-5D93-4B2E-BBB0-99B7938DA9E4} did not register with DCOM within the required timeout."

The server was unable to allocate from the system nonpaged pool because the pool was empty.
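
The last two events describe nonpaged pool exhaustion, which is often the result of a kernel-mode leak (for example in a driver) rather than of the job that happens to be running when the pool finally runs out. One way to check this before the next hang is to log the `\Memory\Pool Nonpaged Bytes` performance counter and see whether it only ever grows. The following is a minimal Python sketch, assuming the counter was logged to CSV (for instance with `typeperf "\Memory\Pool Nonpaged Bytes" -si 60 -o pool.csv`). The function names, the sample values and the growth threshold are illustrative assumptions and do not come from the affected servers.

```python
import csv
import io


def read_pool_samples(csv_text):
    """Return nonpaged pool sizes in bytes from typeperf-style CSV text."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the header row that names the counters
    samples = []
    for row in rows:
        if len(row) < 2:
            continue
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue  # skip rows whose value column is blank or not numeric
    return samples


def looks_like_leak(samples, min_growth_ratio=1.5):
    """Heuristic: the pool never shrinks noticeably and ends well above its start."""
    if len(samples) < 3:
        return False
    # Count intervals in which the pool shrank by more than 1%.
    drops = sum(1 for a, b in zip(samples, samples[1:]) if b < a * 0.99)
    grew = samples[-1] >= samples[0] * min_growth_ratio
    return grew and drops == 0


# Made-up sample data, for illustration only.
SAMPLE = r'''"(PDH-CSV 4.0)","\\SERVER1\Memory\Pool Nonpaged Bytes"
"02/21/2008 22:00:00.000","52428800.000000"
"02/21/2008 23:00:00.000","61000000.000000"
"02/22/2008 00:00:00.000","73000000.000000"
"02/22/2008 01:00:00.000","90000000.000000"
'''

if __name__ == "__main__":
    print(looks_like_leak(read_pool_samples(SAMPLE)))
```

If the counter climbs steadily outside the backup window as well, the backup is probably only the trigger and not the cause.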

Guest Meinolf Weber
Posted

Re: Windows servers stops responding

 

Hello bisi,

 

Did you run dcdiag /v and netdiag /v to check for errors on all machines? I assume they are domain controllers. If you find errors, please post the complete output here.

 

Best regards

 

Meinolf Weber

Disclaimer: This posting is provided "AS IS" with no warranties, and confers

no rights.

** Please do NOT email, only reply to Newsgroups

** HELP us help YOU!!! http://www.blakjak.demon.co.uk/mul_crss.htm


Guest paulreims@gmail.com
Posted

Re: Windows servers stops responding

 

On 21 Feb, 15:27, Meinolf Weber <meiweb(nospam)@gmx.de> wrote:

> Hello bisi,

>

> Did you run dcdiag /v and netdiag /v to check for errors on all  machines,

> i assume that are domain controllers? If you have errors, please post the

> complete output here.


 

Hello,

The machines are not domain controllers. One machine mainly runs SQL Server for a few applications, the second hosts XAP, and the third also hosts some applications.

netdiag did not show any errors; everything looked OK to me. I could try running netdiag the next time a server has the problem, but since the network connection stops working once the problem appears, I don't know whether I would get any useful information.

 

Any other ideas?

Best regards

CB

Guest Danny Sanders
Posted

Re: Windows servers stops responding

 

What model of server are we talking about? We have about six servers (HP ML 380s, I think) that display the same symptoms. We've more or less narrowed it down to the NIC.

 

hth

DDS

Guest paulreims@gmail.com
Posted

Re: Windows servers stops responding

 

On 21 Feb, 19:20, "Danny Sanders" <DSand...@NOSPAMciber.com> wrote:

> What model server are we talking about? We have about 6 HP ML 380s I think

> they are that display the same symptoms. We've kind of narrowed it down to

> the NIC.


 

Hello,

The servers are all HP DL servers with the "HP NC7781 Gigabit Server Adapter":

- The first one has driver version 7.103.0.0 and does NOT do network teaming.
- The second one uses driver version 6.64.0.0 on both cards and does network teaming. The teaming driver version is 7.41.
- I do not have access to the third server at the moment.

Does this confirm your theory about the network card?

 

Best regards

CB

Guest Danny Sanders
Posted

Re: Windows servers stops responding

 

HP is now owning up to the problem and says a new driver is coming out to fix it. For now they say to disable the TCP Offload Engine (TOE). It turns out that having it enabled causes a memory leak.
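
On Windows Server 2003 SP2 (or with the Scalable Networking Pack installed), the operating system side of this feature is TCP Chimney Offload, which can be switched off with `netsh int ip set chimney DISABLED`. Whether this is exactly the setting HP has in mind is an assumption; the adapter's advanced properties may expose a separate offload switch. A minimal Python sketch that wraps the command, with a dry-run default so that nothing changes until it is run deliberately on the server (the function name is hypothetical):

```python
import subprocess

# netsh command documented by Microsoft for Windows Server 2003 SP2 and the
# Scalable Networking Pack; assumed here to be the relevant TOE switch.
CHIMNEY_DISABLE = ["netsh", "int", "ip", "set", "chimney", "DISABLED"]


def disable_tcp_chimney(dry_run=True):
    """Disable TCP Chimney Offload, or only return the command when dry_run is set."""
    if dry_run:
        return " ".join(CHIMNEY_DISABLE)
    # Needs administrative rights on the affected server.
    output = subprocess.check_output(CHIMNEY_DISABLE, universal_newlines=True)
    return output.strip()


if __name__ == "__main__":
    print(disable_tcp_chimney())
```

Calling `disable_tcp_chimney(dry_run=False)` actually runs the command, so it is worth trying on one server first and watching whether the nonpaged pool stops growing.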

 

 

hth

DDS