
CRITICAL - Random BSODs on boot



Guest Stephen
Posted

I have a critical issue - I have a Windows 2003 Server that is experiencing random BSODs on boot. Each time, the machine boots up all the way - and if I'm quick, I can log in. But within seconds or minutes, I get a random BSOD. So far, I've seen:

 

Stop 8E (klif.sys)

Stop 50, Page_fault_in_nonpaged_area (tcpip.sys)

Stop 50, Page_fault_in_nonpaged_area

Stop 0A, IRQL_not_less_or_equal

Stop 7C, Bugcode_NDIS_driver
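
For reference, the stop codes listed here, plus the C2 and D1 codes that come up later in the thread, can be mapped to their symbolic names with a small lookup. This is only a sketch: the table covers just the codes seen in this thread, and the helper name `stop_name` is made up for illustration, not part of any Windows tooling.

```python
# Stop codes seen in this thread, mapped to their symbolic bugcheck names.
STOP_CODES = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x7C: "BUGCODE_NDIS_DRIVER",
    0x8E: "KERNEL_MODE_EXCEPTION_NOT_HANDLED",
    0xC2: "BAD_POOL_CALLER",
    0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}


def stop_name(text):
    """Return the symbolic name for inputs like 'Stop 8E' or '0x000000C2'.

    Hypothetical helper: only the last whitespace-separated token is read,
    and it is interpreted as a hexadecimal number.
    """
    token = text.strip().split()[-1]
    code = int(token, 16)  # accepts both '8E' and '0x000000C2'
    return STOP_CODES.get(code, "UNKNOWN_0x%X" % code)


if __name__ == "__main__":
    for seen in ("Stop 8E", "Stop 50", "Stop 0A", "Stop 7C", "Stop 0x000000C2"):
        print(seen, "->", stop_name(seen))
```

Codes outside the table come back as `UNKNOWN_0x..` rather than raising, so the helper can be run over a mixed list of stop screens.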

 

 

Booting in safe mode doesn't help. Nor does booting with the Internet network cable disconnected.

 

The server is perhaps a month behind on the Windows Update patches - so I suppose if one of those addressed a vulnerability that would cause this, that might at least provide an easy solution. (If I can keep the server up long enough to install them.)

 

 

Anyone familiar with this set of symptoms?


Guest Stephen
Posted

Re: CRITICAL - Random BSODs on boot

 

 

Update:

 

Booting in Safe Mode *does* work - but booting in "Safe Mode with

Networking" yields the BSOD.

 

To me, that makes it more likely that this is an external attack -

although, I'm not sure how, if it happens when the WAN cable is

disconnected.

 

 

 

Stephen

Guest Stephen
Posted

Re: CRITICAL - Random BSODs on boot

 

 

As a further update, I was also able to get the machine to boot by

disconnecting the WAN cable and doing a Diagnostic Boot. Working from

there, I was eventually able to get it to boot with network access and stay

up long enough for me to pull off a Windows Update. Somewhere along the

way, I cleared out the temp directory as well, and as a final result, I've

been able to boot normally and have not experienced a BSOD yet.

 

I don't know how much of this is coincidence, or if it was the direct

result of what I did, but as long as it stays solved - that's what matters.

Guest Stephen
Posted

Re: CRITICAL - Random BSODs on boot

 

 

I was premature. The server stayed up for a couple of hours, and then

went down again under another BSOD:

 

BAD_POOL_CALLER

Stop 0x000000C2

Guest Jabez Gan [MVP]
Posted

Re: CRITICAL - Random BSODs on boot

 

Hey Stephen,

 

Please try:

http://aumha.org/a/stop.php#0xc2

 

Seems like your network card driver could be causing the issue...?

 

--

Jabez Gan

Microsoft MVP: Windows Server

http://www.msblog.org

 

 

"Stephen" <unavailable@notvalid.com> wrote in message

news:CdednYUTBe7XkGXanZ2dnUVZ_s6dnZ2d@giganews.com...

>

> I was premature. The server stayed up for a couple of hours, and then

> went down again under another BSOD:

>

> BAD_POOL_CALLER

> Stop 0x000000C2

>

>

Guest Edwin vMierlo [MVP]
Posted

Re: CRITICAL - Random BSODs on boot

 

>(If I can keep the server up

> long enough to install it)

>

 

I agree with Jabez Gan; it seems that you have a network driver issue.

 

Try starting in Safe Mode without Networking, which should not start your network drivers.

 

That does mean you will need to download drivers and patches on another machine and then transfer them "offline" to the problem host.

 

rgds,

Edwin.

Guest Stephen
Posted

Re: CRITICAL - Random BSODs on boot

 

 

"Jabez Gan [MVP]" <mingteikg@blizNOSPAMhosting.com> wrote in message

news:76E59ADC-11FD-48C7-A0E7-F52634CD9036@microsoft.com...

> Hey Stephen,

>

> Please try:

> http://aumha.org/a/stop.php#0xc2

>

> Seems like your network card driver could be causing the issue...?

>

> --

> Jabez Gan

 

 

If this were the exact same BSOD every time, that might seem more likely, but I'm getting such a wide variety of them - are you sure it's the network card driver? To clarify, these drivers have been in place for a month (the server is a new build) with no problems. Nothing new has been installed recently, and there have been no modifications. I find it hard to believe that drivers that have been working fine for a month would suddenly start causing a catastrophe like this.

 

After looking into this further, I started to think that the issue might be caused by Kaspersky AV, because 7 of the first 10 minidumps I analyzed pointed to klif.sys, which is one of its files. Snippets from each are below:

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 01:

 

 

PAGE_FAULT_IN_NONPAGED_AREA (50)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

BUGCHECK_STR: 0x50

 

PROCESS_NAME: dfssvc.exe

 

FOLLOWUP_IP:

klif+1e1ff

b9a6c1ff ?? ???

 

SYMBOL_NAME: klif+1e1ff

 

MODULE_NAME: klif

 

IMAGE_NAME: klif.sys

 

FAILURE_BUCKET_ID: 0x50_klif+1e1ff

 

BUCKET_ID: 0x50_klif+1e1ff

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 02:

 

 

KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)

 

FAULTING_IP:

klif+e6e6

b9a926e6 8b4e0c mov ecx,dword ptr [esi+0Ch]

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

STACK_TEXT:

WARNING: Stack unwind information not available. Following frames may be wrong.

b971ebc8 b9a92761 00000042 00000548 b9aa1427 klif+0xe6e6

b971ebcc 00000000 00000548 b9aa1427 00000548 klif+0xe761

 

FOLLOWUP_IP:

klif+e6e6

b9a926e6 8b4e0c mov ecx,dword ptr [esi+0Ch]

 

SYMBOL_NAME: klif+e6e6

 

MODULE_NAME: klif

 

IMAGE_NAME: klif.sys

 

FAILURE_BUCKET_ID: 0x8E_klif+e6e6

 

BUCKET_ID: 0x8E_klif+e6e6

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 03:

 

 

PAGE_FAULT_IN_NONPAGED_AREA (50)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

PROCESS_NAME: ntfrs.exe

 

FOLLOWUP_IP:

klif+1e1ff

b9a9c1ff ?? ???

 

SYMBOL_NAME: klif+1e1ff

 

MODULE_NAME: klif

 

IMAGE_NAME: klif.sys

 

FAILURE_BUCKET_ID: 0x50_klif+1e1ff

 

BUCKET_ID: 0x50_klif+1e1ff

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 04:

 

 

IRQL_NOT_LESS_OR_EQUAL (a)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

PROCESS_NAME: update.exe

 

FOLLOWUP_IP:

klif+1eed5

b9a6bed5 ?? ???

 

SYMBOL_NAME: klif+1eed5

 

MODULE_NAME: klif

 

IMAGE_NAME: klif.sys

 

FAILURE_BUCKET_ID: 0xA_klif+1eed5

 

BUCKET_ID: 0xA_klif+1eed5

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 05:

 

 

BUGCODE_NDIS_DRIVER (7c)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

PROCESS_NAME: Idle

 

SYMBOL_NAME: e1e5132+21f8

 

MODULE_NAME: e1e5132

 

IMAGE_NAME: e1e5132.sys

 

FAILURE_BUCKET_ID: 0x7C_e1e5132+21f8

 

BUCKET_ID: 0x7C_e1e5132+21f8

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 06:

 

 

PAGE_FAULT_IN_NONPAGED_AREA (50)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

MODULE_NAME: afd

 

IMAGE_NAME: afd.sys

 

DEBUG_FLR_IMAGE_TIMESTAMP: 45d6a080

 

FAILURE_BUCKET_ID: 0x50_afd!AfdFreePollInfo+24

 

BUCKET_ID: 0x50_afd!AfdFreePollInfo+24

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 07:

 

 

PAGE_FAULT_IN_NONPAGED_AREA (50)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

PROCESS_NAME: lsass.exe

 

FOLLOWUP_IP:

klif+1e1ff

b9a621ff ?? ???

 

SYMBOL_NAME: klif+1e1ff

 

MODULE_NAME: klif

 

IMAGE_NAME: klif.sys

 

FAILURE_BUCKET_ID: 0x50_klif+1e1ff

 

BUCKET_ID: 0x50_klif+1e1ff

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 08:

 

 

BAD_POOL_CALLER (c2)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

FOLLOWUP_IP:

klif+1e1ff

b9a031ff ?? ???

 

SYMBOL_NAME: klif+1e1ff

 

MODULE_NAME: klif

 

IMAGE_NAME: klif.sys

 

FAILURE_BUCKET_ID: 0xc2_7_Proc_klif+1e1ff

 

BUCKET_ID: 0xc2_7_Proc_klif+1e1ff

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 09:

 

 

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

PROCESS_NAME: System

 

MODULE_NAME: netbt

 

IMAGE_NAME: netbt.sys

 

FAILURE_BUCKET_ID: 0xD1_W_netbt!NbtDereferenceLowerConnection+39

 

BUCKET_ID: 0xD1_W_netbt!NbtDereferenceLowerConnection+39

 

-------------------------------------------------------------

Minidump - 2008-04-05 - 10:

 

 

PAGE_FAULT_IN_NONPAGED_AREA (50)

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

PROCESS_NAME: System

 

FOLLOWUP_IP:

klif+12cf1

b9aafcf1 ?? ???

 

SYMBOL_NAME: klif+12cf1

 

MODULE_NAME: klif

 

IMAGE_NAME: klif.sys

 

FAILURE_BUCKET_ID: 0x50_klif+12cf1

 

BUCKET_ID: 0x50_klif+12cf1

 

-------------------------------------------------------------
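
The "7 of 10" count can be checked mechanically by pulling the IMAGE_NAME line out of each `!analyze -v` summary and tallying the results. The sketch below assumes the summaries have been saved as plain text; the `count_images` helper and the `SAMPLE` string are illustrative, with the image names transcribed from the ten dumps above.

```python
import re
from collections import Counter

# Matches lines such as "IMAGE_NAME: klif.sys" in saved !analyze -v output.
IMAGE_RE = re.compile(r"^IMAGE_NAME:\s*(\S+)", re.MULTILINE)


def count_images(analysis_text):
    """Tally how often each driver image is blamed in the given text."""
    return Counter(name.lower() for name in IMAGE_RE.findall(analysis_text))


# IMAGE_NAME values of minidumps 01-10, in order, as posted above.
IMAGES = [
    "klif.sys", "klif.sys", "klif.sys", "klif.sys", "e1e5132.sys",
    "afd.sys", "klif.sys", "klif.sys", "netbt.sys", "klif.sys",
]
SAMPLE = "\n".join("IMAGE_NAME: " + name for name in IMAGES)

if __name__ == "__main__":
    for image, hits in count_images(SAMPLE).most_common():
        print("%-12s %d" % (image, hits))
```

A tally like this only shows which module the debugger blamed. With pool corruption, the blamed module may be a victim rather than the culprit.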

 

 

 

 

However, I uninstalled Kaspersky a little after 4am today, and the server has crashed another four times since then. Two of those crashes were either severe enough or quick enough that no minidump was generated. The results of the two remaining minidumps are listed below:

 

 

 

 

-----------------------------------------------------------------------------------------------

Minidump - 2008-04-06 - 08:

 

 

Mini Kernel Dump File: Only registers and stack trace are available

 

Symbol search path is: C:\WINDOWS\Symbols\Windows Server 2003 SP2 Retail\;C:\WINDOWS\Symbols\Windows XP SP2 Retail\

Executable search path is:

Unable to load image \WINDOWS\system32\ntkrnlpa.exe, Win32 error 0n2

*** WARNING: Unable to verify timestamp for ntkrnlpa.exe

Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (4 procs) Free x86 compatible

Product: LanManNt, suite: Enterprise TerminalServer SingleUserTS

Kernel base = 0x80800000 PsLoadedModuleList = 0x808a6ea8

Debug session time: Sun Apr 6 06:07:36.315 2008 (GMT-4)

System Uptime: 0 days 1:47:06.707

Unable to load image \WINDOWS\system32\ntkrnlpa.exe, Win32 error 0n2

*** WARNING: Unable to verify timestamp for ntkrnlpa.exe

Loading Kernel Symbols

................................................................................................................

Loading User Symbols

Loading unloaded module list

...................

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

Use !analyze -v to get detailed debugging information.

 

BugCheck 50, {81426c7c, 0, 8081c5f7, 0}

 

 

Could not read faulting driver name

 

 

Probably caused by : ntkrnlpa.exe ( nt!IoStartPacket+65 )

 

Followup: MachineOwner

---------

 

0: kd> !analyze -v

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

PAGE_FAULT_IN_NONPAGED_AREA (50)

Invalid system memory was referenced. This cannot be protected by try-except,

it must be protected by a Probe. Typically the address is just plain bad or it

is pointing at freed memory.

Arguments:

Arg1: 81426c7c, memory referenced.

Arg2: 00000000, value 0 = read operation, 1 = write operation.

Arg3: 8081c5f7, If non-zero, the instruction address which referenced the bad memory address.

Arg4: 00000000, (reserved)

 

Debugging Details:

------------------

 

 

Could not read faulting driver name

 

 

 

READ_ADDRESS: 81426c7c

 

FAULTING_IP:

nt!IoStartPacket+65

8081c5f7 ?? ???

 

MM_INTERNAL_CODE: 0

 

CUSTOMER_CRASH_COUNT: 8

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

BUGCHECK_STR: 0x50

 

PROCESS_NAME: svchost.exe

 

CURRENT_IRQL: 1

 

LAST_CONTROL_TRANSFER: from 8085eced to 80827c63

 

STACK_TEXT:

b9072b70 8085eced 00000050 81426c7c 00000000 nt!KeDelayExecutionThread+0x99

b9072be8 8088c798 00000000 81426c7c 00000000 nt!MiQueryAddressState+0x29d

b9072c00 8081c5f7 badb0d00 00000000 b9072c20 nt!ExFreePoolWithTag+0x462

b9072c7c 808f5d84 890bc830 b9072d64 0075fe38 nt!IoStartPacket+0x65

b9072d00 808eed08 000001d0 000001c8 00000000 nt!IopGetDeviceInterfaces+0x170

b9072d34 8088978c 000001d0 000001c8 00000000 nt!IopLoadDriver+0x634

b9072d64 7c8285ec badb0d00 0075fe10 00000000 nt!MiReserveSystemPtes+0x1ca

WARNING: Frame IP not in any known module. Following frames may be wrong.

b9072d68 badb0d00 0075fe10 00000000 00000000 0x7c8285ec

b9072d6c 0075fe10 00000000 00000000 00000000 0xbadb0d00

b9072d70 00000000 00000000 00000000 00000000 0x75fe10

 

 

STACK_COMMAND: kb

 

FOLLOWUP_IP:

nt!IoStartPacket+65

8081c5f7 ?? ???

 

SYMBOL_STACK_INDEX: 3

 

SYMBOL_NAME: nt!IoStartPacket+65

 

FOLLOWUP_NAME: MachineOwner

 

MODULE_NAME: nt

 

IMAGE_NAME: ntkrnlpa.exe

 

DEBUG_FLR_IMAGE_TIMESTAMP: 45ec0a19

 

FAILURE_BUCKET_ID: 0x50_nt!IoStartPacket+65

 

BUCKET_ID: 0x50_nt!IoStartPacket+65

 

Followup: MachineOwner

---------

 

 

-----------------------------------------------------------------------------------------------

Minidump - 2008-04-06 - 09:

 

 

 

Mini Kernel Dump File: Only registers and stack trace are available

 

Symbol search path is: C:\WINDOWS\Symbols\Windows Server 2003 SP2 Retail\;C:\WINDOWS\Symbols\Windows XP SP2 Retail\

Executable search path is:

Unable to load image \WINDOWS\system32\ntkrnlpa.exe, Win32 error 0n2

*** WARNING: Unable to verify timestamp for ntkrnlpa.exe

Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (4 procs) Free x86 compatible

Product: LanManNt, suite: Enterprise TerminalServer SingleUserTS

Kernel base = 0x80800000 PsLoadedModuleList = 0x808a6ea8

Debug session time: Sun Apr 6 10:02:47.859 2008 (GMT-4)

System Uptime: 0 days 0:01:28.468

Unable to load image \WINDOWS\system32\ntkrnlpa.exe, Win32 error 0n2

*** WARNING: Unable to verify timestamp for ntkrnlpa.exe

Loading Kernel Symbols

.............................................................................................................

Loading User Symbols

Loading unloaded module list

......

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

Use !analyze -v to get detailed debugging information.

 

BugCheck A, {812bc040, d0000002, 0, 8083f149}

 

*** WARNING: Unable to verify timestamp for afd.sys

 

 

Probably caused by : afd.sys ( afd!AfdCleanupReceiveDatagramIrp+42 )

 

Followup: MachineOwner

---------

 

2: kd> !analyze -v

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

IRQL_NOT_LESS_OR_EQUAL (a)

An attempt was made to access a pageable (or completely invalid) address at an

interrupt request level (IRQL) that is too high. This is usually

caused by drivers using improper addresses.

If a kernel debugger is available get the stack backtrace.

Arguments:

Arg1: 812bc040, memory referenced

Arg2: d0000002, IRQL

Arg3: 00000000, bitfield :

bit 0 : value 0 = read operation, 1 = write operation

bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)

Arg4: 8083f149, address which referenced memory

 

Debugging Details:

------------------

 

 

 

 

READ_ADDRESS: 812bc040

 

CURRENT_IRQL: 2

 

FAULTING_IP:

nt!MiCleanSection+861

8083f149 ?? ???

 

CUSTOMER_CRASH_COUNT: 9

 

DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP

 

BUGCHECK_STR: 0xA

 

PROCESS_NAME: lsass.exe

 

LAST_CONTROL_TRANSFER: from 8083f149 to 8088c963

 

STACK_TEXT:

b6b68b34 8083f149 badb0d00 892bb970 89056a98 nt!ExFreePoolWithTag+0x62d

b6b68bdc b9afd661 812bc038 890813a0 892a09a0 nt!MiCleanSection+0x861

b6b68bf8 b9af4911 890813a0 c0000120 892a0978 afd!AfdCleanupReceiveDatagramIrp+0x42

b6b68c1c b9af1d1f 892a09a0 892a0994 c0000120 afd!AfdCompleteIrpList+0x4c

b6b68c58 b9aee79b 8a102280 89424f38 b6b68c7c afd!AfdCleanup+0x98

b6b68c68 8081df65 89424e20 8a156008 8a156008 afd!AfdDispatch+0xe0

b6b68c7c 808f9732 8a102268 8a390560 8a102280 nt!IoCsqInitialize+0x31

b6b68cac 80934bac 8a0af860 89424e20 0016019f nt!IoReportResourceForDetection+0x176

b6b68cdc 809344ad 8a0af860 00000001 8a390560 nt!NtQueryObject+0x14c

b6b68d04 80934546 e21fe5d8 8a102280 00000778 nt!ObpQueryNameString+0x43f

b6b68d48 80934663 00000778 00000001 b6b68d64 nt!ObpQueryNameString+0x4d8

b6b68d58 8088978c 00000778 0006fd6c 7c8285ec nt!ObpQueryNameString+0x5f5

b6b68d64 7c8285ec badb0d00 0006fd08 00000000 nt!MiReserveSystemPtes+0x1ca

WARNING: Frame IP not in any known module. Following frames may be wrong.

b6b68d68 badb0d00 0006fd08 00000000 00000000 0x7c8285ec

b6b68d6c 0006fd08 00000000 00000000 00000000 0xbadb0d00

b6b68d70 00000000 00000000 00000000 00000000 0x6fd08

 

 

STACK_COMMAND: kb

 

FOLLOWUP_IP:

afd!AfdCleanupReceiveDatagramIrp+42

b9afd661 ?? ???

 

SYMBOL_STACK_INDEX: 2

 

SYMBOL_NAME: afd!AfdCleanupReceiveDatagramIrp+42

 

FOLLOWUP_NAME: MachineOwner

 

MODULE_NAME: afd

 

IMAGE_NAME: afd.sys

 

DEBUG_FLR_IMAGE_TIMESTAMP: 45d6a080

 

FAILURE_BUCKET_ID: 0xA_afd!AfdCleanupReceiveDatagramIrp+42

 

BUCKET_ID: 0xA_afd!AfdCleanupReceiveDatagramIrp+42

 

Followup: MachineOwner

---------

 

-----------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------
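
To compare the two post-uninstall dumps side by side, the one-line `BugCheck ...` summaries can be split into a code and four arguments. The sketch below reads the read/write flag the way the `!analyze -v` text above describes it (Arg2 for stop 50, bit 0 of Arg3 for stop A). The function names are illustrative, and other bugcheck codes are deliberately left undecoded.

```python
import re

# Matches summaries such as "BugCheck 50, {81426c7c, 0, 8081c5f7, 0}".
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")


def parse_bugcheck(line):
    """Return (code, [arg1, arg2, arg3, arg4]) parsed as hexadecimal ints."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        raise ValueError("no BugCheck summary found")
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


def access_type(code, args):
    """Read/write flag per the argument descriptions quoted in the dumps."""
    if code == 0x50:  # Arg2: 0 = read operation, 1 = write operation
        return "write" if args[1] == 1 else "read"
    if code == 0x0A:  # Arg3 bit 0: 0 = read operation, 1 = write operation
        return "write" if args[2] & 1 else "read"
    return None  # not decoded in this sketch


if __name__ == "__main__":
    for summary in (
        "BugCheck 50, {81426c7c, 0, 8081c5f7, 0}",
        "BugCheck A, {812bc040, d0000002, 0, 8083f149}",
    ):
        code, args = parse_bugcheck(summary)
        print("0x%X" % code, ["%08x" % a for a in args], access_type(code, args))
```

Both summaries decode as reads, which agrees with the READ_ADDRESS lines in the two dumps.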

 

 

So unless Kaspersky left a part of itself behind that is causing this, the

problem lies elsewhere. When you look at the information above, do you

still think that the issue is the NIC driver?

 

 

 

 

 

Steve

