Catastrophic Corruption of Dynamic Disks


Posted

I had a really disappointing event take place today with a critical system

that runs Windows 2003 32-bit. Effectively *all* of the dynamic disk

structures were corrupted, even though we were not working on them at the

time of the reboot. Upon reboot the system gives some brief message about

the disk being corrupt, and the Windows 2003 boot sequence never starts.

Looking at all of the drives inside the Disk Management utility in

Microsoft's ERD Commander Boot CD, the Dynamic volumes show in the state

"Offline". A Google search suggests that Dynamic volumes in an offline

state normally mean the Dynamic volume information is corrupt and

cannot be loaded.

 

What is particularly horrifying to me is we have two separate hardware RAID

controllers with three and five volumes respectively, and then we had

mirrored drives across those different controllers using Windows 2003

mirroring. When we rebooted, the corruption of the Dynamic volume

information resulted in ALL EIGHT drives effectively disappearing and going

"offline". So while having every Dynamic disk carry information about the

other volumes on the system has its advantages, the downside of this

design now becomes very clear to me. If you hit on any

bug that writes the Dynamic volume information incorrectly, you are going to

lose EVERYTHING on that system that is Dynamic!

 

We are in the process of recovering by hand-editing the on-disk structures

to convert the Dynamic disks back to Basic disks, and so far that seems to

be going in the right direction.

 

Can someone explain to me under what scenarios this kind of dramatic

corruption of Dynamic volume structures can take place? If I have a loose

end in my hardware, I need to know what possibilities to chase. I didn't

have a good backup of this system since it was just being built up, and

losing it would have been a complete catastrophe.

 

--

Will

Posted

Re: Catastrophic Corruption of Dynamic Disks

 

In attempting to recover the boot device from our failure of dynamic disks,

we did these steps:

 

1) Converted the boot disk from Dynamic to Basic using DSKPROBE from

inside ERD Commander 2005.

2) Made the Basic partition Active, probably from Disk Management in ERD

Commander 2005.

3) Ran chkdsk /r from the Windows 2003 recovery console.

4) Ran Fixboot from recovery console

5) Ran FixMBR from recovery console
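
For readers following step 1: on an MBR disk, a dynamic disk is marked by partition type byte 0x42 in the partition table, and the commonly described DSKPROBE conversion changes that byte to 0x07 (NTFS on a basic disk). The sketch below is only an illustration, assuming a 512-byte dump of sector 0 saved by some tool. The function name and the sample sector are hypothetical, and it only reads the table; it changes nothing.

```python
import struct

# Partition type bytes relevant to this thread (many other values exist).
TYPE_NAMES = {0x00: "empty", 0x07: "NTFS/basic", 0x42: "LDM dynamic"}


def read_partition_table(sector):
    """Return the four primary partition entries of a 512-byte MBR sector."""
    if len(sector) != 512:
        raise ValueError("expected exactly one 512-byte sector")
    entries = []
    for i in range(4):
        raw = sector[446 + 16 * i: 446 + 16 * (i + 1)]
        # Layout: boot flag, CHS start (3 bytes), type byte,
        # CHS end (3 bytes), starting LBA, sector count.
        flag, _chs_start, ptype, _chs_end, lba, count = struct.unpack(
            "<B3sB3sII", raw
        )
        entries.append({
            "index": i,
            "active": flag == 0x80,
            "type": ptype,
            "type_name": TYPE_NAMES.get(ptype, "other"),
            "lba_start": lba,
            "sectors": count,
        })
    return entries


# Hypothetical sample: a single active dynamic-disk entry starting at LBA 63.
sample = bytearray(512)
sample[446:462] = struct.pack(
    "<B3sB3sII", 0x80, b"\x01\x01\x00", 0x42, b"\xfe\xff\xff", 63, 1000000
)
sample[510:512] = b"\x55\xaa"

for entry in read_partition_table(bytes(sample)):
    print(entry)
```

An entry that still reports "LDM dynamic" after the conversion would suggest the type byte was not actually changed on the disk the BIOS boots from.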

 

In spite of all of these steps, any attempt to boot from the system

volume gets:

 

"Error Loading Operating System"

 

I normally associate that message with a BIOS configuration problem or a

hard drive cylinder mapping issue. I am not finding the problem in our

case. Can someone give me more detail about this error and how to overcome

it?

 

Would an incorrect disk number inside of BOOT.INI ever cause this error?
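
For context on the BOOT.INI question: entries in BOOT.INI use ARC paths such as multi(0)disk(0)rdisk(0)partition(1)\WINDOWS, where rdisk is the disk ordinal and partition is counted from 1. As far as I know, "Error Loading Operating System" is printed by the MBR code, which runs before BOOT.INI is read, so a wrong disk number would more likely show up later as an NTLDR error. A minimal sketch, assuming the common multi()/scsi() syntax only, that splits an ARC path into its fields for checking; the function name and the example path are hypothetical.

```python
import re

# Matches the common multi()/scsi() ARC syntax; other forms are not handled.
ARC_RE = re.compile(
    r"^(?P<adapter_type>multi|scsi)\((?P<adapter>\d+)\)"
    r"disk\((?P<disk>\d+)\)rdisk\((?P<rdisk>\d+)\)"
    r"partition\((?P<partition>\d+)\)(?P<path>\\.*)?$",
    re.IGNORECASE,
)


def parse_arc_path(arc):
    """Split an ARC path into adapter, disk, rdisk, partition and directory."""
    match = ARC_RE.match(arc.strip())
    if match is None:
        raise ValueError("not a recognised ARC path: %r" % arc)
    fields = match.groupdict()
    for key in ("adapter", "disk", "rdisk", "partition"):
        fields[key] = int(fields[key])
    fields["path"] = fields["path"] or ""
    return fields


example = r"multi(0)disk(0)rdisk(0)partition(1)\WINDOWS"
print(parse_arc_path(example))
```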

 

--

Will

Guest Pegasus (MVP)
Posted

Re: Catastrophic Corruption of Dynamic Disks

 

Try to separate the boot process from the Windows startup process,

by booting the machine with a Windows boot diskette. Format a

floppy disk on some Windows 2000/XP machine, then copy these

files to it:

- ntldr

- ntdetect.com

- boot.ini

Posted

Re: Catastrophic Corruption of Dynamic Disks

 

"Pegasus (MVP)" <I.can@fly.com.oz> wrote in message

news:O3Ab4dubIHA.1168@TK2MSFTNGP02.phx.gbl...

> Try to separate the boot process from the Windows startup process,

> by booting the machine with a Windows boot diskette. Format a

 

I AM able to boot the system using a Windows boot diskette. So what does

that suggest, and what further correction is required to boot without the

floppy?

 

--

Will

Guest Pegasus (MVP)
Posted

Re: Catastrophic Corruption of Dynamic Disks

 

Congratulations! This proves that there is nothing wrong with

Windows and that the problem lies with the boot environment.

I would now launch diskmgmt.msc and make sure that the boot

partition is marked "active".

Guest Pegasus (MVP)
Posted

Re: Catastrophic Corruption of Dynamic Disks

 

 

"Will" <westes-usc@noemail.nospam> wrote in message

news:fMednVwUeIbpISnanZ2dnUVZ_rSrnZ2d@giganews.com...

> "Pegasus (MVP)" <I.can@fly.com.oz> wrote in message

> news:OPig3O1bIHA.6024@TK2MSFTNGP06.phx.gbl...

>> Congratulations! This proves that there is nothing wrong with

>> Windows and that the problem lies with the boot environment.

>> I would now launch diskmgmt.msc and make sure that the boot

>> partition is marked "active".

>

> Partition was definitely active, and I had checked for that per my

> procedure posted below.

>

> Other possible causes for the hardware to not be able to bootstrap?

> How can I investigate that at a hardware level?

>

> Microsoft should create some special version of a boot floppy (startup

> option?) that tells the user how the hardware looks for the BIOS and

> reports errors on possible mismatch between what it needs to see and

> what it actually sees.

>

> You could also report with such a utility the more obvious conditions,

> like a partition not marked Active.

>

> --

> Will

 

I suspect that the tools you used when converting your damaged

partition altered something that is essential for the boot-up process.

What it is I have no idea. I can now think of these options:

a) Keep booting off the FDD.

b) Make a bootable CD and keep booting off it.

c) Partition and format a spare disk on some ***other*** machine.

d) Boot your machine with a Bart PE boot CD. Now use robocopy.exe

to copy the old disk to the new disk, then test the new disk. Do NOT

boot the machine with both disks connected!
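
The robocopy step in option d) could be scripted roughly as follows. This is a sketch only: the drive letters are hypothetical, the flag set is one reasonable choice and not necessarily what was intended above, and it assumes robocopy.exe and Python are both available in whatever environment the copy runs from.

```python
import subprocess


def build_robocopy_command(source, destination):
    """Build a robocopy command line that copies a whole volume tree."""
    return [
        "robocopy", source, destination,
        "/E",        # copy subdirectories, including empty ones
        "/COPYALL",  # copy data, attributes, timestamps, ACLs, owner, auditing
        "/R:1",      # retry a failed file once
        "/W:1",      # wait one second between retries
    ]


def run_copy(source, destination):
    """Run robocopy and report success; exit codes of 8 or more mean failure."""
    result = subprocess.run(build_robocopy_command(source, destination))
    return result.returncode < 8


# Hypothetical drive letters: C: is the old disk, D: is the new one.
print(" ".join(build_robocopy_command("C:\\", "D:\\")))
```

Building the command separately from running it makes it easy to review the exact command line before anything is copied.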

 

Instead of performing the robocopy process under a Bart PE boot, you

could perform it while both disks are connected as slaves to some

other machine.

 

If the new disk works then you could format the system partition on

the old disk and restore its content from the new disk.

 

By the way, what's happened to the date & time on your posting

computer? A leap into the future?

Posted

Re: Catastrophic Corruption of Dynamic Disks

 

"Pegasus (MVP)" <I.can@fly.com.oz> wrote in message

news:OPig3O1bIHA.6024@TK2MSFTNGP06.phx.gbl...

> Congratulations! This proves that there is nothing wrong with

> Windows and that the problem lies with the boot environment.

> I would now launch diskmgmt.msc and make sure that the boot

> partition is marked "active".

 

Partition was definitely active, and I had checked for that per my procedure

posted below.

 

Other possible causes for the hardware to not be able to bootstrap? How

can I investigate that at a hardware level?

 

Microsoft should create some special version of a boot floppy (startup

option?) that tells the user how the hardware looks for the BIOS and reports

errors on possible mismatch between what it needs to see and what it

actually sees.

 

You could also report with such a utility the more obvious conditions, like

a partition not marked Active.
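
A check of the kind described above can be approximated offline. The sketch below is not an existing Microsoft tool; it is a hypothetical helper that takes a 512-byte dump of the MBR sector and reports a few obvious conditions: a missing 0x55AA signature, no active partition, or more than one active partition.

```python
def diagnose_mbr(sector):
    """Return a list of human-readable problems found in an MBR sector."""
    if len(sector) != 512:
        return ["sector is not 512 bytes long"]
    problems = []
    if sector[510:512] != b"\x55\xaa":
        problems.append("missing 0x55AA boot signature")
    # Each of the four entries is 16 bytes, starting at offset 446.
    flags = [sector[446 + 16 * i] for i in range(4)]
    types = [sector[446 + 16 * i + 4] for i in range(4)]
    active = [i for i, flag in enumerate(flags) if flag == 0x80]
    if not active:
        problems.append("no partition is marked active")
    elif len(active) > 1:
        problems.append("more than one partition is marked active")
    for i, flag in enumerate(flags):
        if flag not in (0x00, 0x80):
            problems.append("entry %d: invalid boot flag 0x%02x" % (i, flag))
    for i in active:
        if types[i] == 0x00:
            problems.append("entry %d: active entry has an empty type" % i)
    return problems


# Hypothetical samples: an all-zero sector and a minimal healthy one.
blank = bytes(512)
healthy = bytearray(512)
healthy[446] = 0x80       # boot flag of entry 0
healthy[446 + 4] = 0x07   # partition type of entry 0
healthy[510:512] = b"\x55\xaa"

print(diagnose_mbr(blank))
print(diagnose_mbr(bytes(healthy)))
```

An empty result only means these particular checks passed; it says nothing about the boot code itself or the geometry the BIOS reports.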

 

--

Will

Posted

Re: Catastrophic Corruption of Dynamic Disks

 

"Pegasus (MVP)" <I.can@fly.com.oz> wrote in message

news:Osxifn1bIHA.1204@TK2MSFTNGP03.phx.gbl...

>

> I suspect that the tools you used when converting your damaged

> partition, altered something that is essential for the boot-up process.

 

Probably this is not the case, because we were getting the "Error Loading

OS" boot time message even before we made any change to the Dynamic boot

volume.

 

> What it is I have no idea. I can now think of these options:

> a) Keep booting off the FDD.

> b) Make a bootable CD and keep booting off it.

> c) Partition and format a spare disk on some ***other*** machine.

> d) Boot your machine with a Bart PE boot CD. Now use robocopy.exe

> to copy the old disk to the new disk, then test the new disk. Do NOT

> boot the machine with both disks connected!

>

> Instead of performing the robocopy process under a Bart PE boot, you

> could perform it while both disks are connected as slaves to some

> other machine.

>

> If the new disk works then you could format the system partition on

> the old disk and restore its content from the new disk.

 

Sounds like fun :)

 

I'm still thinking that somehow the computer BIOS is not finding the drive

it wants or that the geometry somehow doesn't match what it wants to see.
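
To illustrate why a geometry mismatch could matter: the same logical sector translates to different cylinder/head/sector values under different assumed geometries, so CHS values written under one translation can point elsewhere under another. A small sketch with two commonly seen geometries, both chosen here purely as examples and not taken from this system:

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a logical block address to (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1  # CHS sectors are numbered from 1


# Same LBA, two assumed geometries (heads, sectors per track).
lba = 16065
for heads, spt in [(255, 63), (16, 63)]:
    print(heads, spt, lba_to_chs(lba, heads, spt))
```

Under 255 heads the sector above is cylinder 1, head 0, sector 1; under 16 heads it is cylinder 15, head 15, sector 1.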

 

> By the way, what's happened to the date & time on your posting

> computer. A leap into the future?

 

The posting time of 3:10p is approximately when I remember doing the post.

Was I an hour ahead of the clock? Sounds like a daylight saving time error

on the posting computer....

 

--

Will

Posted

Re: Catastrophic Corruption of Dynamic Disks

 

Not sure if this is part of our problem, but the controller reports a

different SCSI ID ordering of volumes than does the ERD Commander 2005 boot

environment. The boot volume is of course SCSI ID=0 as reported by the

controller. Inside of ERD Commander, it is reported as the highest SCSI

ID. In fact, all of our drives have an inverted SCSI ID ordering as seen

by ERD Commander. I'm not sure how that is even possible if Windows is

simply reporting its drive numbers as a reflection of the hardware drive ID

order.

 

What would such reordering of drives suggest, and is there a way to force a

reset of that mapping?

 

--

Will

Guest Edwin vMierlo [MVP]
Posted

Re: Catastrophic Corruption of Dynamic Disks

 

This is not solving your problem, but a little advice

 

The Dynamic disk implementation in Windows will write an LDM database header

to each and every dynamic disk on the same system. This means that if for

some reason this data (in a private region of the disk) gets corrupted, it

could potentially affect all dynamic disks on your system. Exactly as you

describe in your post.
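
Public descriptions of the LDM on-disk format (for example the Linux kernel's LDM partition driver) identify the private header by the ASCII string PRIVHEAD at the start of a sector. A minimal sketch, assuming a raw disk image and 512-byte sectors, that lists the sectors where that signature appears. The function name and the in-memory image are hypothetical, and the code only locates candidate headers; it does not validate or repair them.

```python
import io

SECTOR_SIZE = 512
PRIVHEAD_MAGIC = b"PRIVHEAD"


def find_privheads(stream, sector_size=SECTOR_SIZE):
    """Return the sector numbers whose first bytes are the PRIVHEAD magic."""
    hits = []
    index = 0
    while True:
        block = stream.read(sector_size)
        if not block:
            break
        if block.startswith(PRIVHEAD_MAGIC):
            hits.append(index)
        index += 1
    return hits


# Hypothetical 16-sector image with the signature planted at sectors 6 and 15.
image = bytearray(16 * SECTOR_SIZE)
for sector in (6, 15):
    start = sector * SECTOR_SIZE
    image[start:start + len(PRIVHEAD_MAGIC)] = PRIVHEAD_MAGIC

print(find_privheads(io.BytesIO(bytes(image))))
```

Run against images of several dynamic disks from the same system, a disk with no hits at all would be one candidate for the kind of damage described above.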

 

Here is my take on dynamic disks :

 

<opinion>

Practically, there should be only one reason to use dynamic disks, and

that is to create a spanned volume over 2TB. Even this reason has become

less pressing now that GPT disks support partition sizes over 2TB.

Any other reason would be frowned upon if you have hardware RAID, either

internally or in a SAN.

</opinion>
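
The 2TB figure follows from the MBR partition table, which stores sector counts in 32-bit fields. A quick check of the arithmetic, assuming the usual 512-byte sectors (the function name is just for illustration):

```python
def mbr_partition_limit(sector_bytes=512):
    """Largest partition size, in bytes, addressable by a 32-bit sector count."""
    max_sectors = 2 ** 32  # 32-bit sector count field in an MBR entry
    return max_sectors * sector_bytes


limit = mbr_partition_limit()
print(limit, "bytes =", limit // 2 ** 40, "TiB")
```

With 512-byte sectors this comes to 2,199,023,255,552 bytes, i.e. 2 TiB.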

 

There is an unfounded misconception about dynamic disk versus basic disk,

and that is "that it has a performance gain". No people, it has not, and

although "dynamic disk" sounds better than "basic disk" it has nothing to do

with performance.

 

So, if you are using dynamic disk, for the right reasons, for your "data"

disks, then ensure your boot and system partitions are on basic disk.

Microsoft even recommends this for SAN setups, boot/system/internal-disks on

basic when using dynamic on your SAN.

 

from this article http://support.microsoft.com/kb/816307 :

 

"If you decide to use dynamic disks and you have both locally attached

storage (IDE-based storage or Small Computer System Interface [SCSI]-based

storage) and storage that is located on a storage area network (SAN),

consider the following recommendations, depending on your situation:

 

- Use dynamic disks on only the SAN storage drives and keep the

locally attached storage as basic disks.

 

-or-

- Use basic disks on the SAN storage drives and configure the locally

attached storage as dynamic disks."

 

 

You can apply the same logic to a server with only internal disk

controllers, your data on dynamic, then your boot/system should be basic.

 

Again, please use dynamic disks for the right (technical) reasons,

HTH,

Edwin.

Posted

Re: Catastrophic Corruption of Dynamic Disks

 

Well, after we had this bad experience with Dynamic Disk corruption taking
out every single one of eight volumes on the system, I certainly understand
why you are not eager to use them for your own applications. But I have a
slightly different take on this. Dynamic disks have been the only way we
have found to make a consistent disk image of the boot volume that does not
corrupt or miss the registry files while they are in use by the OS.
Microsoft (and Veritas, since the technology is OEMed from them) have some
kind of very low-level notification mechanism that, when you "break" a
dynamic mirror, forces the OS to write out consistent versions of the
registry files. I have many times broken a mirror, removed the drive from
the system, and then later used that drive to immediately recover a crashed
boot volume. It's extremely reliable technology, and extremely good at
what it does.

 

After this experience losing all of our Dynamic volumes, I am keen on the

idea that we should take our dynamic disk backups, then find a way to

disconnect the storage in our script and power off the backed up drives.

That way if we corrupt the dynamic disk structures, the backups are offline

and intact and can be used for immediate recovery while the dynamic disks

are manually recovered.

 

Using Symantec Storage Foundation for Windows Basic (which is free), you get

some enhancements to Dynamic Disks, most notably the ability to script

mirroring and breaking off of drives. Our nightly backup of the host
that runs virtual machines looks something like this:

 

1) Resynchronize the dynamic volumes

2) Stop all virtual machines (to guarantee consistent disk images of the

virtual machines)

3) Break off the dynamic volume

4) Restart all VMs

5) Back up the "broken" mirror to tape

 

Step 3 only takes about 10 minutes for about 300 MB, so our virtual machines

only have about 10 minutes of downtime to ensure consistent images are

available for backup to tape later on.
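As a rough illustration of the ordering in steps 1 to 5, here is a sketch in Python. It is not our actual script: the five callables are placeholders for whatever commands your volume manager, VM host and tape software provide, and I am deliberately not showing any real command lines. The one design point it makes is that the VMs are restarted even if breaking the mirror fails, and that nothing goes to tape unless the break succeeded.

```python
from typing import Callable

Step = Callable[[], None]


def nightly_backup(resync: Step, stop_vms: Step, break_mirror: Step,
                   start_vms: Step, backup_to_tape: Step) -> None:
    """Run the nightly sequence; each argument is a hypothetical hook."""
    resync()              # 1) resynchronize the dynamic volumes
    stop_vms()            # 2) stop all VMs so their disk images are consistent
    try:
        break_mirror()    # 3) break off the dynamic volume
    finally:
        start_vms()       # 4) always bring the VMs back, even if step 3 failed
    backup_to_tape()      # 5) only reached when the break succeeded
```

If `break_mirror` raises, the exception still propagates after the VMs are restarted, so a scheduler or wrapper script can report the failed run.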

 

After this bad experience, I would like to modify the above with:

 

0) Turn on the drive array with the backup drives programmatically and wait

for it to signal ready

....

6) Flush the broken mirror to disk.

7) Remove the drive letter from the broken mirror.

8) Power off the drive array programmatically.
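Steps 0 and 8 could wrap the existing job roughly as below. Again this is a sketch under assumptions: `power_on`, `is_ready` and `power_off` stand in for a PDU or array interface I have not found yet, and `job` stands for steps 1 to 7. The array is only powered off when the job finishes cleanly; on any failure it is left on, since cutting power to a mirror that is mid-resync or not yet flushed seems like the wrong default.

```python
import time
from typing import Callable


def wait_until_ready(is_ready: Callable[[], bool], timeout_s: float = 300.0,
                     poll_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll is_ready() until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


def run_with_backup_array(power_on: Callable[[], None],
                          is_ready: Callable[[], bool],
                          job: Callable[[], None],
                          power_off: Callable[[], None],
                          **wait_options) -> None:
    """Power the array on, run the job, and power off only on success."""
    power_on()                                   # 0) turn on the drive array
    if not wait_until_ready(is_ready, **wait_options):
        raise TimeoutError("backup array did not signal ready")
    job()                                        # 1) to 7) the backup itself
    power_off()                                  # 8) reached only if job() succeeded
```

The `sleep` and `clock` parameters exist only so the polling loop can be tested without real delays.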

 

To do 0) and 8), I need to find a PDU that comes with a command line

interface that works under Windows. I don't suppose you know of one?

 

--

Will

