What am I doing wrong here?

Post by nimiak » Tue Sep 07, 2004 8:20 am

Hi everyone,

I'm using partimage to back up and restore a Red Hat Linux ES 3.0 system set up with software RAID-1 mirroring across 2 disks, each with 3 partitions. Here's what the disks look like:

Disk 1:
/dev/hda1 = /boot
/dev/hda2 = swap
/dev/hda3 = /

Disk 2:
/dev/hdb1 = /boot
/dev/hdb2 = swap
/dev/hdb3 = /

Raid devices:
/dev/md0 = /dev/hd[ab]1
/dev/md1 = /dev/hd[ab]2
/dev/md2 = /dev/hd[ab]3

Here's what I do to back up the system:

1) Boot up with Knoppix and use mdadm to assemble the 3 RAID devices (md[012]).
2) Use partimage to back up the RAID devices /dev/md0, /dev/md1, and /dev/md2 onto a network drive.
3) Back up the MBR of both drives using dd (i.e. "dd if=/dev/hd[ab] of=/mnt/share/hd[ab].mbr bs=512 count=1").
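
The hd[ab] notation in step 3 is presumably shorthand for running dd once per disk, since dd reads a single input file per invocation. A minimal sketch of that loop follows; the function name backup_mbrs and its argument order are made up for illustration, and the paths in the example are the ones from the post.

```shell
#!/bin/sh
# Sketch only: save the first 512-byte sector of each disk to its own file.
# Hypothetical arguments (not from the original post):
#   $1 = directory holding the disk devices (e.g. /dev)
#   $2 = destination directory (e.g. /mnt/share)
#   remaining arguments = disk names (e.g. hda hdb)
backup_mbrs() {
    src_dir=$1
    dest_dir=$2
    shift 2
    for disk in "$@"; do
        # bs=512 count=1 copies exactly one sector: boot code,
        # partition table, and boot signature together.
        dd if="$src_dir/$disk" of="$dest_dir/$disk.mbr" bs=512 count=1 2>/dev/null || return 1
    done
}

# Example (as root, from Knoppix):
#   backup_mbrs /dev /mnt/share hda hdb
```

Note that each resulting file contains the partition table as well as the boot code, which matters when the file is written back later.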

Here's what I do to restore the system:

1) Boot up with Knoppix.
2) Use fdisk to create the /boot, swap, and / partitions on both drives
(set the partition types to Linux raid autodetect).
3) Use mdadm to create and assemble the RAID devices (md[012]).
4) Use partimage to restore the images onto the RAID devices /dev/md0, /dev/md1, and /dev/md2.
5) Restore the MBR onto both drives using dd (i.e. "dd if=/mnt/share/hd[ab].mbr of=/dev/hd[ab] bs=512 count=1").
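
For step 3, the create commands for this layout might look like the output of the sketch below. It only prints the commands so they can be reviewed before anything touches the disks. The helper name print_create_cmds is hypothetical, and the assumption that partition N+1 on each disk backs /dev/mdN is taken from the layout listed earlier in the post.

```shell
#!/bin/sh
# Sketch only: print (do not run) one "mdadm --create" command per array.
# Assumes partition N+1 of each disk backs /dev/mdN, as in the post.
#   $1 = first disk (e.g. /dev/hda), $2 = second disk (e.g. /dev/hdb)
print_create_cmds() {
    a=$1
    b=$2
    for n in 0 1 2; do
        part=$((n + 1))
        echo "mdadm --create /dev/md$n --level=1 --raid-devices=2 $a$part $b$part"
    done
}

# Example:
#   print_create_cmds /dev/hda /dev/hdb
```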

Before rebooting the machine, I verify that the RAID devices exist and the data has been properly restored.
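
One way to script that check is to count the arrays that /proc/mdstat reports as active. This is only a sketch: the function name count_active_arrays is made up, and it assumes the usual "mdN : active ..." line format. Running "mdadm --examine" on each member partition is a complementary check of the RAID superblocks themselves.

```shell
#!/bin/sh
# Sketch only: count arrays listed as active in an mdstat-format file.
#   $1 = file to read (defaults to /proc/mdstat)
count_active_arrays() {
    grep -c '^md[0-9][0-9]* : active' "${1:-/proc/mdstat}"
}

# Example: three arrays (md0, md1, md2) are expected on this system.
#   [ "$(count_active_arrays)" -eq 3 ] && echo "all arrays active"
```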

When I reboot the machine, I see the grub screen and it attempts to boot the kernel. However, while Red Hat is booting it can't detect the RAID devices, and it panics because it can't mount root.

The funny thing is that if I reboot the machine again, perform the exact same restore steps a second time, and then reboot once more, Red Hat boots fine and the system is restored to its original state.

Is there something I'm doing wrong in the restore or backup steps? Am I overwriting metadata in one of the steps? Why won't the machine detect the RAID devices the first time around, yet it works after a second round of restores? I'd appreciate it if someone could help unravel the mystery...

Thanks in advance,
-Kai

Re: What am I doing wrong here?

Post by Guest » Tue Oct 12, 2004 5:14 pm

nimiak wrote:
Here's what I do to back up the system:
1) Boot up with Knoppix and use mdadm to assemble the 3 RAID devices (md[012])

I should think the RAID devices would already be running. You should probably be using mdadm to break the mirror, see below.

nimiak wrote:
2) Use partimage to back up the RAID devices /dev/md0, /dev/md1, /dev/md2 onto a network drive

Just to be sure, fail and remove one of the actual RAID disks and then make your image of that individual partition, not of the whole /dev/md device. After you are done, regenerate the array. I've found that this approach, while a real hassle, seems to work consistently.
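
A rough sketch of that fail, remove, image, re-add sequence for one array is below. The helper names run and image_member and the DRY_RUN switch are made up for illustration. The mdadm options are the standard manage-mode ones, and partimage is called in its basic "save device image" form without any tuning options. With DRY_RUN set, the commands are only printed, which seems a sensible first pass before trying this on a real mirror.

```shell
#!/bin/sh
# Sketch only: image one member of a RAID-1 array instead of the md device.

# run: print the command if DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# image_member ARRAY MEMBER IMAGE
#   ARRAY  = md device (e.g. /dev/md0)
#   MEMBER = member partition to pull out and image (e.g. /dev/hdb1)
#   IMAGE  = image file to write (hypothetical path, e.g. /mnt/share/boot.img)
image_member() {
    array=$1
    member=$2
    image=$3
    # Mark the member faulty and take it out of the mirror.
    run mdadm "$array" --fail "$member" --remove "$member" || return 1
    # Image the bare partition while it is outside the array.
    run partimage save "$member" "$image" || return 1
    # Put the member back; the kernel then resyncs the mirror.
    run mdadm "$array" --add "$member"
}

# Example (dry run, prints the three commands):
#   DRY_RUN=1; image_member /dev/md0 /dev/hdb1 /mnt/share/boot.img
```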
nimiak wrote:
Here's what I do to restore the system:

1) Boot up with Knoppix.
2) Use fdisk to create the /boot, swap, and / partitions on both drives
(set the partition types to Linux raid autodetect).
3) Use mdadm to create and assemble the RAID devices (md[012])
4) Use partimage to restore the images onto the RAID devices /dev/md0, /dev/md1, /dev/md2
5) Restore the MBR onto both drives using dd (i.e. "dd if=/mnt/share/hd[ab].mbr of=/dev/hd[ab] bs=512 count=1")

Well, I'd say create the RAID devices, but then, like before, fail and remove the second of the two disks in the RAID-1 array and restore to the remaining disk (partition), the one you made the image of. Regenerate when you're done.

Also, in your step 5, you are probably overwriting the partition table that you worked hard to recreate in step 2. I should think all you really need is the MBR boot code itself, the first 446 bytes of your 512-byte backup file. You already have the new partition table (the next 64 bytes, which are followed by the 2-byte boot signature).
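
On a standard PC-style MBR, bytes 0-445 hold the boot code, bytes 446-509 hold the partition table, and bytes 510-511 hold the boot signature. A sketch of restoring only the boot code from a 512-byte backup could look like this; the function name restore_boot_code is made up, and the example paths are the ones used earlier in the thread.

```shell
#!/bin/sh
# Sketch only: copy just the boot code (bytes 0-445) from an MBR backup,
# leaving the target's partition table and boot signature untouched.
#   $1 = 512-byte MBR backup file (e.g. /mnt/share/hda.mbr)
#   $2 = target disk or disk image (e.g. /dev/hda)
restore_boot_code() {
    # bs=446 count=1 stops before the partition table at offset 446.
    # conv=notrunc keeps dd from truncating a regular-file target.
    dd if="$1" of="$2" bs=446 count=1 conv=notrunc 2>/dev/null
}

# Example (as root, from Knoppix):
#   restore_boot_code /mnt/share/hda.mbr /dev/hda
```

Trying it first against a scratch image file rather than a real disk is a cheap way to confirm that only the first 446 bytes change.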
