Tonight I installed a new system disk in the main feep.org server, in response to these ominous warnings:
(da0:sym0:0:0:0): WRITE(10). CDB: 2a 0 0 30 1 1f 0 0 20 0
(da0:sym0:0:0:0): CAM Status: SCSI Status Error
(da0:sym0:0:0:0): SCSI Status: Check Condition
(da0:sym0:0:0:0): Deferred Error: HARDWARE FAILURE asc:19,0
(da0:sym0:0:0:0): Defect list error field replaceable unit: 2
(da0:sym0:0:0:0): Retrying Command (per Sense Data)
The spare disk was already in the server (it was being used for staging backups, but I moved that elsewhere for now), and was identical to the failing one. So all I really needed to do was duplicate the boot block, partition table, disk label, and filesystems. I was able to do most of this while the system was running normally, which kept downtime to a minimum. In the examples, da0 is the failing disk, and da1 is the new target disk.
First, initialize the disk by creating a DOS partition table (mine already had one):
fdisk -I /dev/da1
Install the boot block:
fdisk -B -b /boot/mbr /dev/da1
Duplicate the disklabel:
disklabel da0s1 | disklabel -R da1s1 -
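Before creating file systems, it may be worth confirming that the label really was duplicated. The following is a minimal sketch, not part of the original procedure: `labels_match` is a hypothetical helper that compares two saved label dumps while ignoring comment lines, on the assumption that the dump's comment lines (which include the device name) are the only lines that legitimately differ between the two disks.

```shell
#!/bin/sh
# labels_match FILE1 FILE2
# Compare two saved disklabel dumps, ignoring lines that start with '#'.
# Returns 0 if the remaining lines are identical, 1 if they differ,
# and 2 if a file is unreadable or contains nothing but comments.
labels_match() {
    a=$(grep -v '^#' "$1") || return 2
    b=$(grep -v '^#' "$2") || return 2
    [ "$a" = "$b" ]
}
```

A possible use, with illustrative file names: run `disklabel da0s1 > /tmp/da0s1.label` and `disklabel da1s1 > /tmp/da1s1.label`, then `labels_match /tmp/da0s1.label /tmp/da1s1.label && echo "labels match"`.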
Create file systems:
newfs /dev/da1s1a
newfs -U /dev/da1s1e
newfs -U /dev/da1s1f
newfs -U /dev/da1s1d
Copy data:
mount /dev/da1s1a /mnt
cd / && tar -cf - --one-file-system --exclude /mnt . \
  | ( cd /mnt && tar -xpf - )
umount /mnt
mount /dev/da1s1e /mnt
cd /tmp && tar -cf - . | ( cd /mnt && tar -xpf - )
umount /mnt
mount /dev/da1s1f /mnt
cd /usr && tar -cf - . | ( cd /mnt && tar -xpf - )
umount /mnt
mount /dev/da1s1d /mnt
cd /var && tar -cf - . | ( cd /mnt && tar -xpf - )
umount /mnt
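The same tar pipe is repeated for every file system, so it could be wrapped in a small function. This is a sketch only: `copy_tree` is a hypothetical name, and it assumes a tar that accepts `--one-file-system` (as the command for / above does). It applies that option to every copy, so mount points nested under the source are not descended into.

```shell
#!/bin/sh
# copy_tree SRC DST
# Copy the contents of directory SRC into directory DST through a tar pipe.
# -p on extraction preserves permissions; --one-file-system keeps the
# archive from crossing into other mounted file systems under SRC.
copy_tree() {
    src=$1
    dst=$2
    # Refuse to run unless both directories exist.
    [ -d "$src" ] && [ -d "$dst" ] || return 1
    ( cd "$src" && tar -cf - --one-file-system . ) |
        ( cd "$dst" && tar -xpf - )
}
```

With this, each step becomes something like `mount /dev/da1s1e /mnt && copy_tree /tmp /mnt && umount /mnt`. The function returns the exit status of the extracting tar, so a failure on the reading side of the pipe would still need to be checked by eye.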
Now I intended to boot into single-user mode to finish with a final rsync on a very quiet system, but I couldn't boot from da0 any longer; it seemed to be getting worse. So I booted from FreeBSD 5.5 Disc 1 and used the Live CD feature to finish up with a few rsyncs (using the rsync binary from da1, since the Live CD doesn't include it).
mkdir /da0 /da1
cd /da0 && mkdir root tmp usr var
cd /da1 && mkdir root tmp usr var

mount /dev/da0s1a /da0/root
mount /dev/da1s1a /da1/root
mount /dev/da0s1e /da0/tmp
mount /dev/da1s1e /da1/tmp
mount /dev/da0s1f /da0/usr
mount /dev/da1s1f /da1/usr
mount /dev/da0s1d /da0/var
mount /dev/da1s1d /da1/var

/da1/usr/local/bin/rsync -avx /da0/root/ /da1/root
/da1/usr/local/bin/rsync -avx /da0/tmp/ /da1/tmp
/da1/usr/local/bin/rsync -avx /da0/usr/ /da1/usr
/da1/usr/local/bin/rsync -avx /da0/var/ /da1/var

umount /dev/da0s1a
umount /dev/da1s1a
umount /dev/da0s1e
umount /dev/da1s1e
umount /dev/da0s1f
umount /dev/da1s1f
umount /dev/da0s1d
umount /dev/da1s1d
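Since the mount, rsync, and umount steps follow the same pattern for each partition, they could also be generated from a single list. The sketch below only prints the commands so they can be reviewed first. `plan_sync`, `PARTS`, and `RSYNC` are hypothetical names, and the partition-to-directory pairs are the ones used in this post. All mounts are printed before any rsync because the rsync binary lives on da1's /usr, which has to be mounted first.

```shell
#!/bin/sh
# "letter:directory" pairs for the slice layout described in this post.
PARTS="a:root e:tmp f:usr d:var"
# Assumption: set RSYNC to wherever the binary lives on the mounted da1.
RSYNC=${RSYNC:-rsync}

# Print the commands in three passes: mounts, rsyncs, umounts.
plan_sync() {
    for pair in $PARTS; do
        part=${pair%%:*}   # text before the colon, e.g. "a"
        dir=${pair#*:}     # text after the colon, e.g. "root"
        echo "mount /dev/da0s1$part /da0/$dir"
        echo "mount /dev/da1s1$part /da1/$dir"
    done
    for pair in $PARTS; do
        dir=${pair#*:}
        echo "$RSYNC -avx /da0/$dir/ /da1/$dir"
    done
    for pair in $PARTS; do
        part=${pair%%:*}
        echo "umount /dev/da0s1$part"
        echo "umount /dev/da1s1$part"
    done
}
```

Running `plan_sync` by itself shows the plan. Once it looks right, and after the mount point directories have been created, it could be piped to `sh -x` to execute it.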
That done, I yanked da0, put da1 in its place, and a fresh disk in da1's place. Everything came up with no problems.
In this post, I had a heck of a time with the boolean AND operator (&&) because Blogger insisted on helpfully substituting &amp; for each ampersand every time I edited the post in the Blogger composer. This happens even inside code and pre blocks.