Clone of the Intel Fake Raid

2013-02-27

The cops aren't coming, neither is the clone army.

But how do you clone a fake raid 1 array to another machine, and with what software?

What is fake raid anyway? (software raid)

That depends on what you buy and from whom. Each vendor has its own variations, and they tend not to be compatible. Here I talk of raid 1, which is what I was recently trying to clone onto another machine, and in fact several other machines.

In fake raid, as opposed to proper hardware raid, a controller card in a raid 1 configuration magically mirrors writes across two hard disks that are, or almost are, duplicates of each other. 'Proper' raid cards cost a lot of money, have a battery backup and do all the raid stuff on the card. The cheaper fake raid cards (sometimes called host raid) don't.

For the most part the card itself will mirror writes to the other drive. Then a software driver in your operating system will (or may) perform more exotic operations, one of which is to schedule reads from the two mirrored disks separately to speed up data reads (so yes, raid 1 can be faster than a plain disk, even though it's not striped as in raid 0). It's not guaranteed that your particular make of fake raid will do this, however.

The Intel family of controllers is known to do this. If you don't restore the clone of the raid array properly with a given drive imaging tool, you will find this out quickly: your operating system, which usually means Windows here, will start booting, get halfway through and then throw a big wobbly and reboot, or blue screen.

So the fake raid typically consists of three main components:

  1. The raid card, that mirrors writes onto both drives
  2. A software driver that does 'extra' stuff
  3. A software tool, to rebuild a degraded array

Note that #2 above isn't necessary for reading and writing to the array. But if your raid array is degraded - i.e. one disk has gone 'bad' then you will need tool #3 to rebuild it - and it may or may not need #2, the driver to help it.

On to cloning

In order for the raid array to work properly it must store its configuration somewhere, and with Intel fake raid arrays that is on the disk. What it stores there is the configuration of the array, as well as the status of each drive. One vitally important piece of information the array stores is the drive serial numbers. The raid card reads this information from both drives, and each drive stores information about both disks in the system - this way if a drive fails the other drive will still have the raid configuration. If both drives fail, well yes, that is the end anyway.
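To make "stored on the disk" a little more concrete, here is a minimal sketch that looks for Intel raid metadata near the end of a disk or disk image. It assumes the metadata carries the ASCII signature "Intel Raid ISM Cfg Sig. " (the string Linux's mdadm uses to recognise these arrays) and that it sits within the last few sectors. The function name, the four-sector window and the 512-byte sector size are illustrative assumptions, not something taken from Intel documentation.

```python
import os

SECTOR = 512  # assumed logical sector size

# Assumption: signature string that mdadm looks for on Intel (IMSM) members.
IMSM_SIGNATURE = b"Intel Raid ISM Cfg Sig. "


def find_imsm_signature(path, tail_sectors=4):
    """Return the byte offset of the signature in the last tail_sectors, or None."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)             # works for files and block devices
        start = max(0, size - tail_sectors * SECTOR)
        f.seek(start)
        tail = f.read(size - start)
    pos = tail.find(IMSM_SIGNATURE)
    return None if pos < 0 else start + pos
```

Run against an image of each disk, a result of None from a disk you expected to be an array member would suggest the imaging tool never carried the tail of the disk across.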

Now if you come along and take a clone image off drive 1 in system A and restore it to drive 1 of system B, one of two things will happen:

a. The controller won't notice that drive 1 now differs from drive 2. The system will start to boot and then break before getting very far.
b. The controller will complain that drive 1 is not a member of the array and will, upon booting the operating system, invite you to rebuild it from disk 2 - which will overwrite what you have just done and isn't what you want.

In situation (a) you probably used Clonezilla, Redo Backup, PING, or one of the countless other tools that only copy used disk space to save time.

In situation (b) you probably used dd, or another tool that does raw imaging.

So what went wrong?

Situation (a) Clonezilla, Redo Backup, PING, etc... don't copy unused space, which means they don't touch the raid volume information at the end of the disk. So disk 1 on system B has changed radically from disk 2, but the controller doesn't know.

Solution:

(i) Restore both disks, and don't reboot into the OS until both disks are restored.
(ii) Zero the last four sectors on drive 2 of system B. That way the raid controller will take it out of the array and allow you to rebuild disk 1 onto disk 2, which is what you want.
(iii) Better yet - this is the best solution - use a drive image tool/OS that supports raid 1, and then you won't have this issue at all and everything will work.
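Option (ii) above can be sketched in a few lines of Python. This is only an illustration under assumptions: 512-byte sectors, and a helper name (zero_tail) that is made up here. It overwrites the last four sectors of whatever path you give it, so it is destructive by design.

```python
import os

SECTOR = 512  # assumed logical sector size


def zero_tail(path, sectors=4):
    """Overwrite the last `sectors` sectors of path with zeros; return bytes written."""
    with open(path, "r+b") as f:
        size = f.seek(0, os.SEEK_END)        # total size of the file or device
        count = min(size, sectors * SECTOR)  # never write before the start
        f.seek(size - count)
        f.write(b"\x00" * count)
        f.flush()
        os.fsync(f.fileno())                 # push the zeros out to the disk
    return count
```

Pointing this at a real drive needs root and wipes the raid information there, so it is worth trying on an image file first and double checking which disk is which.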

Situation (b): disk 1 has been ejected from the raid array because the controller thinks that it doesn't belong there, as its serial number doesn't match the one recorded on disk 2. But wait, you say, how does the controller know which disk has the proper raid information? Well, because it knows that one disk is still there from its serial number, as recorded in the array, so that is the source of truth.

Solution:
(i) Don't overwrite the existing array information. This is tricky as it may vary in size depending on what controller you use, but you could, for example, not copy the last MB of the disk; that *should* do it. Remember too, I speak of Intel raid arrays here. Other fake raid systems might do things a bit differently - for example they may *just work* and not store anything on the disk, but rather in flash.
(ii) The best way, again, is to use a tool/OS that recognises raid 1.
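For option (i), a raw copy that stops short of the end of the disk might look like the sketch below. It is a rough illustration, not a tested imaging tool: the function name and parameters are made up here, and the 1 MB default is simply the guess from the text above, which may not suit every controller.

```python
import os


def copy_except_tail(src, dst, skip_bytes=1024 * 1024, chunk=4 * 1024 * 1024):
    """Raw-copy src over dst, leaving the last skip_bytes of dst untouched."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        src_size = fin.seek(0, os.SEEK_END)
        dst_size = fout.seek(0, os.SEEK_END)
        # Stop before the tail of the smaller disk so neither the source's
        # raid information is copied nor the target's overwritten.
        remaining = max(0, min(src_size, dst_size) - skip_bytes)
        fin.seek(0)
        fout.seek(0)
        copied = 0
        while copied < remaining:
            buf = fin.read(min(chunk, remaining - copied))
            if not buf:
                break
            fout.write(buf)
            copied += len(buf)
    return copied
```

The target is opened in update mode rather than truncated, so whatever the controller already wrote at the end of disk 1 on system B stays in place.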