So I upgraded Ubuntu 9.10 to Ubuntu 11.10. When the system boots it says it cannot mount /store, my 960GB RAID+LVM file-system. That's the one that holds over 10 years of personal photographs and such.
There are many layers of indirection between the file-system and the physical storage when using LVM or RAID. When using both, the number of layers can seem excessive. Here's a diagram of the layers involved in my (lost) setup:
Note that the RAID block device, md0, is not partitioned. I believe that was a mistake on my part, and a likely reason why Ubuntu 11.10 cannot auto-detect it.
Since upgrading, system boot is interrupted with an error screen to the effect of “Cannot mount /store” and a prompt to enter a root shell or skip mounting. From what I can see, the RAID array is detected without problems and is functioning correctly. So the system looks like this:
The RAID (multi-disk) status looks fine to me:
root@ikari:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid5 sdb[1] sde[3] sdc[0] sdd[2] sdf[4](S)
      937713408 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>
but the resulting 960.2 GB block device is partitioned as a “Linux RAID autodetect” - which would suggest that it is a *member* of some other multi-disk setup. This, I believe, is human error on my part when I created the thing…
root@ikari:~# fdisk -l /dev/md127

Disk /dev/md127: 960.2 GB, 960218529792 bytes
255 heads, 63 sectors/track, 116739 cylinders, total 1875426816 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 196608 bytes
Disk identifier: 0xd71c877b

      Device Boot      Start         End      Blocks   Id  System
/dev/md127p1              63   625137344   312568641   fd  Linux RAID autodetect
Partition 1 does not start on physical sector boundary.
The first step was to take a complete backup of md127, partition table and all (which required buying an external 2TB USB drive).
So I created a single “Linux LVM” partition on the 2TB disk, created a single 1.8TB physical volume, and a single 1.8TB volume group containing it. On this I created a 1TB logical volume called lv_scratch and copied the contents of md127 to it (e.g. dd if=/dev/md127 of=/dev/vg_scratch/lv_scratch). Once the copy was made, I created a snapshot of lv_scratch, which I imaginatively called snap.
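Reconstructing those steps as commands (a sketch from memory rather than a transcript; device names and sizes are as described above, but check yours first):

```shell
# Sketch of the backup procedure described above. Assumes the external
# 2TB drive appeared with a single "Linux LVM" partition as /dev/sdj1.
pvcreate /dev/sdj1                                      # physical volume on the USB disk
vgcreate vg_scratch /dev/sdj1                           # 1.8TB volume group on top of it
lvcreate -L 1T -n lv_scratch vg_scratch                 # 1TB logical volume
dd if=/dev/md127 of=/dev/vg_scratch/lv_scratch bs=64K   # raw copy of the RAID device
lvcreate -s -L 500G -n snap /dev/vg_scratch/lv_scratch  # snapshot of the copy
```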
LVM snapshots are interesting creatures. As the name suggests, the snapshot (named snap) holds the state of lv_scratch as it was at the moment I created it. I can still read and write lv_scratch, but the contents of snap will not change. This is ideal for making consistent backups. The snapshot works by copy-on-write (COW): just before a block of lv_scratch is overwritten, the original block is copied into the snapshot's COW volume. Reads of snap consult the COW volume first - when there is a hit the saved block is returned, otherwise the (unchanged) block is read from lv_scratch itself. Deleting the snapshot simply discards the COW volume; lv_scratch already contains every change made since the snapshot was taken. Makes sense if you are used to copy-on-write behaviour.
Now here is where things get interesting. The snapshot, snap, does not have to be read-only: you can create it read-write. Doing so gives you a very cheap copy of lv_scratch, and any changes you make to the snapshot are stored in its COW table. You can discard the changes by deleting the snapshot. Ideal for my situation: I want to experiment with the partition table, various file-system recovery tools, etc. I let these manipulate the snapshot, and if things go bad I delete and recreate the snapshot and try over.
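In command form, the try-and-discard loop might look like this (a sketch; the 500GB COW size matches the snapshot shown later):

```shell
# Create a read-write snapshot to experiment on; the origin is untouched.
lvcreate -s -L 500G -n snap /dev/vg_scratch/lv_scratch

# ...run fdisk, recovery tools, etc. against /dev/vg_scratch/snap...

# Experiment went bad? Discard the COW table and start afresh.
lvremove -f /dev/vg_scratch/snap
lvcreate -s -L 500G -n snap /dev/vg_scratch/lv_scratch
```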
root@ikari:~# fdisk -l /dev/sdj

Disk /dev/sdj: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000f0222

   Device Boot      Start         End      Blocks   Id  System
/dev/sdj1            2048  3907028991  1953513472   8e  Linux LVM
root@ikari:~# pvdisplay
  /dev/dm-0: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-1: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-2: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  --- Physical volume ---
  PV Name               /dev/sdj1
  VG Name               vg_scratch
  PV Size               1.82 TiB / not usable 4.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              476931
  Free PE               86787
  Allocated PE          390144
  PV UUID               nrf9cQ-Asfz-Y2x2-SDoT-3ppu-mpEC-Fnuf8Z

root@ikari:~# vgdisplay
  /dev/dm-0: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-1: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-2: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  --- Volume group ---
  VG Name               vg_scratch
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  13
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               1.82 TiB
  PE Size               4.00 MiB
  Total PE              476931
  Alloc PE / Size       390144 / 1.49 TiB
  Free  PE / Size       86787 / 339.01 GiB
  VG UUID               Lk7UZP-48xF-vBPi-6g8F-sXlF-qyzy-pQNKgq

root@ikari:~# lvdisplay
  /dev/dm-0: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-1: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-2: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  --- Logical volume ---
  LV Name                /dev/vg_scratch/lv_scratch
  VG Name                vg_scratch
  LV UUID                aFBpgv-gqcd-jjLU-c7xO-Jyeb-2R0t-HpEF84
  LV Write Access        read/write
  LV snapshot status     source of /dev/vg_scratch/snap [active]
  LV Status              available
  # open                 0
  LV Size                1.00 TiB
  Current LE             262144
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

  --- Logical volume ---
  LV Name                /dev/vg_scratch/snap
  VG Name                vg_scratch
  LV UUID                OvOsQ7-uACi-xJVZ-vseu-fKEc-F73h-CmSalH
  LV Write Access        read/write
  LV snapshot status     active destination for /dev/vg_scratch/lv_scratch
  LV Status              available
  # open                 0
  LV Size                1.00 TiB
  Current LE             262144
  COW-table size         500.00 GiB
  COW-table LE           128000
  Allocated to snapshot  0.00%
  Snapshot chunk size    4.00 KiB
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3
So here's the goal I'm aiming for on my external storage:
Since it's been a few days and several reboots since I last worked on this, I'll start by plugging the USB drive in.
root@ikari:~# dmesg
[  479.180019] usb 2-5: new high speed USB device number 7 using ehci_hcd
[  479.313228] scsi13 : usb-storage 2-5:1.0
[  480.312605] scsi 13:0:0:0: Direct-Access     Seagate  Desktop          0130 PQ: 0 ANSI: 4
[  480.336633] sd 13:0:0:0: Attached scsi generic sg10 type 0
[  480.337029] sd 13:0:0:0: [sdi] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[  480.337671] sd 13:0:0:0: [sdi] Write Protect is off
[  480.337671] sd 13:0:0:0: [sdi] Mode Sense: 2f 08 00 00
[  480.340027] sd 13:0:0:0: [sdi] No Caching mode page present
[  480.340027] sd 13:0:0:0: [sdi] Assuming drive cache: write through
[  480.341806] sd 13:0:0:0: [sdi] No Caching mode page present
[  480.341811] sd 13:0:0:0: [sdi] Assuming drive cache: write through
[  480.357290]  sdi: sdi1
[  480.359346] sd 13:0:0:0: [sdi] No Caching mode page present
[  480.359350] sd 13:0:0:0: [sdi] Assuming drive cache: write through
[  480.359354] sd 13:0:0:0: [sdi] Attached SCSI disk
The (outer) LVM PVs are automatically detected, and their VGs and LVs are subsequently found:
root@ikari:~# pvs
  PV         VG         Fmt  Attr PSize PFree
  /dev/sdi1  vg_scratch lvm2 a-   1.82t 339.01g
root@ikari:~# vgs
  VG         #PV #LV #SN Attr   VSize VFree
  vg_scratch   1   2   1 wz--n- 1.82t 339.01g
root@ikari:~# lvs
  LV         VG         Attr   LSize   Origin     Snap%  Move Log Copy%  Convert
  lv_scratch vg_scratch owi-a-   1.00t
  snap       vg_scratch swi-a- 500.00g lv_scratch   0.00
Somewhere on the snap logical volume is my nested LVM. I used xxd /dev/vg_scratch/snap | less and searched for LVM2. The first hit was a false positive (it appeared to have stripes of NULs written across it), but the second hit looked plausible:
8018600: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018610: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018620: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018630: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018640: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018650: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018660: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018670: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018680: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018690: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80186a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80186b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80186c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80186d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80186e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80186f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018700: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018710: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018720: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018730: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018740: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018750: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018760: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018770: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018780: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018790: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80187a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80187b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80187c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80187d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80187e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80187f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018800: 4c41 4245 4c4f 4e45 0100 0000 0000 0000 LABELONE........
8018810: 9148 4053 2000 0000 4c56 4d32 2030 3031 .H@S ...LVM2 001
8018820: 5341 7536 6e32 7578 474c 5148 6743 5351 SAu6n2uxGLQHgCSQ
8018830: 6b56 6b5a 655a 4c78 7874 314b 7652 6a31 kVkZeZLxxt1KvRj1
8018840: 00f8 0391 df00 0000 0000 0300 0000 0000 ................
8018850: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018860: 0000 0000 0000 0000 0010 0000 0000 0000 ................
8018870: 00f0 0200 0000 0000 0000 0000 0000 0000 ................
8018880: 0000 0000 0000 0000 0000 0000 0000 0000 ................
8018890: 0000 0000 0000 0000 0000 0000 0000 0000 ................
80188a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
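Paging through a terabyte with less is slow going; grep can report the byte offset of each signature hit directly (-b prints the byte offset, -o prints only the match, -a treats binary data as text). A quick demonstration on a throwaway file; pointing it at /dev/vg_scratch/snap would do the real search:

```shell
# Find the byte offset of the LVM label signature. Shown against a small
# test file so it can be run anywhere; use /dev/vg_scratch/snap for real.
printf 'junk-padding-here' > /tmp/fakepv
printf 'LABELONE' >> /tmp/fakepv
grep -a -b -o 'LABELONE' /tmp/fakepv
# -> 17:LABELONE   (the signature starts 17 bytes in)
```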
I know from using xxd to examine a correct and functioning LVM2 partition (the PV behind vg_scratch, as it happens) that the “LABELONE” signature should appear at offset 0x200, with the “LVM2 001” text just after it at 0x218. So I'll create a loopback device with an appropriate offset to make that happen:
root@ikari:~# losetup /dev/loop0 /dev/vg_scratch/snap --offset $((0x8018600))
root@ikari:~# lvmdiskscan
  /dev/ram0   [      64.00 MiB]
  /dev/loop0  [     894.15 GiB] LVM physical volume
  /dev/dm-0   [     186.27 GiB]
  /dev/ram1   [      64.00 MiB]
  /dev/sda1   [     294.09 GiB]
  /dev/dm-1   [     894.27 GiB]
  /dev/ram2   [      64.00 MiB]
  /dev/dm-2   [     894.27 GiB]
  /dev/ram3   [      64.00 MiB]
  /dev/dm-3   [     894.27 GiB]
  /dev/ram4   [      64.00 MiB]
  /dev/dm-4   [     782.47 GiB]
  /dev/ram5   [      64.00 MiB]
  /dev/sda5   [       4.00 GiB]
  /dev/dm-5   [     715.38 GiB]
  /dev/ram6   [      64.00 MiB]
  /dev/ram7   [      64.00 MiB]
  /dev/ram8   [      64.00 MiB]
  /dev/ram9   [      64.00 MiB]
  /dev/ram10  [      64.00 MiB]
  /dev/ram11  [      64.00 MiB]
  /dev/ram12  [      64.00 MiB]
  /dev/ram13  [      64.00 MiB]
  /dev/ram14  [      64.00 MiB]
  /dev/ram15  [      64.00 MiB]
  /dev/sdb1   [       1.82 TiB] LVM physical volume
  0 disks
  24 partitions
  0 LVM physical volume whole disks
  2 LVM physical volumes
root@ikari:~# pvs
  PV         VG         Fmt  Attr PSize   PFree
  /dev/loop0 store_vg   lvm2 a-   894.25g 178.88g
  /dev/sdb1  vg_scratch lvm2 a-     1.82t       0
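The arithmetic, for the record: the hit put LABELONE at 0x8018800, and since the LVM label lives at the start of the second 512-byte sector of a PV (offset 0x200), the PV itself must begin 0x200 bytes earlier:

```shell
# LABELONE was found by xxd at 0x8018800; the label sits at offset 0x200
# within a PV, so the PV starts 0x200 bytes before the hit.
printf '%#x\n' $(( 0x8018800 - 0x200 ))
# -> 0x8018600
```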
If lvmdiskscan doesn't work, you could try using partprobe to tell the kernel to rescan the partition tables and do its thing.
root@ikari:~# lvs
  LV         VG         Attr   LSize   Origin     Snap%  Move Log Copy% Convert
  store_lv   store_vg   -wi-a- 715.38g
  home_zfs   vg_scratch -wi-a- 186.27g
  lv_scratch vg_scratch owi-a- 894.27g
  snap       vg_scratch swi-ao 782.47g lv_scratch   0.00
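If the nested LVs keep looking this healthy, the next step would presumably be to activate the inner VG and mount the recovered file-system read-only (a sketch; the mount point is my own choice):

```shell
# Activate the nested volume group discovered on /dev/loop0...
vgchange -ay store_vg

# ...then mount its logical volume read-only so nothing can make things worse.
mkdir -p /mnt/store
mount -o ro /dev/store_vg/store_lv /mnt/store
```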