Today I learned about backup GUID partition tables. Let me explain how that happened.
I recently moved my virtualization setup at home to the newly released Minisforum MS-A2. The experience so far has been solid, aside from it being slightly noisier than I expected.
As part of the move, I migrated all my VMs over to the new machine. When I ordered the MS-A2, I initially only got a single Lexar NM790 4TB drive, as the other 4TB SSD was still built into the old home server. The idea was that I could pull it out of the old server and plug it into the MS-A2 (which has 3 NVMe slots, albeit with different speeds) after the migration.
But after moving all the VMs, I faced a dilemma: I lacked the space to hibernate them for machine maintenance. Shutting them down instead of hibernating them would cause me a great deal of pain - restarting the Valar VM instances is sadly non-trivial.
Thus, I needed to look for space somewhere else. One of the largest logical disks I had mounted to a VM was a 2.44TiB volume used for storing RAW files from my photography collection. About 1.3TiB of it was unused, so I could easily reduce the disk size temporarily.
So I came up with a plan.
# This happens inside the VM that normally mounts the volume.
# First, with the filesystem unmounted, make sure /dev/sdb1 is good and happy.
e2fsck -f /dev/sdb1
# Next, shrink the filesystem to 1.7TiB.
resize2fs /dev/sdb1 1.7T
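Worth calling out: resize2fs only shrinks the ext4 filesystem, not the GPT partition entry around it. To arrive at the 1.7T partition you can see in the layout below, the partition itself has to be shrunk as well. A minimal sketch of that step with sgdisk - start and end sectors taken from the table below, and /dev/sdb inside the VM being an assumption - might look like this:
# Sketch only: recreate the partition with the same start sector and a smaller end.
# The data stays intact as long as the new end still lies beyond the shrunk filesystem.
$ sgdisk --delete=1 --new=1:40:3565158439 --typecode=1:8300 /dev/sdb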
For the record, here is the resulting layout as reported by fdisk.
Disk /dev/pve/vm-104-disk-1: 2.44 TiB, 2684354560000 bytes, 5242880000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: gpt
Disk identifier: 3637D381-0EFA-44FE-89E3-016A3BA2C60F
Device                     Start        End    Sectors Size Type
/dev/pve/vm-104-disk-1p1      40 3565158439 3565158400 1.7T Linux filesystem
Now, I could get to work with lvreduce. As I already knew I needed some extra headroom around that single partition, I reduced the disk to 1.75TiB.
$ lvchange -an /dev/pve/vm-104-disk-1
$ lvreduce -L 1.75T /dev/pve/vm-104-disk-1
Size of logical volume pve/vm-104-disk-1 changed from 2.44 TiB (640000 extents) to 1.75 TiB (458752 extents).
Logical volume pve/vm-104-disk-1 successfully resized.
$ lvchange -ay /dev/pve/vm-104-disk-1
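A quick sanity check with the numbers from the fdisk output above shows that 1.75TiB leaves plenty of headroom around the partition:
# 1.75TiB expressed in 512-byte sectors
$ echo $(( 1924145348608 / 512 ))
3758096384
# sectors left between the end of the partition (3565158439) and the end of the disk
$ echo $(( 3758096384 - 3565158440 ))
192937944
That's roughly 92GiB of slack beyond the end of the partition.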
Should be good, right? I started the VM using the disk and was shocked to see the following lsblk output.
$ lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0  600G  0 disk
└─sda1   8:1    0  600G  0 part /
sdb      8:16   0  2.4T  0 disk
sr0     11:0    1 1024M  0 rom
You might not notice it immediately, but my /dev/sdb1 partition was suddenly missing. My first thought was to investigate using fdisk -l /dev/sdb, only to be left more confused afterwards.
$ fdisk -l /dev/sdb
Disk /dev/sdb: 1.75 TiB, 1924145348608 bytes, 3758096384 sectors
Disk model: QEMU HARDDISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000
Device     Boot Start        End    Sectors Size Id Type
/dev/sdb1           1 4294967295 4294967295   2T ee GPT
Where did my GUID partition table go? And why is there this weird GPT partition entry at sector 1?
This is when I learned about the primary and backup locations of the GUID partition table: a standard GPT layout includes a primary GPT header and partition array at the beginning of the disk, and a secondary (backup) GPT header and partition array at the very end of the disk.
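To make that concrete: the primary header in LBA 1 stores, among other fields, the LBA of its backup copy. As a rough illustration (byte offsets per the UEFI spec, /dev/sdb as an example device), you can peek at that field directly:
# Bytes 32-39 of the primary GPT header (LBA 1) hold the LBA of the backup header,
# which on an intact disk is the very last sector.
$ dd if=/dev/sdb bs=1 skip=$((512 + 32)) count=8 2>/dev/null | xxd
After the shrink, that field still points at the old last sector of the 2.44TiB disk - a sector that no longer exists.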
So when I shrank the disk, I correctly protected the data inside the EXT4 partition – but my backup GPT header still went to the dogs. And fdisk gets very confused when that backup GPT header is missing: it looks for the backup header at the end of the disk, finds garbage data, and falls back to the protective MBR.
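That also explains the weird entry at sector 1: every GPT disk carries a protective MBR in sector 0 with a single fake partition of type ee, starting at sector 1 and spanning the rest of the disk – capped at the 2TiB a 32-bit MBR entry can address, which is why fdisk shows 2T. If you are curious, that type byte can be read straight out of sector 0 (again assuming /dev/sdb):
# Byte 4 of the first MBR partition entry (offset 0x1BE in sector 0) is the partition
# type; on a GPT disk it is 0xee, matching the Id column in the output above.
$ dd if=/dev/sdb bs=1 skip=$((446 + 4)) count=1 2>/dev/null | xxd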
Using gdisk revealed the problem in a more descriptive fashion.
$ gdisk /dev/pve/vm-104-disk-1
GPT fdisk (gdisk) version 1.0.9
Warning! Disk size is smaller than the main header indicates! Loading
secondary header from the last sector of the disk! You should use 'v' to
verify disk integrity, and perhaps options on the experts' menu to repair
the disk.
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.
Warning! One or more CRCs don't match. You should repair the disk!
Main header: OK
Backup header: ERROR
Main partition table: OK
Backup partition table: ERROR
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: damaged
****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
So how do you fix it? Just write the GUID partition table again using gdisk. That's it.
Command (? for help): w
Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!
Do you want to proceed? (Y/N): Y
OK; writing new GUID partition table (GPT) to /dev/pve/vm-104-disk-1.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.
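For what it's worth, sgdisk from the same gptfdisk package should be able to do this non-interactively; something along these lines relocates the backup structures to the current end of the disk and writes the table:
# Alternative sketch: move the backup GPT header and partition array to the new end of the volume.
$ sgdisk -e /dev/pve/vm-104-disk-1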
Now, even fdisk got the memo.
$ fdisk -l /dev/pve/vm-104-disk-1
Disk /dev/pve/vm-104-disk-1: 1.75 TiB, 1924145348608 bytes, 3758096384 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: gpt
Disk identifier: 3637D381-0EFA-44FE-89E3-016A3BA2C60F
Device                     Start        End    Sectors Size Type
/dev/pve/vm-104-disk-1p1      40 3565158439 3565158400 1.7T Linux filesystem
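If you want extra reassurance, gptfdisk's verification pass should now come back clean as well:
# Optional: re-check the repaired table.
$ sgdisk --verify /dev/pve/vm-104-disk-1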
Afterwards, I was able to set the discard flag on the mounted disk in my VM and run fstrim on the filesystem on /dev/sdb1 to reclaim the unused space in my volume group.
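For completeness, the reclaim step inside the VM looked roughly like this (with the Discard option enabled on the virtual disk on the host side; the mount point is just an example):
# Mount the photo volume and trim the blocks the filesystem no longer uses,
# so they are handed back to the storage below.
$ mount /dev/sdb1 /mnt/photos
$ fstrim -v /mnt/photos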