How to backup a running system

Vieri
l33t
Joined: 18 Dec 2005
Posts: 882

PostPosted: Mon Jan 30, 2023 2:15 pm    Post subject: How to backup a running system Reply with quote

Hi,

I'm running a system with 2 disks in RAID 1 (root has LVM).

I also have another 2 spare disks connected to the motherboard. These disks may or may not be formatted, but they will always be at least the same size as the first two, or bigger.

I would like to "fully clone" the running system (in the "first" 2 disks) to the backup system (the other 2 disks).

I will never boot the "backup system" while the "production disks" are connected, i.e. if I ever boot the backup system it will be after disconnecting every other disk and leaving only the 2 backup disks connected to the motherboard (I'm saying this because of possible GRUB boot errors with root device naming).

So, my running system has this fstab (not using by-id or by-uuid):

Code:
/dev/vgroot/root        /               ext4    noatime         0       1
/dev/md/3       /boot           ext2    noauto,noatime  1       2
/dev/sdb2               /boot/efi       vfat    noauto,noatime  1       2
/dev/md/4       none            swap    sw              0       0
proc    /proc   proc    defaults        0       0
shm     /dev/shm        tmpfs   nodev,nosuid,noexec     0       0
tmpfs   /tmp            tmpfs   size=4G,noatime         0       0
tmpfs   /var/tmp        tmpfs   size=12G,noatime        0       0


This is the relationship between logical and physical drives in LVM:

Code:
# pvdisplay -m
  --- Physical volume ---
  PV Name               /dev/sdb5
  VG Name               vgroot
  PV Size               187.14 GiB / not usable <3.59 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              47907
  Free PE               0
  Allocated PE          47907
  PV UUID               vPKVKO-KnVY-Pzd4-USEL-piI7-XuZP-rAWbfM

  --- Physical Segments ---
  Physical extent 0 to 0:
    Logical volume      /dev/vgroot/root_rmeta_0
    Logical extents     0 to 0
  Physical extent 1 to 47906:
    Logical volume      /dev/vgroot/root_rimage_0
    Logical extents     0 to 47905

  --- Physical volume ---
  PV Name               /dev/sdd5
  VG Name               vgroot
  PV Size               187.14 GiB / not usable <3.59 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              47907
  Free PE               0
  Allocated PE          47907
  PV UUID               SBMcAC-7dd8-AQ2m-XqS1-qoJt-aAW8-3HKsxP

  --- Physical Segments ---
  Physical extent 0 to 0:
    Logical volume      /dev/vgroot/root_rmeta_1
    Logical extents     0 to 0
  Physical extent 1 to 47906:
    Logical volume      /dev/vgroot/root_rimage_1
    Logical extents     0 to 47905


Same for mdadm:

Code:
# mdadm -v --detail --scan /dev/md/3
ARRAY /dev/md/3 level=raid1 num-devices=2 metadata=0.90 UUID=f7eeab6f:250039dd:cb201669:f728008a
   devices=/dev/sdb3,/dev/sdd3

# mdadm -v --detail --scan /dev/md/4
ARRAY /dev/md/4 level=raid1 num-devices=2 metadata=0.90 UUID=5e1a89ea:6fdaadcd:cb201669:f728008a
   devices=/dev/sdb4,/dev/sdd4


Here are all the disks listed (2 in production - sdb and sdd - and 2 for backup):

Code:
# lsblk
NAME                     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                        8:0    0 223.6G  0 disk
├─sda1                     8:1    0     2M  0 part
│ └─md127                  9:127  0   1.9M  0 raid1
├─sda2                     8:2    0   512M  0 part
├─sda3                     8:3    0   512M  0 part
│ └─md126                  9:126  0 511.9M  0 raid1
├─sda4                     8:4    0  35.3G  0 part
│ └─md125                  9:125  0  35.3G  0 raid1
└─sda5                     8:5    0 187.2G  0 part
  └─md124                  9:124  0 187.2G  0 raid1
sdb                        8:16   0 223.6G  0 disk
├─sdb1                     8:17   0     2M  0 part
│ └─md1                    9:1    0   1.9M  0 raid1
├─sdb2                     8:18   0   512M  0 part
├─sdb3                     8:19   0   512M  0 part
│ └─md3                    9:3    0 511.9M  0 raid1
├─sdb4                     8:20   0  35.3G  0 part
│ └─md4                    9:4    0  35.3G  0 raid1 [SWAP]
└─sdb5                     8:21   0 187.1G  0 part
  ├─vgroot-root_rmeta_0  253:0    0     4M  0 lvm
  │ └─vgroot-root        253:4    0 187.1G  0 lvm   /
  └─vgroot-root_rimage_0 253:1    0 187.1G  0 lvm
    └─vgroot-root        253:4    0 187.1G  0 lvm   /
sdc                        8:32   0 223.6G  0 disk
├─sdc1                     8:33   0     2M  0 part
│ └─md127                  9:127  0   1.9M  0 raid1
├─sdc2                     8:34   0   512M  0 part
├─sdc3                     8:35   0   512M  0 part
│ └─md126                  9:126  0 511.9M  0 raid1
├─sdc4                     8:36   0  35.3G  0 part
│ └─md125                  9:125  0  35.3G  0 raid1
└─sdc5                     8:37   0 187.2G  0 part
sdd                        8:48   0 223.6G  0 disk
├─sdd1                     8:49   0     2M  0 part
│ └─md1                    9:1    0   1.9M  0 raid1
├─sdd2                     8:50   0   512M  0 part
├─sdd3                     8:51   0   512M  0 part
│ └─md3                    9:3    0 511.9M  0 raid1
├─sdd4                     8:52   0  35.3G  0 part
│ └─md4                    9:4    0  35.3G  0 raid1 [SWAP]
└─sdd5                     8:53   0 187.1G  0 part
  ├─vgroot-root_rmeta_1  253:2    0     4M  0 lvm
  │ └─vgroot-root        253:4    0 187.1G  0 lvm   /
  └─vgroot-root_rimage_1 253:3    0 187.1G  0 lvm
    └─vgroot-root        253:4    0 187.1G  0 lvm   /



In this case, I know that /dev/sdb and /dev/sdd are the production disks, but what if for some reason an admin stops the system, opens the chassis, connects new backup disks, and in the process inadvertently swaps some cables, so that, say, on reboot the Linux kernel maps sdb and/or sdd to the backup disks instead of the production disks?

When GRUB was installed in the MBR it created several boot entries. Some set this root:

Code:
set root=(hd0,gpt3)


and others set this other root:

Code:
set root='mduuid/f7eeab6f250039ddcb201669f728008a'


The latter boots fine, and GRUB's mduuid is the same as mdadm's /dev/md/3 UUID.

So at this point, if I were to clone / dd each "production" disk to the backup disks (e.g. clone sdb and sdd to sda and sdc), what would happen to:

Code:
# mdadm -v --detail --scan /dev/md126
ARRAY /dev/md126 level=raid1 num-devices=2 metadata=0.90 UUID=6451c3dd:925d47ad:c44c77eb:7ee19756
   devices=/dev/sda3,/dev/sdc3


and, more importantly, how would grub behave (see further down for how I would clone the disks)?

The output of
Code:
# parted -l

shows that all the disks are GPT.

In any case, what I want to do is wipe everything out on the 2 target backup disks. I want to repartition them and (maybe) reinstall grub.

Now I'll propose a procedure that will hopefully not screw everything up with LVM, mdadm and grub...

The SOURCE disks (sdb and sdd) are GPT so instead of cloning the boot sector with dd:

Code:
# dd if=/dev/sdb of=/dev/sda bs=512 count=1
# dd if=/dev/sdd of=/dev/sdc bs=512 count=1


I'd use this:

Code:
# sgdisk -R=/dev/sda /dev/sdb
# sgdisk -R=/dev/sdc /dev/sdd


Then this:

Code:
# ddrescue --force -n /dev/sdb /dev/sda /tmp/ddrescue-disk1
# ddrescue --force -n /dev/sdd /dev/sdc /tmp/ddrescue-disk2


I haven't tried the above yet, but I'd like to know:

- Do I need to install grub on the backup disks (sda and sdc), or is that not necessary after running ddrescue?

- If I reboot the machine will it load the production set sdb/sdd as expected and ignore the backup set?

- If I issue a shutdown, disconnect the current sdb/sdd disks and leave only the backup disks connected to the motherboard, will the machine boot the backup system as if it were a perfect copy of the production system?

Finally, in order to avoid data corruption when backing up a running system, how should I properly create a snapshot and clone it to the target partition?

Code:
# lvs
  LV   VG     Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root vgroot rwi-aor--- 187.13g                                    100.00

# vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  vgroot   2   1   0 wz--n- 374.27g    0

# df -h
Filesystem      Size  Used Avail Use% Mounted on
none             16G  1.1M   16G   1% /run
udev             10M     0   10M   0% /dev
shm              16G     0   16G   0% /dev/shm
/dev/dm-4       184G   22G  153G  13% /
cgroup_root      10M     0   10M   0% /sys/fs/cgroup
tmpfs           4.0G     0  4.0G   0% /tmp
tmpfs            12G     0   12G   0% /var/tmp
/dev/md3        480M   44M  411M  10% /boot

# lvcreate -s -n backup1 -L 25G /dev/vgroot/root
  Volume group "vgroot" has insufficient free space (0 extents): 6400 required.


How can I fix this?

Should I do the following?

Code:
# resize2fs /dev/vgroot/root 80G
# lvreduce -L 100G /dev/vgroot/root
# resize2fs /dev/vgroot/root


Supposing I get enough free space to create the snapshot, how do I properly sync "root" to the backup disks?

I suppose I need to

Code:
# lvdisplay backup1


but I wonder if I can ddrescue the snapshot to the target without having to mount and then rsync.
I would then lvremove the snapshot.
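In other words, something roughly like this (just a sketch; the target device path is only a placeholder for wherever the root copy should land on the backup disks):

Code:
# lvcreate -s -n backup1 -L 25G /dev/vgroot/root
# ddrescue --force /dev/vgroot/backup1 /dev/TARGET_DEVICE /tmp/ddrescue-root
# lvremove vgroot/backup1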

Thanks


Last edited by Vieri on Wed Feb 01, 2023 12:57 pm; edited 1 time in total

alamahant
Advocate
Joined: 23 Mar 2019
Posts: 3879

PostPosted: Mon Jan 30, 2023 7:00 pm    Post subject: Reply with quote

I am not familiar with RAID, but the best option to clone a running system is rsync:
Code:

rsync -aAHXv  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} / /mnt/

provided you have created a similar RAID/LVM setup on your spare disks, formatted it, and mounted it on /mnt.
If you need the backup to be bootable, you should chroot into it, modify fstab and the grub files, reinstall and update grub, and regenerate the initramfs as well.
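Roughly something like this (only a sketch; the device name and the initramfs tool, dracut here, are just examples, adjust them to your setup):

Code:
# assuming the backup root is mounted on /mnt, with its /boot and /boot/efi mounted inside it
mount --rbind /dev /mnt/dev
mount --rbind /sys /mnt/sys
mount -t proc proc /mnt/proc
chroot /mnt /bin/bash
grub-install --target=i386-pc /dev/sda                        # BIOS/MBR case; /dev/sda is just an example
grub-install --target=x86_64-efi --efi-directory=/boot/efi    # EFI case
grub-mkconfig -o /boot/grub/grub.cfg
dracut --force                                                # or whatever initramfs generator you use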
_________________
:)

Vieri
l33t
Joined: 18 Dec 2005
Posts: 882

PostPosted: Tue Jan 31, 2023 9:30 am    Post subject: Reply with quote

alamahant wrote:
the best option to clone a running system is rsync


That doesn't necessarily avoid data corruption. I guess I would need to mount an LVM snapshot of root.

alamahant wrote:
provided you have created a similar raid/lvm setup in your spare disks,formatted it


So the following wouldn't be enough to copy over the partition data to the target/backup disks?

Code:
# sgdisk -R=$targetDev $sourceDev
# ddrescue --force -n $sourceDev $targetDev /tmp/ddrescue-clone

Or it would, but not the RAID and LVM metadata?

alamahant wrote:
backup to be bootable you should chroot into it modify fstab and grub files and reinstall and update grub and update initramfs also.


My fstab does not use UUIDs, so it shouldn't require editing.

I believe I need to reinstall initramfs and grub because of the UUID changes in both /etc/mdadm.conf and the grub config file.
However, if grub.cfg uses
Code:
root=(hd0,gpt3)

instead of
Code:
root='mduuid/WHATEVER'

shouldn't it work without changing anything?

I wonder if there's a way to set up mdadm.conf without UUIDs so I wouldn't have to rebuild initramfs either.
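For example, something along these lines in /etc/mdadm.conf (identifying the arrays by member devices instead of by UUID, which is of course fragile if the kernel ever renames the disks, which is exactly the concern above):

Code:
# scan all partitions for md superblocks
DEVICE partitions
# identify the arrays by their member devices rather than by UUID
ARRAY /dev/md/3 devices=/dev/sdb3,/dev/sdd3
ARRAY /dev/md/4 devices=/dev/sdb4,/dev/sdd4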

The goal here is to sync from one disk set to another within the same motherboard, and the backup system will only be booted when the first set is disconnected.
In other words, if the running system is on MOBO SATA ports 1, 2 and the backup disks on MOBO SATA ports 3, 4 I would disconnect ports 1, 2 and boot from ports 3, 4.

szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

PostPosted: Tue Jan 31, 2023 2:19 pm    Post subject: Reply with quote

If you're using raid1 AKA mirror, why not just extend it to sdc and sdd, and then unplug sda and sdb after the sync completes? I'd avoid mixing those 2 sets of drives afterwards, but you can clone it this way, since IDs won't change in the process. It also means you don't have to update any configs to keep the system bootable.

You didn't put EFI on RAID, so that one must be cloned manually, but it's NOT part of the running system, so rsync or dd are both fine.
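In mdadm terms that would be something like this for each array (a sketch; sda3 stands in for the matching partition on a spare disk):

Code:
mdadm /dev/md3 --add /dev/sda3
mdadm --grow /dev/md3 --raid-devices=3
# wait for the resync to finish (watch /proc/mdstat), then either unplug the extra
# member or detach it explicitly and shrink the array back to 2 devices:
mdadm /dev/md3 --fail /dev/sda3 --remove /dev/sda3
mdadm --grow /dev/md3 --raid-devices=2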

sabayonino
Veteran
Joined: 03 Jan 2012
Posts: 1014

PostPosted: Tue Jan 31, 2023 7:16 pm    Post subject: Reply with quote

Consider FSArchiver also

:roll:

Code:
app-backup/fsarchiver
     Available versions:  0.8.6-r1{tbz2} {debug lz4 lzma lzo static +zstd}
     Installed versions:  0.8.6-r1{tbz2}(05:49:17 10/12/2022)(lz4 lzma lzo zstd -debug -static)
     Homepage:            https://www.fsarchiver.org
     Description:         Flexible filesystem archiver for backup and deployment tool

_________________
LRS i586 on G.Drive
LRS x86-64 EFI on MEGA

Vieri
l33t
Joined: 18 Dec 2005
Posts: 882

PostPosted: Wed Feb 01, 2023 12:03 pm    Post subject: Reply with quote

szatox wrote:
If you're using raid1 AKA mirror, why not just extend it to sdc and sdd


...because I want to manually back up the production system only when necessary.
Let's say I want to run a major update on the production set (emerge system; emerge world; change kernels, etc.).
What I would then do is:

- clone the running system (it has to be running 24/7) to the backup disks,

- update my production system and test

- if anything goes haywire, disconnect production disks and boot the backup (or BIOS-boot the backup)

I know I could take system snapshots with LVM (just root in my case) and then restore them in case of trouble. However:

- I'm not 100% sure that the LVM snapshot recovery will work (who knows... maybe an LVM update on the system could break something)
- It might work for root in my case, but not for boot
- I don't want to mess with GRUB or other bootloaders

I want a backup system I'm 100% sure will boot just like my production system before its major update.
I also want it to be very fast. That's why I'm considering local hard disks (so I don't have to run a restore while the disaster is already happening).

I'm guessing that if I try to keep both RAID and LVM in the backup set I'm going to run into trouble.

Maybe my only safe bet is to keep my production system on mdadm RAID1 + LVM root, but set up just one "simple" backup disk that I would partition and format accordingly without RAID or LVM, simply rsync the filesystems onto it (using an LVM snapshot for root), then chroot into the backup and reinstall grub and the initramfs.

The downside is that once I boot my backup I'd be running a "degraded", non-RAIDed system, without even LVM to take snapshots.

szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

PostPosted: Wed Feb 01, 2023 12:42 pm    Post subject: Reply with quote

So, it's actually about backup and not about cloning. Clone means you want to make 2 instances out of 1. That kinda changes everything.

This is for a single machine, right?
Ok, so, 3 options:
* LVM snapshot -> archive contents to something else -> upgrade -> test -> remove snapshot
* split raid -> upgrade -> test -> repair raid (you can have more than 2 disks in a mirror to always cover hardware failure and you can enable write-intent bitmap to only merge changes)
* have a regular, scheduled, and tested backup which you should already have in place, and do snapshots when you know you're doing something dangerous.


In case of bigger setups, you might consider virtualization and snapshotting the VM at the hypervisor level. You're not going to break that one by updating the guest's OS.
All of those should allow for quick recovery should something go wrong, and if it goes really wrong, you can rebuild manually and restore the data.
I tend to keep file-level backups for those times; manual recovery of a single machine is OK. For a server farm I'd go for a 100% automated process, but that takes some dedication, and I don't screw myself up often enough to justify going through that pain just to protect personal machines.

Vieri
l33t
Joined: 18 Dec 2005
Posts: 882

PostPosted: Wed Feb 01, 2023 12:54 pm    Post subject: Reply with quote

sabayonino wrote:
Consider FSArchiver also


Yes, the backup script at the end of this page is nice:

https://www.system-rescue.org/lvm-guide-en/Making-consistent-backups-with-LVM/

However, it backs up to a file I then need to restore on my backup disks.
If the backup system had its own LVM root in raid1 mode and its boot partition in mdadm RAID1, how would I properly chroot/mount the whole backup system in order to restore the backup, make the adjustments, rebuild the initramfs and reinstall grub?

I mean, in the case of a single non-RAIDed drive with no LVM, it's as simple as mounting /dev/sd?{1-5}. For md or LVM devices I'd hope the backup arrays would show up in /dev. From my first post, I take it /dev/md/127 is the mdadm RAID1 set of the first partition of my backup disks.
I'd need to:

Code:
mount /dev/md/127 /mnt/target
mount /dev/md/1 /mnt/source
rsync OPTIONS /mnt/source/ /mnt/target/


Same for the rest except partition 2 which is not RAIDed.

The only quirk I see is that, for instance, the LVM root is on sdb's partition 5, but there is no LVM root on sda's partition 5.
I guess I need to lvcreate again on the backup disks in a chrooted environment.
But then again, how will it be displayed/accessed? I currently see on the running system:

Code:
/dev/vgroot/root
/dev/mapper/vgroot-root
/dev/dm-4


If the LVM root were properly created on the backup set of disks, would it appear somewhere in /dev on the production system?
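I imagine something like the following on the backup pair, using a different array node and VG name (md10 and vgbackup are just placeholders) so nothing clashes with the production vgroot:

Code:
# mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/sda3 /dev/sdc3
# pvcreate /dev/sda5 /dev/sdc5
# vgcreate vgbackup /dev/sda5 /dev/sdc5
# lvcreate -m 1 --type raid1 -l 50%VG -n root vgbackup

If that's correct, the new LV should then show up on the running system as /dev/vgbackup/root (i.e. /dev/mapper/vgbackup-root).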

Vieri
l33t
Joined: 18 Dec 2005
Posts: 882

PostPosted: Wed Feb 01, 2023 1:14 pm    Post subject: Reply with quote

szatox wrote:
So, it's actually about backup and not about cloning. Clone means you want make 2 instances out of 1.


I actually do want to make 2 out of 1, but only on-demand.
I wish I could use something like CloneZilla, which seems to support RAID and LVM, but I cannot take the system offline to do it.
It has to be from a running system.

szatox wrote:

This is for a single machine, right?


Yes, even though I intend to do the same for a total of 4 physical machines.

szatox wrote:

Ok, so, 3 options:
* LVM snapshot -> archive contents to something else -> upgrade -> test -> remove snapshot


That assumes that I should restore the snapshot to a backup partition/volume, right?
The other partitions that are not LVMed should simply be rsync'ed or dd'ed, right?
Still, the hard part is rewriting the RAID and/or LVM information (and grub + initramfs) on the backup system so it is properly bootable.

szatox wrote:

* split raid -> upgrade -> test -> repair raid (you can have more than 2 disks in a mirror to always cover hardware failure and you can enable write-intent bitmap to only merge changes)


Ok, that's an interesting approach which might require less work and headaches.
With "split raid" do you mean removing a drive from a RAID1 array? I don't know how to do that without unmounting the array.
Also, if you repair the RAID set I understand you would keep the new updated system data. However, how would you repair the RAID set so as to recover the older data instead?

szatox wrote:

* have a regular, scheduled, and tested backup which you should already have in place, and do snapshots when you know you're doing something dangerous.


Yes, the system is already backed up, but restoring takes too much time.

szatox wrote:

In case of a bigger setups, you might consider virtualization and snapshoting the VM at hypervisor level.


Yes, but I can't do that here.
Some systems, such as PBX gateways with telephony cards, can't be virtualized.

Thanks

sabayonino
Veteran
Joined: 03 Jan 2012
Posts: 1014

PostPosted: Wed Feb 01, 2023 7:48 pm    Post subject: Reply with quote

Vieri wrote:
sabayonino wrote:
Consider FSArchiver also


Yes, the backup script at the end of this page is nice:

https://www.system-rescue.org/lvm-guide-en/Making-consistent-backups-with-LVM/

However, it backs up to a file I then need to restore on my backup disks.
If the backup system had its own LVM root in raid1 mode and boot partition in mdadm RAID1, how would I properly chroot/mount the whole backup system in order to restore the backup, make the adjustments, rebuild the initramfs and reinstall grub?

I mean, in the case of a single non-RAIDed drive with no LVM, it's as simple as mounting /dev/sd?{1-5}. For md or LVM devices I'd hope the backups would show up in /dev. From my first post, I take it /dev/md/127 is the mdadm RAID1 set of my first partition of my backup disks.
I'd need to:



FSArchiver can work at the filesystem level and at the file level (the latter very similar to rsync, but onto a cleared device).

See the fsarchiver --help command:

Quote:
[...]
<commands>
* savefs: save filesystems to an archive file (backup a device to a file)
* restfs: restore filesystems from an archive (overwrites the existing data)
* savedir: save directories to the archive (similar to a compressed tarball)
* restdir: restore data from an archive which is not based on a filesystem


See the -A option for a running system... keep in mind that a backup of a running system may be in an inconsistent state during the backup/restore action.
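For example (the archive path and the destination device are just examples):

Code:
# save the mounted root filesystem to an archive (-A allows saving a mounted read-write filesystem)
fsarchiver savefs -A -j4 -z7 /mnt/backup/root.fsa /dev/vgroot/root
# restore it later onto the backup device
fsarchiver restfs /mnt/backup/root.fsa id=0,dest=/dev/vgbackup/root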

If you need to backup/restore at the file level, rsync is the solution.

Or you may need a professional solution like AMANDA backup or similar:
https://wiki.zmanda.com/index.php/Main_Page

Code:
app-backup/amanda
     Available versions:  3.5.1-r3 {curl gnuplot ipv6 kerberos minimal ndmp nls readline s3 samba systemd xfs}
     Homepage:            http://www.amanda.org/
     Description:         The Advanced Maryland Automatic Network Disk Archiver


If you have only 1 host, consider BTRFS and its snapshot system.
You can boot the system from a working snapshot.

Take a look: https://dataswamp.org/~solene/2023-01-04-boot-on-btrfs-snapshot.html
_________________
LRS i586 on G.Drive
LRS x86-64 EFI on MEGA

szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

PostPosted: Thu Feb 02, 2023 2:24 am    Post subject: Reply with quote

Vieri wrote:


szatox wrote:

Ok, so, 3 options:
* LVM snapshot -> archive contents to something else -> upgrade -> test -> remove snapshot


That assumes that I should restore the snapshot to a backup partition/volume, right?
The other partitions that are not LVMed should simply be rsync'ed or dd'ed, right?
Still, the hard part is rewriting the RAID and/or LVM information (and grub + initramfs) on the backup system so it is properly bootable.

Not really. If the test shows that the upgrade failed and you need to revert, you just merge the snapshot or, in the case of thin volumes, delete the original volume and rename the snapshot to whatever the original name was. And then reboot, obviously.
Snapshots of a fully provisioned volume come with a pretty big write performance penalty, but most systems can take that. Thin volumes unfortunately don't have overflow protection: if you run out of extents in the pool, you lose ALL data in that pool, including the other thin volumes. Don't use them on a production system unless you know what you're doing.

If you already have a tested backup, you can skip archiving the snapshot, though it might be more consistent than a file-level copy of a running system. Either way, that archive is just for disaster recovery; your primary recovery method here is a snapshot merge.
The only thing not on LVM should be the boot partition, and the EFI partition if you separated it from boot. Boot is not written to very often, so just recover it from your regular backup if you really need to. If that is too much effort, you should probably reconsider your backup solution too. Err... I mean: rsync is fine for that.
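Something as simple as this right before the upgrade would do (the destination directory is just an example):

Code:
mount /boot
rsync -aAXv --delete /boot/ /root/boot-backup/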
Quote:

szatox wrote:

* split raid -> upgrade -> test -> repair raid (you can have more than 2 disks in a mirror to always cover hardware failure and you can enable write-intent bitmap to only merge changes)


Ok, that's an interesting approach which might require less work and headaches.
With "split raid" do you mean removing a drive from a RAID1 array? I don't know how to do that without unmounting the array.
Also, if you repair the RAID set I understand you would keep the new updated system data. However, how would you repair the RAID set so as to recover the older data instead?

You split the raid by unplugging one or more disks.
Merge forward by reconnecting them and possibly re-adding them. This, combined with a write-intent bitmap, will only sync recent changes.
Merge backwards by unplugging the "new" drives, plugging the "old" drives back in, failing the absent devices, and then plugging the "new" ones back in and adding them as spares. Mdraid will immediately start resilvering them. You may need to destroy the superblocks before adding the "new" disks as spares.
Note: SATA connectors will hate you; they are not designed to be refitted on a daily basis.
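In mdadm terms, the "merge backwards" part looks roughly like this (a sketch for one array; sdb3 stands in for a returning "new" member):

Code:
mdadm /dev/md3 --fail detached --remove detached   # drop members that are no longer connected
mdadm --zero-superblock /dev/sdb3                  # wipe the stale superblock on the "new" disk
mdadm /dev/md3 --add /dev/sdb3                     # re-add it; the resync starts from the "old" data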
Quote:

szatox wrote:

* have a regular, scheduled, and tested backup which you should already have in place, and do snapshots when you know you're doing something dangerous.


Yes, the system is already backed up, but restoring takes too much time.

If you're THAT concerned about downtime, perhaps you should consider a highly-available setup instead of shaving a few extra seconds off the recovery time?
Have 2 independent workers with a floating IP address, for example?
One machine is always idle, so you can do maintenance there, and once it succeeds you can either switch immediately or do maintenance on the live machine and switch to your backup only if things go wrong. Either way, you can take all the time you need to patch things up.

Vieri
l33t
Joined: 18 Dec 2005
Posts: 882

PostPosted: Thu Feb 02, 2023 10:55 am    Post subject: Reply with quote

sabayonino wrote:
consider BTRFS and its snapshots system
You can boot the system from the working snapshots

take a look https://dataswamp.org/~solene/2023-01-04-boot-on-btrfs-snapshot.html


Very nice, but it seems BTRFS is not considered stable when used with RAID.
I'm not sure I want to take the single-disk route.

Thanks anyway.

Vieri
l33t
Joined: 18 Dec 2005
Posts: 882

PostPosted: Thu Feb 02, 2023 12:43 pm    Post subject: Reply with quote

szatox wrote:

If test shows that upgrade failed and you need to revert, you just merge snapshot or - in case of thin volumes - delete the original volume and rename snapshot to whatever the original name was. And then reboot, obviously.


OK, so taking an LVM snapshot of root, and reverting if something goes really wrong is fine.
Backing up the boot partition with rsync right before the dreaded system update is feasible (I could even back it up to a dir in the root partition so I don't have to depend on an external source).
In any case, I'm not sure I actually need to worry about the boot partition on a RAID1 set, because the only changes would be new kernel and initramfs image files. I guess reverting would simply be a matter of selecting the previous grub entry.

There is a remote possibility that even if the production system apparently updates fine, a reboot may lead to some kind of error (no matter which kernel/initramfs images I choose). That would mean I wouldn't be able to revert the LVM snapshot from the running system.

Would it be possible to boot from the Gentoo Live medium and somehow revert the snapshot from there? If so, how should that be done? Do I need to chroot into the "vgroot-root" and run the LVM utilities there?

szatox wrote:

split raid -> upgrade -> test -> repair raid
You split the raid by unplugging one or more disks.


OK, that's not an option for me.

szatox wrote:

If you're THAT concerned about downtime, perhaps you should consider a highly-available setup instead of shaving a few extra seconds of recovery time?
Have 2 independent workers with a floating IP address for example?


Well, an HA system would be fine, but in my particular scenario I have a PBX gateway with several telephony cards and I want them all in just one box (well, I don't have enough to duplicate the system). I have another box (the backup) ready to pick up work if the main one fails. However, I would need to move every single telephony card from box1 to box2. The extra time needed to do this is one thing, but I also do not like the idea of moving PCI cards around, with the risk of electrostatic woes.

I guess I'll try the LVM snapshot solution. It's probably the simplest.
I also remember a while back there was a package called "demerge" that's not in portage anymore.
I never really used it, but it sounded interesting because it allowed you to "record" a portage state, after which you could emerge system/world, and if something went wrong you could demerge everything back to the previously recorded state.
I wouldn't count on that anyway, especially if the update process required changing Python versions...
So LVM snapshots seem to be the way to go.

szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

PostPosted: Thu Feb 02, 2023 1:16 pm    Post subject: Reply with quote

Quote:
Would it be possible to boot from the Gentoo Live medium and somehow revert the snapshot from there? If so, how should that be done? Do I need to chroot into the "vgroot-root" and run the LVM utilities there?
You don't need to chroot; just trigger the merge with lvconvert from the liveCD.
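Roughly (assuming the snapshot is still called backup1, as in your earlier post, and the origin is not mounted from the live environment):

Code:
vgchange -ay vgroot
lvconvert --merge vgroot/backup1
vgchange -an vgroot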

Quote:
In any case, I'm not sure I actually need to worry about the boot partition on a RAID1 set because the only changes would be new kernel and initramfs image files. I guess reverting back would be simply selecting the previous grub entry.
Yes, usually this is all it takes.

Vieri
l33t
Joined: 18 Dec 2005
Posts: 882

PostPosted: Fri Feb 03, 2023 11:56 am    Post subject: Reply with quote

Hi again,

One final question.

I'm trying to add 2 spare disks to my RAID1 set (2 disks).

With mdadm I use the --add-spare option.

What about LVM?

Here's how I create the RAID1 array with 2 disks:

Code:
TARGET_DEVICE=/dev/sda
TARGET_DEVICE_2=/dev/sdb

pvcreate ${TARGET_DEVICE}5 ${TARGET_DEVICE_2}5
vgcreate vgroot ${TARGET_DEVICE}5 ${TARGET_DEVICE_2}5
lvcreate -m 1 --type raid1 -l 50%VG -n root vgroot


How would I add /dev/sdc5 and /dev/sdd5 as spares?

lvcreate doesn't seem to have an option for spare disks.

szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

PostPosted: Fri Feb 03, 2023 11:44 pm    Post subject: Reply with quote

While it is possible to use LVM for making RAID directly, I like LVM on top of mdraid.

Now, with LVM alone you define the RAID level per logical volume, so there is really no concept of spares. Unlike other RAIDs, LVM knows which extents are definitely not in use and can be mapped in case of a hardware failure, so spare disks don't really make much sense.
You just add disks to the volume group instead and define each logical volume's RAID level and number of mirrors or stripes. You can opt to stripe/mirror an LV across fewer disks than you added. I don't know exactly how it allocates space across multiple PVs; it kinda makes sense to spread stripes everywhere to boost performance. Also, it would leave some room for recovery should one disk fail. Mirror copies will go to different physical devices too; that kinda is the defining feature of a mirror.
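So instead of spares it would be something like this (just a sketch; the exact lvconvert options depend on your LVM version):

Code:
vgextend vgroot /dev/sdc5 /dev/sdd5
# raise the number of mirror copies so the new PVs carry an extra image:
lvconvert -m 2 vgroot/root /dev/sdc5 /dev/sdd5
# or, after an actual PV failure, rebuild the missing image onto the free PVs:
lvconvert --repair vgroot/root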