Backup/RAID options?

Illiander
Apprentice
Posts: 258
Joined: Tue Feb 22, 2011 2:11 pm

Post by Illiander » Sat Sep 02, 2023 12:51 am

I just had a hdd-failure scare, so I'm here to ask for advice on how to do data backups/redundancy.

What is the current best way to turn one hdd of irreplaceable data into two interchangeable hdds of that data, and have that state be maintained as that data changes without me having to take explicit action?

This is for my home drive (I have a dedicated hdd with one big partition that's mounted to /home).

I tend to rebuild the rest of the system when I move motherboard/CPU, since that is simpler when foundational things change, and then move my home drive over and fix its user settings.

What I'd like to be able to do is take out either hdd and plug it into a new computer and have it just work, as well as being able to swap in a new hdd and have it rebuild the second copy.

My knowledge of backup tools is over 10 years old and not on Linux.

My knowledge of RAID 1 is telling me that I couldn't move to a larger hdd later if needed, but otherwise is logically what I want (but probably without actually using RAID protocols since this is my day-to-day desktop).

Do I have any good options?

eccerr0r
Watchman
Posts: 10239
Joined: Thu Jul 01, 2004 6:51 pm
Location: almost Mile High in the USA

Post by eccerr0r » Sat Sep 02, 2023 3:18 am

Need to step back a moment and discuss the difference between backup and RAID.

RAID is not backup.

RAID is used to emulate one disk with several disks. It can provide tolerance from downtime in case a disk goes down but does not provide backup. Remember if you delete something on a RAID, all copies are instantly gone, you can't go back.

A true backup will allow you to go back in time and retrieve data. There is some argument that versioned filesystems have some sort of backup built in, but this still does not handle the case where you have a disk failure.

Your tolerance to the "time hole" between when new content is created and backups are taken is the key point. Remember that a deletion is "new content" which will affect your decision.

There are many options for backup - simply rsyncing a disk to another at night is a simple method, but may not protect you from people sneaking in a trojan that encrypts all your files.

Using an online backup system that copies to the web may make sense as they can also version, but can be costly. If your data is irreplaceable you may be willing to pay.

There are a lot of options in between depending on your tolerance for loss, cost, and timeliness...

(As an aside, Linux MDRAID and, I believe, other RAID implementations (DMRAID, btrfs RAID, etc.) do allow for hot filesystem growth.)
Intel Core i7 2700K/Radeon Firepro W2100/24GB DDR3/800GB SSD
What am I supposed watching?

jpsollie
Guru
Posts: 327
Joined: Sat Aug 17, 2013 3:40 pm

Post by jpsollie » Sat Sep 02, 2023 4:44 am

eccerr0r wrote: (As an aside, Linux MDRAID and, I believe, other RAID implementations (DMRAID, btrfs RAID, etc.) do allow for hot filesystem growth.)
I agree, maybe it's best to choose one of these options:
1. Use an advanced filesystem like BTRFS / ZFS, which can mirror copies across multiple disks. When you plug the disk into the other PC, it will still work (but complain that it found only one of the disks).
2. Create an mdraid / dmraid array, which can be grown when you need it. You'll need to "recover" the damaged array when moving the disk to another machine.
3. Use rsync. IMO the best option. Create a cron script which runs rsync frequently, and take out the mirror when you need it.
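
A minimal sketch of option 3, assuming the second disk is mounted at /mnt/home-mirror (a hypothetical mount point, not something from this thread). It only builds and prints the rsync command line, so nothing is copied until you run the printed command yourself or hand the list to `subprocess.run`. Note that `--delete` makes the copy an exact mirror, so deletions propagate too: this gives redundancy, not history.

```python
import shlex


def mirror_command(src, dest, delete=True):
    """Build an rsync argv that mirrors the contents of src into dest."""
    cmd = ["rsync", "-a"]          # -a: archive mode (recursive, keeps perms/times/links)
    if delete:
        cmd.append("--delete")     # remove files from dest that no longer exist in src
    # A trailing slash on the source means "the contents of src", not src itself.
    cmd += [src.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


# /mnt/home-mirror is an assumed mount point for the second disk.
print(shlex.join(mirror_command("/home", "/mnt/home-mirror")))
# prints: rsync -a --delete /home/ /mnt/home-mirror/
```

A crontab entry such as `0 3 * * * /usr/local/bin/mirror_home.py` (path and schedule are placeholders) would then run it nightly.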
The power of Gentoo optimization (not overclocked): Image

Post by Illiander » Sat Sep 02, 2023 9:42 am

eccerr0r wrote:A true backup will allow you to go back in time and retrieve data.
This is a thing I'm not concerned with.

The key thing I'm after is redundancy in case of drive failure.

A secondary concern is ease of getting back on my feet after a drive failure.

This is an 8TB drive, so anything using the internet is a non-starter.

---

Guess I'll be looking into rsync.

pietinger
Administrator
Posts: 6630
Joined: Tue Oct 17, 2006 5:11 pm
Location: Bavaria

Post by pietinger » Sat Sep 02, 2023 9:48 am

Illiander wrote:This is an 8TB drive, so anything using the internet is a non-starter.
Simple Solution: Buy another 8 TB drive and do a RAID1
Better: Buy 2 more (= the minimum for RAID5 together with your existing drive; or 3 or 4) 8 TB drives and do a RAID5
see: https://en.wikipedia.org/wiki/Standard_RAID_levels

NeddySeagoon
Administrator
Posts: 56094
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Sat Sep 02, 2023 2:44 pm

Illiander,

8TB drives are pushing the boundary of what mdadm has a good probability of recovering from.
I have 4 8TB drives in raid5 and only found out about the above after I had already done it.

Raid5 requires N-1 of N drives to work. With a failed drive, the redundancy is gone.
To recalculate the redundant data for a new drive requires that the N-1 remaining drives have no read errors while that happens.
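
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes an unrecoverable read error (URE) rate of 1 per 10^14 bits read, which is a commonly quoted datasheet figure and not something measured on these drives, and it treats errors as independent. Real drives often do better than the datasheet, so read the result as a pessimistic illustration only.

```python
import math


def p_clean_rebuild(surviving_drives, drive_bytes, ure_per_bit):
    """Probability of reading every surviving drive in full without a single URE.

    Assumes independent errors at a constant per-bit rate (a simplification).
    """
    bits_read = surviving_drives * drive_bytes * 8
    # (1 - p) ** bits, computed via logs so the tiny p does not get lost in rounding
    return math.exp(bits_read * math.log1p(-ure_per_bit))


# 4-drive RAID5 of 8 TB disks: 3 survivors must be read end to end.
# 1e-14 is the assumed datasheet URE rate, not a value from this thread.
print(f"{p_clean_rebuild(3, 8e12, 1e-14):.2f}")  # roughly 0.15
```

With an assumed rate ten times better (1e-15) the same call gives a little over 0.8, which shows how sensitive the estimate is to that one number.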

Raid is not a substitute for backups. You may need to restore your raid set from backup if recalculating the redundant data fails.

See my signature before it's too late.
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Post by eccerr0r » Sat Sep 02, 2023 4:02 pm

How fast are those 8TB disks in sequential reads/writes?

I haven't gotten new disks in a while. I'm still using 2T disks, as my working set (à la the poll I did some time ago...) is not very large and I don't shuffle videos at all... and these manage about 120MB/sec - 180MB/sec sequential reads, which are likely dwarfed by newer disks...

And yes, these sequential write speeds are what kill RAID rebuilds, especially on those SMRs, and one may have to resort to rsync and some downtime to swap disks if one fails...

Post by NeddySeagoon » Sat Sep 02, 2023 4:33 pm

eccerr0r,

SMR in raid ... As soon as a drive takes time out to reshingle, it will get kicked out of the array. They are really video surveillance drives.

My drives are Toshiba 8TB N300, carefully chosen to be CMR. They are about 180MB/sec at the outside and 35MB/sec near the spindle.
That's sequential reads. Writes should be about the same.

Post by eccerr0r » Sat Sep 02, 2023 4:40 pm

I'm surprised it hasn't gone past 180MB/sec on the outside, at least most of the data on the disk is stored on the outside anyway... Though it gets slower along the center tracks there's not as much data there.

However this still means a single 8T disk will take 4x as long to rebuild compared to a 2T disk because the read speeds are the same. It already takes way too long for my 2T disks to scrub (about 5 hours) which is probably around the same speed as a rebuild -- meaning an 8T disk will take a full day. Ouch!
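
The arithmetic behind that estimate, as a small sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant transfer rate over the whole disk, which real disks do not have, so treat the figures as rough bounds.

```python
def hours_to_read(capacity_bytes, mb_per_sec):
    """Hours needed to read capacity_bytes once at a constant rate in MB/s."""
    return capacity_bytes / (mb_per_sec * 1e6) / 3600


# Speeds quoted in this thread: 180 and 120 MB/s on the outer tracks, 35 MB/s near the spindle.
for speed in (180, 120, 35):
    print(f"8 TB at {speed} MB/s: {hours_to_read(8e12, speed):.1f} h")
# 8 TB at 180 MB/s: 12.3 h
# 8 TB at 120 MB/s: 18.5 h
# 8 TB at 35 MB/s: 63.5 h

# A 2 TB scrub in about 5 hours implies an average of roughly 111 MB/s ...
implied = 2e12 / (5 * 3600) / 1e6
# ... and at that average an 8 TB disk takes 4x as long, about 20 hours.
print(f"{implied:.0f} MB/s average, 8 TB in {hours_to_read(8e12, implied):.0f} h")
```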

Post by NeddySeagoon » Sat Sep 02, 2023 4:58 pm

eccerr0r

Code:

# smartctl -x /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.5.0-gentoo] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba N300/MN NAS HDD
Device Model:     TOSHIBA HDWG480
...
Firmware Version: 0601
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
...
Extended self-test routine
recommended polling time:        ( 690) minutes.

Running a check takes about 12h. The centre 50% of my drives (the slow bit) is my media collection, so speed is not a concern there.
The check brings the system to its knees. I can stop that but I usually run it overnight.

Post by eccerr0r » Sun Sep 03, 2023 9:54 pm

Yes, it's kind of weird trying to make sure a raid check/rebuild doesn't interfere with regular use. The problem is that each read transaction to the disk takes so many milliseconds, and those milliseconds are then taken out of the disk's bandwidth.

I'd be surprised if disk schedulers don't base fairness on time per transaction instead of just bytes transferred, so that large, multiple, or random transactions are more expensive than few, small, sequential transfers... Then again, with different programs putting the heads where they want them, who knows if your transaction will be "random" or not...

But then nobody cares about this anymore as they expect everyone to be using SSDs now :(

*Sigh*

Spanik
Veteran
Posts: 1170
Joined: Fri Dec 12, 2003 9:10 pm
Location: Belgium

Post by Spanik » Mon Sep 04, 2023 9:19 pm

I'm using a raid 5 (4x 2TB) as my data drive. It isn't backup but it buys me time to get my backup up-to-date and verified when something happens. Backup is local offline USB and that is in a rotating set of three where one of them is offsite. One day I will get the nas working. The raid sits on an Areca raid controller that started as a pci-133 card years ago, and for this motherboard I swapped it for a pci-e version of the same card. It just recognised the disks and got on with it. This must be at least the third motherboard it sits on.

I don't see a point in being able to grow your partition, certainly not if you keep your data on a single partition that spans a whole drive or array. The day that is too small, it is very likely that your drives are years old and can do with a replacement. That is what I do: put in a new larger array and copy the data over, then just remove the old array to the dump. First this setup was 4x 160GB, then 4x 640GB; both were replaced when they got too small. Next came 4x 1.5TB and, when one drive of that array failed, 4x 2TB. When one drive of an array fails, the others have had it just as hard. They have just as many hours, done as many startup cycles, survived just the same temperatures and read/write cycles. So they are just as far gone as the weakest of the lot that failed. Time to replace them.
Expert in non-working solutions

szatox
Advocate
Posts: 3858
Joined: Tue Aug 27, 2013 12:35 pm

Post by szatox » Tue Sep 05, 2023 12:10 am

Spanik wrote: It isn't backup but it buys me time to get my backup up-to-date and verified when something happens
Not really. You don't need backups to deal with issues raid already has covered.
The purpose of backup is keeping your data safe from your mistakes as well as hardware failures.
The purpose of raid is keeping your service safe from extended downtime caused by hardware failure.

There is some overlap between those two, but backup provides you a way to go back in time, while raid smooths out your ride to the future.

Illiander wrote: What is the current best way to turn one hdd of irreplaceable data into two interchangeable hdds of that data, and have that state be maintained as that data changes without me having to take explicit action?
There is no "current best way", there is only "an acceptable way for your use case and cost".
I personally like rsync, running on a daily schedule, triggered by cron.
Rsync is able to use a reference target in addition to the target location, and use hard links to save space on files that haven't been modified (as long as they are within a single filesystem that supports hard links, which is probably anything except FAT). It's an easy and cheap way to keep 30 past versions. You have to make sure not to modify files in place, or all old versions will get corrupted. An fs with support for copy on write might be a good solution to this.
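
The "reference target" is rsync's `--link-dest` option. A minimal sketch of a dated-snapshot command follows, assuming snapshots live in per-day directories under /mnt/backup (a hypothetical layout chosen for illustration). It only builds and prints the command; pruning down to 30 versions is left out.

```python
import shlex
from datetime import date


def snapshot_command(src, backup_root, today, previous=None):
    """Build an rsync argv that writes a dated snapshot of src under backup_root.

    If a previous snapshot date is given, unchanged files are hard-linked
    to that snapshot instead of being copied again.
    """
    root = backup_root.rstrip("/")
    cmd = ["rsync", "-a"]
    if previous is not None:
        # The reference target: files identical to this copy become hard links.
        cmd.append(f"--link-dest={root}/{previous.isoformat()}/")
    cmd += [src.rstrip("/") + "/", f"{root}/{today.isoformat()}/"]
    return cmd


# /mnt/backup and the two dates are illustrative assumptions.
print(shlex.join(snapshot_command("/home", "/mnt/backup",
                                  date(2023, 9, 5), date(2023, 9, 4))))
# prints: rsync -a --link-dest=/mnt/backup/2023-09-04/ /home/ /mnt/backup/2023-09-05/
```

By default rsync writes a changed file to a new temporary file and renames it into place, which is what keeps the older hard-linked versions intact; adding `--inplace` would defeat that.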

The common wisdom is to have 2 independent backup copies, but I recover data rarely enough that 1 independent backup copy is IMO good enough.
The real trick is that you must know how to restore your system and data from that backup, and do it in a reasonable time. Restoring a file copy is similar to installing a new Gentoo.

Post by Spanik » Wed Sep 06, 2023 6:59 pm

Just a silly data point: I just did an rsync to update one of my off-line USB backup drives. And I found out it has everything on it EXCEPT my data. So yes, it has email and /etc and my virtual machines. But absolutely none of my essential data.

So yes, do check if your backup is usable.... :( :oops:
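
One cheap way to catch that kind of gap is to list everything that exists in the source but not in the backup. The sketch below is an illustration, not what I actually run: it only checks that each file exists on the backup side and does not compare contents. The demo uses temporary directories so it is safe to run as-is.

```python
import os
import tempfile


def missing_from_backup(source, backup):
    """Return sorted relative paths of files under source that are absent under backup."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(source):
        rel_dir = os.path.relpath(dirpath, source)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            if not os.path.lexists(os.path.join(backup, rel)):
                missing.append(rel)
    return sorted(missing)


# Demo on throwaway directories: the "backup" has the mail but not the data.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("mail.txt", "data.txt"):
        open(os.path.join(src, name), "w").close()
    open(os.path.join(dst, "mail.txt"), "w").close()
    print(missing_from_backup(src, dst))  # ['data.txt']
```

For a real check, point it at the live directory and the mounted backup; running rsync with `--dry-run` against the backup is another way to see what it would still copy.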

Post by Illiander » Thu Sep 21, 2023 11:27 pm

NeddySeagoon wrote: Raid is not a substitute for backups. You may need to restore your raid set from backup if recalculating the redundant data fails.

See my signature before it's too late.
Not even RAID 1? Guess I'm sticking with rsync on cron then.

And your signature is exactly why I'm trying to figure out how to do backups on my home drive.

Post by pietinger » Fri Sep 22, 2023 12:17 am

Illiander wrote:
NeddySeagoon wrote: Raid is not a substitute for backups. You may need to restore your raid set from backup if recalculating the redundant data fails.

See my signature before it's too late.
Not even RAID 1?
Yes, not even a RAID 1 is a substitute for a backup ... because:

1) What do you do if you have accidentally deleted an important file? (Yes, it is gone on both hard disks.) Or you have run the wrong command ... formatting your USB stick ... oops ... it was not the stick ... ;-)
2) What happens if your system is compromised and all files are encrypted? (Yes, you have to pay to get them decrypted.)
3) What happens if you have a fire or high water? (Yes, both hard disks are on fire or in the water :lol: ) (*)
4) What happens if your system gets a short 5,000 volt energy shock because of a lightning strike?
...

(* This is also the reason why your backup must be in another location than your system)

AJM
Apprentice
Posts: 195
Joined: Wed Sep 25, 2002 7:46 pm
Location: Aberdeen, Scotland

Post by AJM » Fri Sep 22, 2023 8:20 am

pietinger wrote:3) What happens if you have a fire or high water ? (yes, both harddisk are on fire or in the water :lol: )
I have a HDD and SSD here which suffered both (the water was from five fire engines extinguishing the fire - and me with a garden hose before they arrived.) Incredibly, both drives still work perfectly although they're blackened and smelly!

Then again, they're sitting next to a mountain of dozens of faulty hard disks waiting for dismantling and disposal, none of which suffered any mistreatment - a constant reminder to me to check my backups. Personally I use rsnapshot (so rsync) to an external USB drive on site and restic to a server I keep twenty miles away.

Post by Illiander » Fri Sep 22, 2023 10:27 am

pietinger wrote: 3) What happens if you have a fire or high water? (Yes, both hard disks are on fire or in the water :lol: ) (*)
4) What happens if your system gets a short 5,000 volt energy shock because of a lightning strike?
...

(* This is also the reason why your backup must be in another location than your system)
Off-site backups are a massive pain when your best data transfer method is sneakernet.

Post by pietinger » Fri Sep 22, 2023 10:55 am

Illiander wrote:Off-site backups are a massive pain when your best data transfer method is sneakernet.
Yes ... and TBH, I keep my usb backup disks in a fireproof safe in another room ... but of course I can't recommend that ;-)