Gentoo Forums
Corrosion free storage with full redundancy - how?
ManDay
Apprentice
Joined: 22 Jan 2008
Posts: 246

Posted: Fri Aug 16, 2013 6:44 pm    Post subject: Corrosion free storage with full redundancy - how?

I want to store data n-fold fully redundant on n parts of disks (e.g. partitions) where those parts could reside either on the same disk, be distributed across disks or be individually separate disks.

Every time a datum is read (or written, for it should be read back for verifying it has been written properly), but at least after a certain time in which it has not been accessed, its integrity should automatically be ascertained by verifying a checksum (that checksum may also be redundantly stored or only once). If the datum is found corrupted, it should be restored from one of the other parts where the checksum agrees (unless the part where it's found corrupted has failed fatally, of course).

From what I've gathered, RAID (1) does the redundancy but does not maintain integrity (by means of checksum, even). BTRFS, ZFS and XFS do have the checksumming but they appear to not natively incorporate redundancy.

Could anyone propose a solution which provides all these features and is as deeply rooted in the kernel (or in a userspace FS) as possible, so that it happens as transparently to the user as possible?
ryao
Retired Dev
Joined: 27 Feb 2012
Posts: 132

Posted: Sat Aug 31, 2013 8:57 pm    Post subject: Re: Corrosion free storage with full redundancy - how?

ManDay wrote:
I want to store data n-fold fully redundant on n parts of disks (e.g. partitions) where those parts could reside either on the same disk, be distributed across disks or be individually separate disks.

Every time a datum is read (or written, for it should be read back for verifying it has been written properly), but at least after a certain time in which it has not been accessed, its integrity should automatically be ascertained by verifying a checksum (that checksum may also be redundantly stored or only once). If the datum is found corrupted, it should be restored from one of the other parts where the checksum agrees (unless the part where it's found corrupted has failed fatally, of course).

From what I've gathered, RAID (1) does the redundancy but does not maintain integrity (by means of checksum, even). BTRFS, ZFS and XFS do have the checksumming but they appear to not natively incorporate redundancy.

Could anyone propose a solution which provides all these features and is as deeply rooted in the kernel (or in a userspace FS) as possible, so that it happens as transparently to the user as possible?


I have no idea what you mean by "BTRFS, ZFS and XFS do have the checksumming but they appear to not natively incorporate redundancy". ZFS is known for doing what you described. ZFS' internal structure is a Merkle tree, so each block's checksum is stored in the block that points to it rather than next to the data it protects. By default, metadata is written twice and data is written once. This can be increased by adjusting the copies property, which defaults to 1. Setting it to 2 makes future writes store three copies of metadata and two copies of data. Setting it to 3 makes future writes store three copies of metadata and three copies of data. These copies are in addition to the parity blocks for raidz.
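To make the idea concrete, here is a minimal toy model of a checksummed block with redundant copies and a self-healing read. It is only a sketch of the principle, not how ZFS is implemented: the class name MirroredBlock, the in-memory copies and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Return a 256-bit checksum of the data (SHA-256, chosen for illustration)."""
    return hashlib.sha256(data).digest()


class MirroredBlock:
    """Hypothetical toy model: several copies of one block, plus a checksum
    that is kept apart from the data, as a parent pointer would keep it."""

    def __init__(self, data: bytes, copies: int = 2):
        if copies < 1:
            raise ValueError("at least one copy is required")
        self.copies = [bytearray(data) for _ in range(copies)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        """Return the data from the first copy that verifies, repairing bad copies."""
        good = None
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                break
        if good is None:
            raise IOError("all copies failed checksum verification")
        # Self-heal: overwrite every copy whose checksum does not match.
        for index, copy in enumerate(self.copies):
            if checksum(bytes(copy)) != self.expected:
                self.copies[index] = bytearray(good)
        return good


block = MirroredBlock(b"important data", copies=2)
block.copies[0][0] ^= 0xFF  # simulate silent corruption of the first copy
print(block.read())         # the second copy verifies and is returned
```

After the read, the damaged first copy has been rewritten from the good one. With a single copy there is nothing to repair from, which is why the copies setting (or a mirror or raidz layout) matters.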

There are many possible pool configurations, but a simple one can be made by running zpool create tank raidz2 /path/to/device{0..5}. That gives you a six-device pool in which you can lose any two disks and ZFS will still be able to reconstruct them. You can also corrupt any two blocks in the same stripe and ZFS will be able to reconstruct them upon a checksum failure. Scrubs can be run periodically to verify checksums, for example by running zpool scrub tank from a cron job.
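As a rough sketch, the two commands above can be assembled like this. The helper names raidz2_create_command and scrub_command are made up for illustration, the six devices device0 through device5 are an assumption about the intended layout, and nothing is executed: the script only prints the command lines so they can be reviewed before being run as root.

```python
import shlex

PARITY_DEVICES = 2  # raidz2 spends two devices' worth of space on parity


def raidz2_create_command(pool, devices):
    """Build (but do not run) a 'zpool create' command line for a raidz2 vdev."""
    if len(devices) <= PARITY_DEVICES:
        raise ValueError("raidz2 needs more than two devices")
    return shlex.join(["zpool", "create", pool, "raidz2", *devices])


def scrub_command(pool):
    """Build (but do not run) the command that starts a scrub of the pool."""
    return shlex.join(["zpool", "scrub", pool])


# Assumed layout: six devices, device0 through device5.
devices = [f"/path/to/device{i}" for i in range(6)]
print(raidz2_create_command("tank", devices))
print(scrub_command("tank"))
```

With six devices, four devices' worth of space holds data and two hold parity, which is what allows any two disks to fail. The scrub command is the one to schedule from cron.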

By the way, XFS does not do checksums (although a future redesign will incorporate them for metadata). btrfs uses 32-bit CRC32C checksums regardless of platform, while ZFS uses 256-bit checksums on all platforms. btrfs also has ditto blocks, but they are off by default. It also recently integrated RAID 5/6. Its RAID 5/6 uses a fixed stripe size, which is inferior to RAID-Z's variable stripe size.
All times are GMT