Gentoo Forums
[SOLVED] Best way to join 2 disks into one
Gentoo Forums Forum Index » Kernel & Hardware
simonbcn
n00b


Joined: 01 Aug 2011
Posts: 69
Location: Denmark

PostPosted: Wed Dec 28, 2022 8:22 am    Post subject: [SOLVED] Best way to join 2 disks into one

I have two HDDs formatted with XFS (16 TB and 14 TB) containing, almost exclusively, large high-resolution video files: 1080p and 4K, with high bitrates. I want to join them. One of the disks can be reformatted without problems, but I would prefer not to have to reformat the other.
RAID0 is ruled out because I understand the disks would have to be the same size.
Searching around, I found there are many options: Btrfs, LVM, ZFS, mdadm, AUFS, UnionFS, mergerfs. But I'm not sure which is the best choice in terms of performance.
ZFS has the disadvantage that it is not integrated into the kernel. I understand that Btrfs is not particularly suitable for disks that will contain large files. LVM is the most common option, but I am not sure it is optimal.
Can I use XFS itself to extend a filesystem across two disks? I have searched and it seems not.
Could someone with experience in these matters give me a hand?


Last edited by simonbcn on Sun Jan 01, 2023 9:04 pm; edited 1 time in total
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54261
Location: 56N 3W

PostPosted: Wed Dec 28, 2022 11:14 am

simonbcn,

raid0 has two modes: striped, in which speed is increased because accesses to the drives are parallelised to some degree, and linear, which sticks random-sized drives end to end so they appear as one big drive. Both have the downside that if one member of the raid set fails, all the data is lost.

LVM uses the same kernel code for raid0 as mdadm; the management code is different. I use LVM on top of mdadm raid here. mdadm has been around longer, so LVM raid was not an option at the time.

In place movement to raid is not possible as all the members of the raid set need to have the raid metadata added.
LVM requires metadata to be added too.

Changing filesystems will be data destructive anyway.

What benefit do you expect to gain by being able to access the data on these drives as if they were one?

How will you do backups?
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
simonbcn
n00b


Joined: 01 Aug 2011
Posts: 69
Location: Denmark

PostPosted: Wed Dec 28, 2022 11:30 am

I don't need to back up those files. I'm not backing them up now as individual disks, so why would I do it if I put them together? They do not contain essential data.
If I "join" the disks, all files will be distributed across the available space on both disks, without me having to worry about one of them filling up while the other is half empty.
Isn't there any method available to join both disks while preserving the content of at least one of them?
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54261
Location: 56N 3W

PostPosted: Wed Dec 28, 2022 12:40 pm

simonbcn,

If you use mdadm raid metadata version <=1.0, the metadata goes at the end of the volume.
A piece of your existing XFS filesystem is there now.
From my limited knowledge of XFS, it can be grown but not shrunk.
The good news is that the filesystem start does not move.

Later metadata versions put the metadata at the start of the volume, which destroys the filesystem entirely.

LVM puts the metadata at the beginning of the volume too.

You probably want to practice the following on a couple of USB sticks, or sparse loop devices before you do it for real.

Move as much data onto the small drive as you can. This will be preserved if all goes well. If you can preserve more elsewhere, that works too.
This next step is destructive.
Donate the partition on the large drive to an mdadm linear raid0 set, so you have a linear raid0 with only one spindle/drive. This gets the metadata in the right place.
mdadm may give you an odd look, but it's valid. You may need to coax it.
Make a filesystem on the new raid set. It will be /dev/md<something>.

Copy the data from the small drive over to the raid set. Now you have preserved as much data as possible, and it's on the one-drive raid0 and the small drive.
Check that works. If not, this is the last chance to make the raid and copy over preserved data.

The next step destroys the preserved data.
Grow the raid0 by adding the partition from the small drive to it. As long as you made the raid0 in linear mode, this works.
Grow the filesystem on the raid0 to fill the available space.

Now you have both drives in the raid0 with lots of empty space and some data preserved.
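
Roughly, the commands might look like this (an untested sketch; /dev/sdb1 stands for the large drive's partition, /dev/sdc1 for the small one's, and the mount points are made up):
Code:
  # Sketch only -- device names and mount points are placeholders, practice first!
  # Linear raid0 with a single member; mdadm wants --force for a 1-device array.
  # --metadata=1.0 keeps the metadata at the end of the member, as discussed above.
  mdadm --create /dev/md0 --level=linear --raid-devices=1 --force --metadata=1.0 /dev/sdb1

  # Fresh filesystem on the raid set, then copy the preserved data onto it.
  mkfs.xfs /dev/md0
  mount /dev/md0 /mnt/joined
  cp -a /mnt/small/. /mnt/joined/

  # DESTRUCTIVE: absorb the small drive, then grow the filesystem.
  mdadm --grow /dev/md0 --add /dev/sdc1
  xfs_growfs /mnt/joined    # XFS grows while mounted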



The kernel will show the individual drives as well as the raid set. The individual drives are no longer yours. You must only use the raid.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
sublogic
Apprentice


Joined: 21 Mar 2022
Posts: 222
Location: Pennsylvania, USA

PostPosted: Thu Dec 29, 2022 2:06 am

simonbcn wrote:
... Isn't there any method available to join both disks while preserving the content of at least one of them?

Bare bones device mapper? From Documentation/admin-guide/device-mapper/linear.rst in the Linux source tree,
Code:
  #!/bin/sh
  # Join 2 devices together
  size1=`blockdev --getsz $1`
  size2=`blockdev --getsz $2`
  echo "0 $size1 linear $1 0
  $size1 $size2 linear $2 0" | dmsetup create joined

Unmount the filesystems before you pull that stunt. I'm not sure what happens next; I never tried it. I think your combined disks show up as /dev/mapper/dm-n for some value of n. If it works, the filesystem on $1 is still okay and shows up when you mount /dev/mapper/dm-n. Mount it read-only, just in case. If the content is intact, unmount the device and expand the filesystem with xfs_growfs, at the expense of everything on $2. If not, sorry for your loss...

EDIT: it's /dev/dm-n not /dev/mapper/dm-n .

simonbcn wrote:
I don't need to back up those files. I'm not backing them up now as individual disks, so why would I do it if I put them together?
Because you're about to incinerate all the data on them? You should back up before you try to glue them.

Also you should practice --a lot-- on throwaway loop devices. (Read the losetup man page and the example therein to get started.)
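
For instance, something along these lines (an untested sketch; join.sh stands for the script quoted above saved to a file, and the paths are made up):
Code:
  # Two sparse 1G files become throwaway block devices.
  truncate -s 1G disk1.img disk2.img
  dev1=$(losetup --find --show disk1.img)
  dev2=$(losetup --find --show disk2.img)

  # Filesystem and a canary file on the first "disk".
  mkfs.xfs "$dev1"
  mount "$dev1" /mnt/test
  echo hello > /mnt/test/canary
  umount /mnt/test

  # Glue the two devices together, then check the data survived.
  sh join.sh "$dev1" "$dev2"
  mount -o ro /dev/mapper/joined /mnt/test
  cat /mnt/test/canary    # should still print "hello"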


Last edited by sublogic on Thu Dec 29, 2022 3:31 pm; edited 1 time in total
gentoo_ram
Guru


Joined: 25 Oct 2007
Posts: 474
Location: San Diego, California USA

PostPosted: Thu Dec 29, 2022 5:51 am

I've worked quite a bit with device mapper, and I agree the linear method as shown should work, assuming you don't have other metadata (LVM, MD, etc.) at the end of your partition. The only other possible complication I can think of would be the filesystem's internal block size not dividing the partition size evenly, but hopefully xfs_growfs will take care of that. I can't think of why not.

What's weird, from memory, is that ext2/3/4 can only be grown when not mounted, while I think xfs_growfs can only grow a *mounted* XFS filesystem! At least that's how it behaved before.

Definitely test on some scratch disks first. And keep in mind that device mapper configurations are completely ephemeral: they need to be re-created on every boot, so you'll have to add the appropriate commands and parameters to a startup script somewhere.

Also, you should get a /dev/mapper/joined node in the previous example that links to the appropriate dm-n device.
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54261
Location: 56N 3W

PostPosted: Thu Dec 29, 2022 8:55 am

sublogic,

The LVM metadata will trample on one filesystem, unless the intent is to do the join at every startup.
How do you then merge filesystems?
I suppose shrink one and grow the other (in that order). That requires access to the filesystem that is towards the end of the join.

gentoo_ram

extX supports online growing but not online shrinking.

In the days when I looked at XFS, the advice was that it's fast because it uses all the RAM it can get for cache, and that, because of all that cache, you shouldn't even think about XFS without a UPS.
As a result, I've not used it, as I've never invested in a UPS.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
sublogic
Apprentice


Joined: 21 Mar 2022
Posts: 222
Location: Pennsylvania, USA

PostPosted: Thu Dec 29, 2022 4:22 pm

NeddySeagoon wrote:
sublogic,

The LVM metadata will trample on one filesystem, unless the intent is to do the join at every startup.
What LVM metadata? The OP says he has two disk drives with XFS, no mention of LVM anywhere. And yes, the OP would have to carefully script the devmapper join at boot, in /etc/local.d for example. And delete the original entries from his /etc/fstab. And whatever else is necessary to avoid destruction.

If the original files are on single partitions that span their respective disks, the joining script can pick the block devices to join from /dev/disk/by-partuuid . If they are on full disks without partition tables, use the entries in /dev/disk/by-id that go by model and serial number. (Joining the block devices in the wrong order would be a bad thing.)
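
A boot script along those lines might look like this (an untested sketch; the PARTUUIDs and the mount point are placeholders):
Code:
  #!/bin/sh
  # /etc/local.d/joined.start -- re-create the mapping at every boot.
  dev1=/dev/disk/by-partuuid/PLACEHOLDER-FIRST
  dev2=/dev/disk/by-partuuid/PLACEHOLDER-SECOND
  size1=$(blockdev --getsz "$dev1")
  size2=$(blockdev --getsz "$dev2")
  # dmsetup reads the table from stdin; the join order matters.
  dmsetup create joined <<EOF
  0 $size1 linear $dev1 0
  $size1 $size2 linear $dev2 0
  EOF
  mount /dev/mapper/joined /mnt/media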

NeddySeagoon wrote:
How do you then merge filesystems?
Well, I assume that the /dev/dm-n looks like the first filesystem followed by the second, sacrificial, filesystem, and that the sacrificial filesystem counts as free space usable to expand the first (the OP was open to reformatting one of the drives). The OP should definitely practice with toy filesystems on loop devices before taking the dive.
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54261
Location: 56N 3W

PostPosted: Thu Dec 29, 2022 5:29 pm

sublogic,

I follow now. Thank you.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
szatox
Advocate


Joined: 27 Aug 2013
Posts: 3140

PostPosted: Thu Dec 29, 2022 8:02 pm

I vaguely recall that the OP has 2 disks and doesn't mind wiping one of them.

So, why not just use LVM: create a volume group on that disk, then create a logical volume, move the important data to the new volume, and finally add the other drive to the volume group too?
Looks trivial to me. Is there any additional constraint that makes this approach bad for your use case?
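
Roughly (an untested sketch; /dev/sdb stands for the wipeable disk, /dev/sdc for the one still holding data):
Code:
  # Volume group on the empty disk, one big logical volume, XFS on top.
  pvcreate /dev/sdb
  vgcreate media /dev/sdb
  lvcreate -l 100%FREE -n video media
  mkfs.xfs /dev/media/video
  mount /dev/media/video /mnt/video

  # Move the important data off the second disk, then absorb it (DESTRUCTIVE).
  cp -a /mnt/old/. /mnt/video/
  umount /mnt/old
  pvcreate /dev/sdc
  vgextend media /dev/sdc
  lvextend -l +100%FREE /dev/media/video
  xfs_growfs /mnt/video    # XFS grows online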


It comes with the added benefit that LVM supports many raid modes and applies them on a per-logical-volume basis. It also allows reshaping the raid underneath, so you could for example start with a mirror, then downgrade it to linear (AKA JBOD) as you need more space, and upgrade again to raid5 once you have another disk around.
Note: a disk failure in a non-redundant collection, be it raid0 (with striping) or JBOD, will result in a damaged filesystem and data loss. In the case of JBOD it may or may not be partially recoverable; with a striped raid it is probably not recoverable, but still nowhere near secure erasure. Making a single FS span multiple drives without any redundancy is probably a bad idea.

BTW, if you opt to manually assign logical volumes to physical devices, you can also create snapshots on different devices than the original data, which should reduce the performance hit from copy-on-write. This is a more advanced trick most people don't really need, but I think it's pretty cool.
simonbcn
n00b


Joined: 01 Aug 2011
Posts: 69
Location: Denmark

PostPosted: Sun Jan 01, 2023 9:04 pm

I have tested this in a virtual machine, and I am finally going to do it as follows (rough commands sketched below):
  1. I will use ZFS and create a pool on disk1 (which I can empty).
  2. I will move the data from the other disk (disk2) to this newly created ZFS pool.
  3. I will add disk2 to the pool.
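
Roughly, the commands (an untested sketch; the by-id names are placeholders for my real disks):
Code:
  # Step 1: pool on the emptied disk1 (wipes it); mounts at /media by default.
  zpool create media /dev/disk/by-id/ata-DISK1
  # Step 2: move the data off disk2.
  rsync -a /mnt/disk2/ /media/
  # Step 3: absorb disk2 into the pool -- DESTRUCTIVE to disk2's contents.
  zpool add media /dev/disk/by-id/ata-DISK2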