Gentoo Forums
SSD, mdadm & RAID1
ferg
Guru


Joined: 15 Nov 2002
Posts: 540
Location: Cambridge, UK

Posted: Sat Oct 10, 2020 8:38 am    Post subject: SSD, mdadm & RAID1

I've just moved my root partition on a RAID1 device from spinning disks to two new fancy SSDs.

I did this by:
  • Failing one disk and removing it
  • Replacing it with an SSD
  • Syncing
  • Repeating with the other disk
  • Finally growing the RAID1 device and resizing the filesystem


So I did not recreate the RAID1 device.
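
From memory, the sequence was roughly this (device names are illustrative, not the exact ones I used):

Code:
# Fail and remove one spinning disk
mdadm /dev/md2 --fail /dev/sda1
mdadm /dev/md2 --remove /dev/sda1
# Add the new SSD partition and wait for the resync
mdadm /dev/md2 --add /dev/sdc1
cat /proc/mdstat
# ...repeat for the second disk, then grow the array and filesystem
mdadm --grow /dev/md2 --size=max
resize2fs /dev/md2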

However, now that I've read the Gentoo SSD wiki article, I realise there are quite a few things I should have done.

Mainly:
  • I should have aligned the partitions when I created them. I used fdisk, so I guess this is not done automatically
  • I should have reformatted the ext4 filesystem to match the erase block size of the SSDs


How important are those? Is it worth recreating the RAID1 device or are the performance benefits not really worth it?

I will look at creating a weekly cron job for TRIM/discard.

My Portage TMPDIR is already on a tmpfs in RAM, and I plan to move any other frequent-I/O locations to tmpfs too (I have plenty of RAM).
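
For anyone interested, the tmpfs entry in /etc/fstab is along these lines (the size and options here are just an example, tune to taste):

Code:
# build Portage packages in RAM
tmpfs   /var/tmp/portage   tmpfs   size=8G,uid=portage,gid=portage,mode=775,noatime   0 0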

Thanks.
Cheers
Ferg
_________________
Climb up it, kayak down it + make sure it runs on GNU/Linux
"cease to exist, giving my goodbye, drive my car into the ocean,
you think I'm dead, but i sail away, on a wave of mutilation!"
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 55307
Location: 56N 3W

Posted: Sat Oct 10, 2020 9:09 am

ferg,

You can improve this in a number of ways.
Failing one disk and removing it means you lost redundancy for the rebuild.
You should have added one SSD to the existing raid set, then used the mdadm replace command.
That uses the entire raid set as the source to recreate the member being replaced onto the SSD.
No loss of redundancy, which matters if you get a read error during the rebuild.
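
Something like this, assuming /dev/sdc1 is the new SSD member and /dev/sda1 the disk being retired (it needs a reasonably recent kernel and mdadm):

Code:
# add the SSD as a spare, then replace in place - redundancy is kept throughout
mdadm /dev/md2 --add /dev/sdc1
mdadm /dev/md2 --replace /dev/sda1 --with /dev/sdc1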

fdisk will be fine. It has created 4k-aligned partitions since soon after Advanced Format rotating rust became a thing.
With UEFI, it moved the default first partition start to 1MiB too, to allow for the partition table.
You can check. If it's wrong, it will hurt your write speed and increase your write amplification, which is a bad thing.
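
A quick check, assuming 512B logical sectors: every partition start sector should divide exactly by 8 (8 x 512B = 4k). For example:

Code:
# start sector of the first partition, from sysfs
cat /sys/block/sda/sda1/start
# or let parted do the arithmetic for you
parted /dev/sda align-check opt 1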

Aligning things to the erase block size is nice but not essential. It's very difficult to discover the erase block size anyway.

One thing you don't mention:
You should run fstrim in a monthly cron job. Using the discard mount option is OK if your SSDs implement it properly, but many don't, leading to unnecessary erases.
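
A minimal sketch of such a cron job; trim whatever filesystems live on the SSDs:

Code:
#!/bin/sh
# /etc/cron.monthly/fstrim - tell the SSDs which blocks are unused
fstrim -v /
fstrim -v /home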

If you don't have aligned partitions, it's worth recreating the raid.
If your filesystem block size is smaller than 4k, recreate the filesystem.
For /boot, it won't matter as it's rarely written.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ferg
Guru


Joined: 15 Nov 2002
Posts: 540
Location: Cambridge, UK

Posted: Sat Oct 10, 2020 9:26 am

As ever many thanks for your insightful comments!

NeddySeagoon wrote:
ferg,

You can improve this in a number of ways.
Failing one disk and removing it means you lost redundancy for the rebuild.
You should have added one SSD to the existing raid set, then used the mdadm replace command.
That uses the entire raid set as the source to recreate the member being replaced onto the SSD.
No loss of redundancy, which matters if you get a read error during the rebuild.


I was pleasantly surprised to discover that I could have done. It's been some years since I've properly played with mdadm. However, I didn't have any free SATA ports on my motherboard, and it would have meant a lot of shuffling around of stuff. Everything is backed up in multiple places, so I reckoned the risk was minimal. Also, for clarity, I wasn't entirely honest in my original post: I actually had three HDDs in the original array, so at all times I had two synced devices in it. I've now reduced the number of devices to just the pair of SSDs.
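
Shrinking the set was simple enough; after failing and removing the last HDD it was something like:

Code:
mdadm --grow /dev/md2 --raid-devices=2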

Quote:
fdisk will be fine. It has created 4k-aligned partitions since soon after Advanced Format rotating rust became a thing.
With UEFI, it moved the default first partition start to 1MiB too, to allow for the partition table.
You can check. If it's wrong, it will hurt your write speed and increase your write amplification, which is a bad thing.

Aligning things to the erase block size is nice but not essential. It's very difficult to discover the erase block size anyway.

I'm afraid I used the old DOS partition table instead of GPT. Is that a bad thing WRT aligned partitions?

Quote:
One thing you don't mention.
You should run fstrim in a monthly cron job. Using the discard mount option is OK if your SSDs implement it properly, but many don't, leading to unnecessary erases.


Will do. Thanks!

Quote:
If you don't have aligned partitions, it's worth recreating the raid.
If your filesystem block size is smaller than 4k, recreate the filesystem.
For /boot, it won't matter as it's rarely written.

Block size is 4k:

Code:
 # tune2fs -l /dev/md2 | grep -i block
Block count:              117212608
Reserved block count:     4782612
Free blocks:              107733806
First block:              0
Block size:               4096
Reserved GDT blocks:      996
Blocks per group:         32768
Inode blocks per group:   511
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
Journal backup:           inode blocks


Thanks again. Cheers, Ferg
_________________
Climb up it, kayak down it + make sure it runs on GNU/Linux
"cease to exist, giving my goodbye, drive my car into the ocean,
you think I'm dead, but i sail away, on a wave of mutilation!"
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 55307
Location: 56N 3W

Posted: Sat Oct 10, 2020 9:40 am

ferg,

fdisk reserves the space for GPT.
When the drive is partitioned, fdisk does not know how it will be used.

What does
Code:
fdisk -l
tell?
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ferg
Guru


Joined: 15 Nov 2002
Posts: 540
Location: Cambridge, UK

Posted: Sat Oct 10, 2020 10:06 am

NeddySeagoon wrote:
ferg,

fdisk reserves the space for GPT.
When the drive is partitioned, fdisk does not know how it will be used.

What does
Code:
fdisk -l
tell?


Here is the output (minus the other drives).

Code:
Disk /dev/sda: 447.13 GiB, 480103981056 bytes, 937703088 sectors
Disk model: SATA3 480GB SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x8ee15595

Device     Boot Start       End   Sectors   Size Id Type
/dev/sda1        2048 937703087 937701040 447.1G fd Linux raid autodetect


Disk /dev/sdb: 447.13 GiB, 480103981056 bytes, 937703088 sectors
Disk model: SATA3 480GB SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xdd9eef6a

Device     Boot Start       End   Sectors   Size Id Type
/dev/sdb1        2048 937703087 937701040 447.1G fd Linux raid autodetect

Disk /dev/md2: 447.13 GiB, 480102842368 bytes, 937700864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device     Boot Start   End Sectors Size Id Type
/dev/md2p3      27674 27674       0   0B  0 Empty


That looks OK, right? I should have read up a little more on what partition alignment is before posting. :-)
_________________
Climb up it, kayak down it + make sure it runs on GNU/Linux
"cease to exist, giving my goodbye, drive my car into the ocean,
you think I'm dead, but i sail away, on a wave of mutilation!"
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 55307
Location: 56N 3W

Posted: Sat Oct 10, 2020 10:27 am

ferg,

Your SSD lies.
Code:
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

It will really have a 4k physical block size and 512B writes will be faked.

Code:
Device     Boot Start       End   Sectors   Size Id Type
/dev/sda1        2048 937703087 937701040 447.1G fd Linux raid autodetect

2048 is exactly divisible by 8, so your partition start sectors are 4k aligned. That means you are good.

2048 is also 1MiB, so there is space for the first copy of the GPT, if you wanted one.

Code:
Device     Boot Start   End Sectors Size Id Type
/dev/md2p3      27674 27674       0   0B  0 Empty

At first sight, that partition fails: 27674/8 is 3459.25. But it's not that simple, because that offset is from the start of the raid, not the start of the underlying SSD, which is what matters.
Which raid metadata version do you use, 0.90 or 1.2?
That determines where the raid metadata is on disk.
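
You can check with either of:

Code:
mdadm --detail /dev/md2 | grep -i version
# or ask a member device directly
mdadm --examine /dev/sda1 | grep -i version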
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ferg
Guru


Joined: 15 Nov 2002
Posts: 540
Location: Cambridge, UK

Posted: Sat Oct 10, 2020 10:34 am

NeddySeagoon wrote:
ferg,

Your SSD lies.
Code:
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

It will really have a 4k physical block size and 512B writes will be faked.


How sneaky! I guess since the filesystem has 4k blocks it's still OK? Or does that have other consequences?

Quote:

Code:
Device     Boot Start   End Sectors Size Id Type
/dev/md2p3      27674 27674       0   0B  0 Empty

At first sight, that partition fails: 27674/8 is 3459.25. But it's not that simple, because that offset is from the start of the raid, not the start of the underlying SSD, which is what matters.
Which raid metadata version do you use, 0.90 or 1.2?
That determines where the raid metadata is on disk.


0.90. Mainly because I understood (when I set up this array many years ago) that you cannot boot from a 1.2 array with DOS partitions. Is that still true?

I guess I should have created GPT partitions and not DOS ones. I only partitioned the raid to make it more flexible in case I need to change a drive in the future.

Thanks.
_________________
Climb up it, kayak down it + make sure it runs on GNU/Linux
"cease to exist, giving my goodbye, drive my car into the ocean,
you think I'm dead, but i sail away, on a wave of mutilation!"
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 55307
Location: 56N 3W

Posted: Sat Oct 10, 2020 12:52 pm

ferg,

That the SSD lies about block sizes is just something to keep in mind.

Partitioning a raid is a bit novel. It's only been supported for a few years.
Personally, I use LVM on top of raid and leave empty unallocated space in the physical volume, which I can allocate later.

For raid metadata=0.90, the metadata is at the end of the raid volume.
For raid metadata=1.2, the metadata is at the start of the raid volume.
We need to know which is in use on /dev/md2, so that we can calculate the on-disk layout.

Then you have space for the partition table at the start of the raid.

What does
Code:
fdisk -l /dev/md2
tell?

grub legacy just ignored raid completely, hence metadata at the end worked for /boot on raid1, forcing both metadata=0.90 and raid1. The filesystem starts where it always does.
grub2 is raid aware and does its own thing to deal with /boot on raid. Any raid level works.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ferg
Guru


Joined: 15 Nov 2002
Posts: 540
Location: Cambridge, UK

Posted: Sat Oct 10, 2020 1:20 pm

Thanks.
Code:
 # fdisk -l /dev/md2
Disk /dev/md2: 447.13 GiB, 480102842368 bytes, 937700864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device     Boot Start   End Sectors Size Id Type
/dev/md2p3      27674 27674       0   0B  0 Empty



BTW, I'm still using LILO. I could never get my head around GRUB, and LILO is easy and just works.
_________________
Climb up it, kayak down it + make sure it runs on GNU/Linux
"cease to exist, giving my goodbye, drive my car into the ocean,
you think I'm dead, but i sail away, on a wave of mutilation!"