ferg Guru


Joined: 15 Nov 2002 Posts: 540 Location: Cambridge, UK
|
Posted: Sat Oct 10, 2020 8:38 am Post subject: Ssd, mdadm & raid1 |
|
|
I've just moved my root partition on a RAID1 device from spinning disks to two new fancy SSDs.
I did this by:
- Failing one disk and removing it.
- Replacing it with an SSD
- Syncing
- Repeating with the other disk
- Finally growing the RAID1 device and resizing the filesystem
So I did not recreate the raid1 device.
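For anyone following along, the steps above map roughly onto the commands below. This is only a sketch: the function names are made up for illustration, `/dev/sdc1` is a placeholder for the new SSD partition, and the functions print the commands rather than run them so they can be reviewed first.

```shell
#!/bin/sh
# Print (do not run) the commands for swapping one RAID1 member by
# failing it, removing it and adding a new device.
# Every device name passed in is a placeholder.
migration_plan() {
    md=$1 old=$2 new=$3
    echo "mdadm $md --fail $old"
    echo "mdadm $md --remove $old"
    echo "mdadm $md --add $new"
    echo "mdadm --wait $md"          # block until the resync has finished
}

# Print the final step: grow the array to fill the new members,
# then grow the ext4 filesystem to fill the array.
grow_plan() {
    md=$1
    echo "mdadm --grow $md --size=max"
    echo "resize2fs $md"
}

migration_plan /dev/md2 /dev/sda1 /dev/sdc1
grow_plan /dev/md2
```

Repeat `migration_plan` once per old disk, and run `grow_plan` only after the last member has been swapped.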
However, now I've come to read the Gentoo SSD wiki I realise that there are quite a few things I should have done.
Mainly:
- I should have aligned the partitions when I created them. I used fdisk, so I guess this was not done automatically.
- I should have reformatted the ext4 filesystem to match the erase block size of the SSDs.
How important are those? Is it worth recreating the RAID1 device or are the performance benefits not really worth it?
I will look at creating a weekly cron job for TRIM/discard.
My Portage TMPDIR is already on a tmpfs in RAM, and I plan to move any other location with frequent I/O to another one (I have plenty of RAM).
Thanks.
Cheers
Ferg _________________ Climb up it, kayak down it + make sure it runs on GNU/Linux
"cease to exist, giving my goodbye, drive my car into the ocean,
you think I'm dead, but i sail away, on a wave of mutilation!" |
|
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55307 Location: 56N 3W
|
Posted: Sat Oct 10, 2020 9:09 am Post subject: |
|
|
ferg,
You can improve this in a number of ways.
Failing one disk and removing it means you lost redundancy for the rebuild.
You should have added one SSD to the existing raid set, then used the mdadm replace command.
It will use the entire raid as a source to recreate the member to be replaced on the SSD.
No loss of redundancy. It matters when you get a read error during the rebuild.
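A minimal sketch of that replace sequence, assuming the new SSD partition shows up as `/dev/sdc1` (a placeholder, as is the function name). It prints the commands instead of running them. Whether hot replace is available depends on the kernel and the metadata version in use, so treat that as something to check first.

```shell
#!/bin/sh
# Print (do not run) the commands for replacing a RAID1 member without
# dropping redundancy: add the new device as a spare, then ask md to
# rebuild onto it while the old member stays active.
replace_plan() {
    md=$1 old=$2 new=$3
    echo "mdadm $md --add $new"
    echo "mdadm $md --replace $old --with $new"
    echo "mdadm --wait $md"          # block until the rebuild has finished
    echo "mdadm $md --remove $old"   # old member is marked faulty once replaced
}

replace_plan /dev/md2 /dev/sda1 /dev/sdc1
```

The old member keeps serving reads during the rebuild, which is what helps if one source hits a read error.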
fdisk will be fine. It has created 4k-aligned partitions since soon after Advanced Format rotating rust became a thing.
With UEFI, it also moved the default first partition start to 1MiB, to allow for the partition table.
You can check. If it's wrong, it will hurt your write speed and increase your write amplification, which is a bad thing.
Aligning things to the erase block size is nice but not essential. The erase block size is very difficult to discover.
One thing you don't mention.
You should run fstrim in a monthly cron job. Using the discard mount option is OK if your SSDs implement it properly. Many don't, which leads to unnecessary erases.
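A sketch of what such a job could look like as a script dropped into a monthly cron directory (the location, the `RUN` variable and the function name are assumptions for illustration). By default it only prints the fstrim commands; calling it with `RUN=` set to empty makes it actually run them.

```shell
#!/bin/sh
# Trim each mount point given as an argument.
# RUN is a hypothetical switch: unset means "echo" (dry run),
# RUN= (empty) means execute fstrim for real.
run=${RUN-echo}

trim_mounts() {
    for mp in "$@"; do
        $run fstrim -v "$mp"
    done
}

trim_mounts /
```

Listing the mount points explicitly keeps the job predictable. Add any other filesystems that live on the SSDs.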
If you don't have aligned partitions, it's worth recreating the raid.
If your filesystem block size is smaller than 4k, recreate the filesystem.
For /boot, it won't matter as it's rarely written. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
ferg Guru


|
Posted: Sat Oct 10, 2020 9:26 am Post subject: |
|
|
As ever many thanks for your insightful comments!
NeddySeagoon wrote: | ferg,
You can improve this in a number of ways.
Failing one disk and removing it means you lost redundancy for the rebuild.
You should have added one SSD to the existing raid set, then used the mdadm replace command.
It will use the entire raid as a source to recreate the member to be replaced on the SSD.
No loss of redundancy. It matters when you get a read error during the rebuild. |
I was pleasantly surprised to discover that I could have done that. It's been some years since I've properly played with mdadm. However, I didn't have any free SATA ports on my motherboard, and it would have meant a lot of shuffling around of stuff. Everything is backed up in multiple places, so I reckoned the risk was minimal. Also, for clarity, I wasn't entirely honest in my original post: I had three HDDs in the original array, so at all times I had two synced devices in it. I've now reduced the number of devices to just the pair of SSDs.
Quote: | fdisk will be fine. It has created 4k-aligned partitions since soon after Advanced Format rotating rust became a thing.
With UEFI, it also moved the default first partition start to 1MiB, to allow for the partition table.
You can check. If it's wrong, it will hurt your write speed and increase your write amplification, which is a bad thing.
Aligning things to the erase block size is nice but not essential. The erase block size is very difficult to discover.
| I'm afraid I used the old DOS partitions instead of GPT. Is that a bad thing WRT aligned partitions?
Quote: | One thing you don't mention.
You should run fstrim in a monthly cron job. Using the discard mount option is OK if your SSDs implement it properly. Many don't, which leads to unnecessary erases. |
Will do. Thanks!
Quote: | If you don't have aligned partitions, it's worth recreating the raid.
If your filesystem block size is smaller than 4k, recreate the filesystem.
For /boot, it won't matter as it's rarely written. |
Block size is 4k
Code: | # tune2fs -l /dev/md2 | grep -i block
Block count: 117212608
Reserved block count: 4782612
Free blocks: 107733806
First block: 0
Block size: 4096
Reserved GDT blocks: 996
Blocks per group: 32768
Inode blocks per group: 511
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
Journal backup: inode blocks |
Thanks again. Cheers Ferg |
|
NeddySeagoon Administrator


|
Posted: Sat Oct 10, 2020 9:40 am Post subject: |
|
|
ferg,
fdisk reserves the space for GPT.
When the drive is partitioned, fdisk does not know how it will be used.
What does fdisk -l tell? |
|
ferg Guru


|
Posted: Sat Oct 10, 2020 10:06 am Post subject: |
|
|
NeddySeagoon wrote: | ferg,
fdisk reserves the space for GPT.
When the drive is partitioned, fdisk does not know how it will be used.
What does fdisk -l tell? |
Here is the output (minus the other drives).
Code: | Disk /dev/sda: 447.13 GiB, 480103981056 bytes, 937703088 sectors
Disk model: SATA3 480GB SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x8ee15595
Device Boot Start End Sectors Size Id Type
/dev/sda1 2048 937703087 937701040 447.1G fd Linux raid autodetect
Disk /dev/sdb: 447.13 GiB, 480103981056 bytes, 937703088 sectors
Disk model: SATA3 480GB SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xdd9eef6a
Device Boot Start End Sectors Size Id Type
/dev/sdb1 2048 937703087 937701040 447.1G fd Linux raid autodetect
Disk /dev/md2: 447.13 GiB, 480102842368 bytes, 937700864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000
Device Boot Start End Sectors Size Id Type
/dev/md2p3 27674 27674 0 0B 0 Empty |
That looks OK, right? I should have read up a little more on what partition alignment is before posting. |
|
NeddySeagoon Administrator


|
Posted: Sat Oct 10, 2020 10:27 am Post subject: |
|
|
ferg.
Your SSD lies.
Code: | Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes |
It will really have a 4k physical block size and 512B writes will be faked.
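The same two numbers can be read straight from sysfs, which is where fdisk gets them. A small sketch (the function name and the optional second argument are made up for illustration). Note that this only shows what the drive reports, so it cannot expose a drive that misreports its physical block size.

```shell
#!/bin/sh
# Print "logical/physical" block sizes as reported by the kernel for a
# whole-disk device name such as sda.
# The optional second argument overrides the sysfs root (default /sys).
reported_block_sizes() {
    dev=$1
    root=${2:-/sys}
    q="$root/block/$dev/queue"
    echo "$(cat "$q/logical_block_size")/$(cat "$q/physical_block_size")"
}
```

Call it as `reported_block_sizes sda`. A drive that reports like the ones above would print 512/512.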
Code: | Device Boot Start End Sectors Size Id Type
/dev/sda1 2048 937703087 937701040 447.1G fd Linux raid autodetect |
2048 is exactly divisible by 8, so your partition start sectors are 4k aligned. That means you are good.
Sector 2048 is also the 1MiB mark (2048 × 512 bytes), so there is space for the first copy of the GPT, if you wanted one.
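The arithmetic can be sketched in a few lines of shell. The function name is made up for illustration, and the sector size defaults to the 512 bytes fdisk reports above.

```shell
#!/bin/sh
# Print the byte offset of a start sector and whether that offset is a
# multiple of 4096 bytes (4 KiB).
check_start() {
    start=$1
    sector_bytes=${2:-512}      # logical sector size in bytes
    offset=$(( start * sector_bytes ))
    if [ $(( offset % 4096 )) -eq 0 ]; then
        echo "$offset aligned"
    else
        echo "$offset misaligned"
    fi
}

check_start 2048    # prints: 1048576 aligned
```

1048576 bytes is exactly 1MiB. The old DOS default start of sector 63 would come out as misaligned with the same check.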
Code: | Device Boot Start End Sectors Size Id Type
/dev/md2p3 27674 27674 0 0B 0 Empty |
At first sight, that partition fails: 27674/8 is 3459.25. But it's not that simple, because that's measured from the start of the raid, not the underlying SSD, which is what matters.
What raid version do you use, 0.90 or 1.2?
That determines where the raid metadata is on disk. |
|
ferg Guru


|
Posted: Sat Oct 10, 2020 10:34 am Post subject: |
|
|
NeddySeagoon wrote: | ferg.
Your SSD lies.
Code: | Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes |
It will really have a 4k physical block size and 512B writes will be faked. |
How sneaky! I guess since the filesystem has 4k blocks then it's still OK? ..or does that have any consequences?
Quote: |
Code: | Device Boot Start End Sectors Size Id Type
/dev/md2p3 27674 27674 0 0B 0 Empty |
At first sight, that partition fails: 27674/8 is 3459.25. But it's not that simple, because that's measured from the start of the raid, not the underlying SSD, which is what matters.
What raid version do you use, 0.90 or 1.2?
That determines where the raid metadata is on disk. |
0.90, mainly because I understood (when I set up this array many years ago) that you cannot boot from a 1.2 array with DOS partitions. Is that still true?
I guess I should have created GPT partitions and not DOS ones. I only chose partitions to make it more flexible in case I need to change a drive in the future.
Thanks. |
|
NeddySeagoon Administrator


|
Posted: Sat Oct 10, 2020 12:52 pm Post subject: |
|
|
ferg,
That the SSD lies about block sizes is just something to keep in mind.
Partitioning a raid is a bit novel. It's only been supported for a few years.
Personally, I use LVM on top of raid and leave empty unallocated space in the physical volume, which I can allocate later.
For raid metadata=0.90 the metadata is at the end of the raid volume.
For raid metadata=1.2 the metadata is at the start of the raid volume.
We need to know which is in use on /dev/md2, so that we can calculate the on disk layout.
Then you have space for the partition table at the start of the raid.
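As a sketch of that layout calculation: the absolute start sector on the SSD is the member partition's start, plus the offset of the array data inside the member, plus the start sector inside the array. The function names are made up for illustration. The data offset of 0 below assumes metadata=0.90, where the metadata sits at the end. With metadata=1.2 the offset is non-zero and would have to be read from the array itself.

```shell
#!/bin/sh
# Absolute start sector on the underlying disk:
# member partition start + md data offset + start sector within the array.
abs_start() {
    part_start=$1 data_offset=$2 md_start=$3
    echo $(( part_start + data_offset + md_start ))
}

# Eight 512-byte sectors make one 4 KiB block.
aligned_4k() {
    if [ $(( $1 % 8 )) -eq 0 ]; then echo aligned; else echo misaligned; fi
}

# Numbers from the fdisk output: /dev/sda1 starts at 2048 and the
# /dev/md2p3 entry starts at 27674 within the array.
s=$(abs_start 2048 0 27674)
echo "$s $(aligned_4k "$s")"    # prints: 29722 misaligned
```

Because 2048 is itself a multiple of 8, a zero data offset leaves the remainder unchanged, so an entry that looks misaligned inside the array is also misaligned on the SSD.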
What does fdisk -l /dev/md2 tell?
grub legacy simply ignored raid, so /boot on raid only worked with raid1 and metadata=0.90, where the metadata sits at the end and the filesystem starts where it always does.
grub2 is raid aware and does its own thing to deal with /boot on raid. Any raid level works. |
|
ferg Guru


|
Posted: Sat Oct 10, 2020 1:20 pm Post subject: |
|
|
Thanks.
Code: | # fdisk -l /dev/md2
Disk /dev/md2: 447.13 GiB, 480102842368 bytes, 937700864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000
Device Boot Start End Sectors Size Id Type
/dev/md2p3 27674 27674 0 0B 0 Empty
|
BTW, I'm still using LILO. I could never get my head around GRUB, and LILO is easy and just works. |
|