| Author |
Message |
Cyker Veteran

Joined: 15 Jun 2006 Posts: 1746
|
Posted: Sun Mar 09, 2008 11:21 pm Post subject: RAID5 write performance - Any tips? |
|
|
Write performance on my software RAID5 is less than a third of the read performance and has very high IOwait.
I'm looking into ways of mitigating this with an eye to improving performance - Any suggestions out there?
The average disk buffer is 2GB of RAM and the array is using 64k stripes and a 4k block size with the recommended stride value.
I have done only minor tweaking with blockdev and hdparm, but they just shift the performance optimizations for certain types of load (i.e. small files vs big files), whereas I'm aiming for something more generic.
(blockdev --setra for md0 composite is 8192 and for sd? components is 2048)
I've tried setting various combinations of IO scheduler on both sd? and md0 devices, and CFQ seems to do the best job on average. |
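For anyone wanting to reproduce the readahead settings described above, here is a dry-run sketch. It only prints the `blockdev` commands instead of running them. The device names in `MD_DEV` and `COMPONENTS` are placeholders (the post only says md0 and sd?), and `--setra` values are counted in 512-byte sectors.

```shell
#!/bin/sh
# Dry-run sketch: prints the readahead commands rather than executing them.
# MD_DEV and COMPONENTS are assumed names; adjust them to the real array.
MD_DEV=${MD_DEV:-/dev/md0}
COMPONENTS=${COMPONENTS:-"/dev/sda /dev/sdb /dev/sdc"}

# Emit one blockdev command for the array and one per component disk,
# using the readahead values quoted in the post (8192 and 2048 sectors).
readahead_cmds() {
    echo "blockdev --setra 8192 $MD_DEV"
    for dev in $COMPONENTS; do
        echo "blockdev --setra 2048 $dev"
    done
}

# Convert a --setra value (512-byte sectors) to KiB.
setra_kib() {
    echo $(( $1 * 512 / 1024 ))
}

readahead_cmds
echo "md readahead: $(setra_kib 8192) KiB"
```

Pipe the output of `readahead_cmds` to `sh` as root once the device list looks right.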
|
|
 |
drescherjm Advocate

Joined: 05 Jun 2004 Posts: 2790 Location: Pittsburgh, PA, USA
|
Posted: Mon Mar 10, 2008 8:27 pm Post subject: |
|
|
Take a look at /sys/block/mdX/md/stripe_cache_active and, if it is being maxed out during writes, increase stripe_cache_size. _________________ John
My gentoo overlay
Instructions for overlay |
|
|
 |
Cyker Veteran

Joined: 15 Jun 2006 Posts: 1746
|
Posted: Mon Mar 10, 2008 10:32 pm Post subject: |
|
|
In the immortal words of Lister.... SMEGGING HELL!
I ran | Quote: | | echo 8192 > /sys/block/md0/md/stripe_cache_size | and the sustained write performance has jumped up by almost 300%!!!!
Thanks dude, that is a fantastic tip!!!
This totally needs to go into the wiki! (Well, when it comes back) |
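A rough sketch of the check-then-resize routine discussed in this thread, with a memory estimate thrown in. The md driver's memory use for this cache is page size × number of member disks × stripe_cache_size. `MD_SYSFS`, `NR_DISKS` and `PAGE_KIB` are assumptions for illustration; the thread never states the disk count.

```shell
#!/bin/sh
# Sketch only: MD_SYSFS, NR_DISKS and PAGE_KIB are illustrative assumptions.
MD_SYSFS=${MD_SYSFS:-/sys/block/md0/md}
NR_DISKS=${NR_DISKS:-4}
PAGE_KIB=${PAGE_KIB:-4}

# RAM taken by the stripe cache in MiB: entries * page size * member disks.
stripe_cache_mib() {
    echo $(( $1 * PAGE_KIB * NR_DISKS / 1024 ))
}

# Show how many cache entries are in use; if this sits at the maximum
# during writes, the cache is probably too small.
show_stripe_cache() {
    active=$(cat "$MD_SYSFS/stripe_cache_active")
    size=$(cat "$MD_SYSFS/stripe_cache_size")
    echo "stripe cache: $active of $size entries in use"
}

# Resize the cache (needs root on the real sysfs tree).
set_stripe_cache() {
    echo "$1" > "$MD_SYSFS/stripe_cache_size"
}

echo "8192 entries on $NR_DISKS disks: about $(stripe_cache_mib 8192) MiB"
```

Typical use would be `show_stripe_cache` during a big copy, then `set_stripe_cache 8192` if the cache is pegged.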
|
|
 |
drescherjm Advocate

Joined: 05 Jun 2004 Posts: 2790 Location: Pittsburgh, PA, USA
|
Posted: Tue Mar 11, 2008 6:19 pm Post subject: |
|
|
In case anyone wonders what this does: it reduces the number of times the OS needs to read stripes from disk when writing only part of a stripe. The reads are needed because whenever any part of a stripe is written, the parity for the entire stripe has to be updated. _________________ John |
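To make the "read before write" point concrete, here is a toy sketch of RAID5's XOR parity using one byte per chunk. The byte values are made up. It shows that updating one chunk needs either the old data plus old parity (read-modify-write) or all the other data chunks (reconstruct-write), so something has to be read unless the stripe is already cached.

```shell
#!/bin/sh
# Toy stripe: three data chunks of one byte each (values are arbitrary).
d0=$((0xA5)); d1=$((0x3C)); d2=$((0x0F))
parity=$(( d0 ^ d1 ^ d2 ))

# Partial write: only d1 changes.
new_d1=$((0x77))

# Read-modify-write: needs the OLD d1 and the OLD parity.
rmw_parity=$(( parity ^ d1 ^ new_d1 ))

# Reconstruct-write: needs the untouched chunks d0 and d2 instead.
full_parity=$(( d0 ^ new_d1 ^ d2 ))

printf 'rmw=0x%02X reconstruct=0x%02X\n' "$rmw_parity" "$full_parity"
```

Both routes give the same parity byte; the difference is only in which chunks have to be fetched first.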
|
|
 |
Cyker Veteran

Joined: 15 Jun 2006 Posts: 1746
|
Posted: Tue Mar 11, 2008 7:16 pm Post subject: |
|
|
So it really is Yet Another Cache, but lower level than the normal disk buffer because it's caching the RAID stripe components?
Given that's one of the most intensive parts of the whole process, I can see why boosting this affects the write speed so much.
Question: Is there a way to make this 'dynamic' like the normal Linux disk buffer? So that it allocates as much RAM as it can, but shrinks down as 'real' programs need more memory?
Or do you think this would be a waste? (Are the 'diminishing returns' on this much worse than with normal caching? My optimum seems to be 8192, but I tested it up to 32768 and it still shows a healthy performance boost, though not as great as the jump to 8192.)
Also, are there any other nifty tweaks like this?
Now I just need to hope they fix the sky2 driver more (previously I was getting 40MB/s but it'd stop working randomly under heavy load; now it always works but peaks at 25-30MB/s) and I'll be sorted! |
|
|
 |
drescherjm Advocate

Joined: 05 Jun 2004 Posts: 2790 Location: Pittsburgh, PA, USA
|
Posted: Tue Mar 11, 2008 7:42 pm Post subject: |
|
|
| Quote: | | So it really is Yet Another Cache, but lower level than the normal disk buffer because it's caching the RAID stripe components? |
Yes, this is exactly what it does.
| Quote: | | Question: Is there a way to make this 'dynamic' like the normal Linux disk buffer? So that it allocates as much RAM as it can, but shrinks down as 'real' programs need more memory? |
I am not sure; I have not investigated this. I know you can change the size at any time you want, though, and you can always use stripe_cache_active as a guide.
| Quote: | | Also, are there any other nifty tweaks like this? |
None that affect performance as much. Besides the ones you have mentioned, I have played around with /proc/sys/vm/dirty_expire_centisecs and /proc/sys/vm/dirty_writeback_centisecs to try to force Linux to hold on to the writes a little longer so that it writes in larger blocks.
| Quote: | | Now I just need to hope they fix the sky2 driver more |
I had problems with that in the past. I believe a kernel change lessened the problem, but I have not investigated this in a long time and I am not even sure which machine uses that driver. I do know that the systems I have with tg3 gigabit NIC drivers work the best, and the nvidia, sky2 and realtek Gbit NICs had their problems... _________________ John |
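A small sketch for poking at the two writeback knobs mentioned above. Both are in centiseconds. The values 6000 and 1500 in the usage note are made-up examples, not recommendations from this thread, and `VM_DIR` is just a variable so the functions can be pointed somewhere harmless while experimenting.

```shell
#!/bin/sh
# VM_DIR is an assumption so the helpers can be tried on a scratch directory.
VM_DIR=${VM_DIR:-/proc/sys/vm}

# Print the current value of both knobs.
show_writeback() {
    for knob in dirty_expire_centisecs dirty_writeback_centisecs; do
        echo "$knob = $(cat "$VM_DIR/$knob")"
    done
}

# Set both knobs: $1 = expire age, $2 = flusher wake-up interval,
# both in centiseconds. Needs root on the real /proc tree.
set_writeback() {
    echo "$1" > "$VM_DIR/dirty_expire_centisecs"
    echo "$2" > "$VM_DIR/dirty_writeback_centisecs"
}

# Only read the knobs if they exist on this system.
if [ -r "$VM_DIR/dirty_expire_centisecs" ]; then
    show_writeback
fi
```

For example, `set_writeback 6000 1500` would let dirty data age for 60 seconds and wake the flusher every 15 seconds. Longer hold times also mean more unwritten data at risk on a crash.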
|
|
 |
richard.scott Veteran

Joined: 19 May 2003 Posts: 1497 Location: Oxfordshire, UK
|
Posted: Fri Oct 24, 2008 2:39 pm Post subject: |
|
|
| drescherjm wrote: | | Take a look at /sys/block/mdX/md/stripe_cache_active and if it is being maxed out during writes increase stripe_cache_size. |
I don't have this file on my system?
| Code: | # cd /sys/block/md0/md/
# ls
array_state component_size dev-hdb1 metadata_version raid_disks reshape_position suspend_hi sync_completed sync_speed_max
bitmap_set_bits degraded layout mismatch_cnt rd0 resync_start suspend_lo sync_max sync_speed_min
chunk_size dev-hda1 level new_dev rd1 safe_mode_delay sync_action sync_speed |
and if I try to create it I get this:
| Code: | # echo 8192 > stripe_cache_size
-su: stripe_cache_size: No such file or directory |
I guess this isn't something I can take advantage of |
|
|
 |
Cyker Veteran

Joined: 15 Jun 2006 Posts: 1746
|
Posted: Fri Oct 24, 2008 4:31 pm Post subject: |
|
|
Hmm... that's weird.
Hey, wait, what RAID are you using? |
|
|
 |
drescherjm Advocate

Joined: 05 Jun 2004 Posts: 2790 Location: Pittsburgh, PA, USA
|
Posted: Fri Oct 24, 2008 10:03 pm Post subject: |
|
|
NOTE: You will not have that on raid 0 or raid 1. If this is raid 5 or 6 I agree this is weird. _________________ John |
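A quick way to tell which case applies is to look at the `level` file that shows up in the directory listing earlier in the thread. The sketch below reports the level and whether a stripe cache exists at all. `MD_SYSFS` is an assumed path, and the function name is made up.

```shell
#!/bin/sh
# MD_SYSFS is an assumed location; pass another directory to describe_array.
MD_SYSFS=${MD_SYSFS:-/sys/block/md0/md}

# Report the RAID level and, if present, the current stripe cache size.
describe_array() {
    level=$(cat "$1/level" 2>/dev/null || echo unknown)
    if [ -e "$1/stripe_cache_size" ]; then
        echo "$level: stripe_cache_size=$(cat "$1/stripe_cache_size")"
    else
        echo "$level: no stripe cache to tune"
    fi
}

describe_array "$MD_SYSFS"
```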
|
|
 |
richard.scott Veteran

Joined: 19 May 2003 Posts: 1497 Location: Oxfordshire, UK
|
Posted: Mon Oct 27, 2008 1:17 pm Post subject: |
|
|
| drescherjm wrote: | | NOTE: You will not have that on raid 0 or raid 1. If this is raid 5 or 6 I agree this is weird. |
It's on RAID 1 |
|
|
 |
manaka Apprentice


Joined: 23 Jul 2007 Posts: 178 Location: Spain
|
Posted: Fri Dec 12, 2008 7:14 pm Post subject: |
|
|
AFAIK, stripe_cache_size is not dynamic in the current implementation, but it can be changed at any time.
Another trick that can provide a little performance boost is aligning the data with the RAID chunks. In case you use ext3, have a look at the stride and stripe-width options of mke2fs. Other FS may provide similar options... _________________ Javier Miqueleiz
"Listen to your heart. It knows all things, because it came from the Soul of the World, and it will one day return there." |
|
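As a worked example of the stride and stripe-width options for ext3: stride is the chunk size divided by the filesystem block size, and stripe-width is stride times the number of data disks. The 64k chunk and 4k block come from the first post; `DATA_DISKS=3` (a 4-disk RAID5) is purely an assumption, since the disk count is never given. Newer mke2fs man pages spell the option `stripe_width`. The script only prints the command.

```shell
#!/bin/sh
# Chunk and block size are from the thread; DATA_DISKS is an assumption
# (RAID5 data disks = total disks - 1).
CHUNK_KIB=${CHUNK_KIB:-64}
BLOCK_KIB=${BLOCK_KIB:-4}
DATA_DISKS=${DATA_DISKS:-3}

# stride: filesystem blocks per RAID chunk.
stride=$(( CHUNK_KIB / BLOCK_KIB ))
# stripe-width: filesystem blocks per full data stripe.
stripe_width=$(( stride * DATA_DISKS ))

# Print (do not run) the resulting mke2fs command for an ext3 filesystem.
echo "mke2fs -j -b $(( BLOCK_KIB * 1024 )) -E stride=$stride,stripe-width=$stripe_width /dev/md0"
```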
|
 |
drescherjm Advocate

Joined: 05 Jun 2004 Posts: 2790 Location: Pittsburgh, PA, USA
|
Posted: Mon Dec 15, 2008 3:56 am Post subject: |
|
|
| Quote: | | Another trick that can provide a little performance boost is aligning the data with the RAID chunks. |
I found this to be absolutely essential on Windows software raid because it does not have a stripe cache. I have not tried this on Linux, but I will try to remember it next time I set up an array. I generally do not use ext3 because of its extremely slow deletes of large files (compared to xfs). I believe ext4 solves this, so maybe that will be a good choice. _________________ John |
|
|
 |
manaka Apprentice


Joined: 23 Jul 2007 Posts: 178 Location: Spain
|
Posted: Mon Dec 15, 2008 8:54 pm Post subject: |
|
|
BTW, mkfs.xfs also has an option for aligning data with the RAID stripe size: the -d sunit option.
drescherjm, could you provide more details about that option? (I'm not a Windows guy, but for the sake of curiosity...). _________________ Javier Miqueleiz |
|
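For the XFS side, a sketch of how the `-d sunit` value could be worked out. mkfs.xfs takes `sunit` and `swidth` in 512-byte units, so a 64k chunk gives sunit=128, and swidth is sunit times the number of data disks. `DATA_DISKS=3` is an assumption again, and the script only prints the command.

```shell
#!/bin/sh
# CHUNK_KIB matches the 64k chunk mentioned in the thread;
# DATA_DISKS is an assumption (total disks minus one for RAID5).
CHUNK_KIB=${CHUNK_KIB:-64}
DATA_DISKS=${DATA_DISKS:-3}

# sunit: chunk size expressed in 512-byte units.
sunit=$(( CHUNK_KIB * 1024 / 512 ))
# swidth: full data stripe in the same units.
swidth=$(( sunit * DATA_DISKS ))

# Print (do not run) the resulting mkfs.xfs command.
echo "mkfs.xfs -d sunit=$sunit,swidth=$swidth /dev/md0"
```

The same geometry can also be written as `-d su=64k,sw=3`, which avoids the unit conversion.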
|
 |
drescherjm Advocate

Joined: 05 Jun 2004 Posts: 2790 Location: Pittsburgh, PA, USA
|
Posted: Mon Dec 15, 2008 9:09 pm Post subject: |
|
|
| Quote: | | drescherjm, could you provide more details about that option? (I'm not a Windows guy, but for the sake of curiosity...). |
Hmm. I am a little confused at what you are asking but here goes:
On Windows you have to partition the disks so that the starting sector is 2048 instead of 63, or even better, fully aligned depending on the number of disks. The following thread has a long discussion of this:
http://forums.storagereview.net/index.php?showtopic=25786
BTW, I use the same username there as here. With the raid5 I was talking about there, I ended up giving up and used single 750GB drives instead, because performance under Windows software raid (1, 10 and 5) with fewer than 5 drives was always slower than a single drive in both reads and writes. In my opinion MSFT must have stock in some hardware raid controller company... _________________ John |
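The 63 versus 2048 point comes down to simple arithmetic, sketched below. It assumes 512-byte sectors and a 64 KiB chunk (the chunk size from the Linux array earlier in this thread, not necessarily what the Windows array used).

```shell
#!/bin/sh
# Assumptions: 512-byte sectors and a 64 KiB RAID chunk.
SECTOR_BYTES=512
CHUNK_BYTES=$(( 64 * 1024 ))

# Succeeds if a partition starting at sector $1 begins on a chunk boundary.
is_aligned() {
    [ $(( $1 * SECTOR_BYTES % CHUNK_BYTES )) -eq 0 ]
}

for start in 63 2048; do
    if is_aligned "$start"; then
        echo "sector $start: aligned"
    else
        echo "sector $start: off by $(( start * SECTOR_BYTES % CHUNK_BYTES )) bytes"
    fi
done
```

Sector 2048 is exactly 1 MiB in, which is a multiple of any power-of-two chunk size up to 1 MiB; sector 63 is not a multiple of any of them.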
|
|
 |
drescherjm Advocate

Joined: 05 Jun 2004 Posts: 2790 Location: Pittsburgh, PA, USA
|
Posted: Mon Dec 15, 2008 9:43 pm Post subject: |
|
|
| Quote: | | BTW, mkfs.xfs also has an option for data aligning with RAID stripe size. It's the -d sunit option. |
Thanks. I will check into that. _________________ John |
|
|
 |
energyman76b Advocate


Joined: 26 Mar 2003 Posts: 2048 Location: Germany
|
Posted: Tue Jan 06, 2009 3:38 am Post subject: |
|
|
O.O I went from raid1 to raid5 today - and copying a 600MB file took ages: several minutes, with long pauses, and when it was moving, it was at around 3MB/sec...
I increased stripe_cache_size from 256 to 2048 and it went over in a couple of seconds! Wow! Thanks for the hint. 4096 halved it again - now down to 5 seconds ....
All I need is a good place to put these settings. local.start? But it is the last one called in boot... _________________ Study finds stunning lack of racial, gender, and economic diversity among middle-class white males
I identify as a dirty penismensch. |
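One possible way to make the setting stick across reboots is a snippet like the following, dropped into whatever local startup script the system uses (local.start is the one named above). This is only a sketch: `STRIPE_CACHE=4096` is the value from this post, `SYS_BLOCK` is a variable introduced here for convenience, and the loop simply skips arrays that have no stripe cache.

```shell
#!/bin/sh
# SYS_BLOCK and STRIPE_CACHE are illustrative defaults, not fixed names.
SYS_BLOCK=${SYS_BLOCK:-/sys/block}
STRIPE_CACHE=${STRIPE_CACHE:-4096}

# Write the chosen size to every md array that exposes stripe_cache_size.
apply_stripe_cache() {
    for knob in "$SYS_BLOCK"/md*/md/stripe_cache_size; do
        # Skips raid0/raid1 arrays, an unmatched glob, and non-root runs.
        [ -w "$knob" ] || continue
        echo "$STRIPE_CACHE" > "$knob"
    done
}

apply_stripe_cache
```

Running late in boot should be fine for this knob, since it can be changed while the array is in use.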
|
|
 |