Gentoo Forums
mdadm software raid 5 extreme slow write speed on harddisks.
linux_os2
Apprentice


Joined: 29 Aug 2018
Posts: 223
Location: Zedelgem Belgium

Posted: Sun Mar 10, 2024 3:13 pm    Post subject: mdadm software raid 5 extreme slow write speed on harddisks.

Slow write speed on raid 5 (mdadm)

read test:
Code:
 dd if=/home/test of=/dev/null bs=10M count=1000
1000+0 records in
1000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 1.97621 s, 5.3 GB/s

write test:
Code:
dd of=/home/test if=/dev/zero bs=10M count=1000
1000+0 records in
1000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 34.5883 s, 303 MB/s
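
Note that both figures are flattered by the page cache: with 128 GB of RAM, a 10 GB test file fits in memory entirely. A variant that forces the data to the platters before dd reports a rate (a sketch, using the same test file as above):
Code:
# write: fsync the file before dd prints its figure
dd of=/home/test if=/dev/zero bs=10M count=1000 conv=fdatasync
# or bypass the page cache completely
dd of=/home/test if=/dev/zero bs=10M count=1000 oflag=direct
# read: drop the caches first so the blocks really come off the disks
echo 3 > /proc/sys/vm/drop_caches
dd if=/home/test of=/dev/null bs=10M count=1000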


setup:
    motherboard: ASUS Z10PE-D16 WS
    CPUs: 2 x Intel Xeon E5-2640 v4 (14 nm, 2.4-3.4 GHz, 25 MB cache, 10 cores each)
    memory: 128 GB
    harddisks: 6 x
    Code:
    === START OF INFORMATION SECTION ===
    Model Family:     Toshiba MG09ACA... Enterprise Capacity HDD
    Device Model:     TOSHIBA MG09ACA18TE
    Serial Number:    Y1L0A03RFJDH
    LU WWN Device Id: 5 000039 b48d9d5ec
    Firmware Version: 0104
    User Capacity:    18,000,207,937,536 bytes [18.0 TB]
    Sector Sizes:     512 bytes logical, 4096 bytes physical
    Rotation Rate:    7200 rpm
    Form Factor:      3.5 inches
    Device is:        In smartctl database 7.3/5319
    ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
    SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is:    Sun Mar 10 14:49:51 2024 CET
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    kernel: vmlinuz-6.6.13-gentoo-x86_64


Code:
cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md124 : active raid1 sdk[2] sdj[1]
      9766304768 blocks super 1.2 [2/2] [UU]
      bitmap: 0/73 pages [0KB], 65536KB chunk

md125 : active raid5 sdi1[6] sdh1[4] sdc1[2] sdb1[1] sda1[0] sdg1[3]
      87890972160 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 5/131 pages [20KB], 65536KB chunk

md126 : active raid1 sde[1] sdd[0]
      975585280 blocks super external:/md127/0 [2/2] [UU]
     
md127 : inactive sde[1](S) sdd[0](S)
      2354608 blocks super external:ddf
       
unused devices: <none>

md125 is the raid 5 array

During the test, one thread sits at 100% for about 12 seconds; then CPU usage drops to one or two percent above normal.
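
That busy thread is most likely the md raid5 kernel worker computing parity. One knob often suggested for raid5/6 write speed is the stripe cache (a sketch, not a guaranteed fix):
Code:
# current stripe cache size in pages per device; the default is 256
cat /sys/block/md125/md/stripe_cache_size
# a larger cache costs roughly size x 4 KiB x number-of-disks of RAM
echo 8192 > /sys/block/md125/md/stripe_cache_size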

dumpe2fs /dev/md125:
Code:
Filesystem volume name:   <none>
Last mounted on:          /home
Filesystem UUID:          05eeba2c-9ab4-4bb2-93ca-85c29bc9d852
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1373296640
Block count:              21972743040
Reserved block count:     1098637152
Overhead clusters:        87727706
Free blocks:              20163879481
Free inodes:              1372448771
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         2048
Inode blocks per group:   128
RAID stride:              128
RAID stripe width:        640
Flex block group size:    16
Filesystem created:       Tue Jul 19 00:11:28 2022
Last mount time:          Sun Mar 10 11:52:29 2024
Last write time:          Sun Mar 10 14:59:26 2024
Mount count:              17
Maximum mount count:      200
Last checked:             Sun Mar  3 13:37:20 2024
Check interval:           2592000 (1 month)
Next check after:         Tue Apr  2 14:37:20 2024
Lifetime writes:          20 TB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
First orphan inode:       1008698824
Default directory hash:   half_md4
Directory Hash Seed:      ea36baf3-7a46-4859-9868-5c8345c490d4
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0x28defea2
Journal features:         journal_incompat_revoke journal_64bit journal_checksum_v3
Total journal size:       1024M
Total journal blocks:     262144
Max transaction length:   262144
Fast commit length:       0
Journal sequence:         0x0030792f
Journal start:            175397
Journal checksum type:    crc32c
Journal checksum:         0xc51a21c5
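
The ext4 geometry above already matches the array: RAID stride 128 is the 512k chunk divided by the 4096-byte block size, and stripe width 640 is stride times the five data disks (six drives minus one parity). For reference, the matching creation flags would have been something like this (shown only to verify the arithmetic; not to be run on a live filesystem):
Code:
# stride       = chunk / block size        = 512 KiB / 4 KiB = 128
# stripe-width = stride * (disks - parity) = 128 * (6 - 1)   = 640
mkfs.ext4 -E stride=128,stripe-width=640 /dev/md125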


----------------------------------------

The figures for md124, a RAID 1 on two 10 TB drives (Western Digital Gold WDC WD101KRYZ-01JPDB1), each connected via USB 3 in its own enclosure:
read:
Code:
dd if=/mntbackup/backup_partition/test of=/dev/null bs=10M count=1000
1000+0 records in
1000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 2.22996 s, 4.7 GB/s

write:
Code:
dd of=/mntbackup/backup_partition/test if=/dev/zero bs=10M count=1000
1000+0 records in
1000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 9.73622 s, 1.1 GB/s


Can better performance be expected?
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54254
Location: 56N 3W

Posted: Sun Mar 10, 2024 3:59 pm

linux_os2,

dd is a horrible speed test.

With zoned rotating-rust HDDs, the speed near the spindle is about 1/3 of the speed at the edge: there are fewer sectors per track near the spindle, but the platter still rotates at 7,200 RPM.
Code:
dd if=/home/test of=/dev/null bs=10M count=1000
1000+0 records in
1000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 1.97621 s, 5.3 GB/s
is too fast to be true, so it probably isn't.
You might get 180MB/sec sustained read speed near the edge of the platter and 40MB/sec near the spindle. That's the head/platter data rate limit for one drive. Caching in the drive and in the kernel can make it appear much faster, as transactions are reported complete once the data is in the cache.
RAID should make it faster. Both RAID1 and RAID5.

303 MB/s is not too shabby, depending on where on the drive surface it's being written.
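
If you want numbers the cache can't flatter, something like fio does a better job than dd (a minimal sketch, assuming fio is installed; /home/fiotest is just a scratch file):
Code:
# sequential write with O_DIRECT so the page cache is out of the picture
fio --name=seqwrite --filename=/home/fiotest --rw=write \
    --bs=1M --size=10G --direct=1 --ioengine=libaio --numjobs=1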
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
linux_os2
Apprentice


Joined: 29 Aug 2018
Posts: 223
Location: Zedelgem Belgium

Posted: Sun Mar 10, 2024 6:38 pm

Thanks Neddy,
I was expecting an answer from you...
So the backup and restore of the huge raid will take some time.
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54254
Location: 56N 3W

Posted: Sun Mar 10, 2024 7:28 pm

linux_os2,

Yep. My one 18TB drive takes 36h for the long test, and you have 6 :)

What does smartctl -a ... say about the polling time for the long test?
That's how long Toshiba thinks a full surface scan will take.
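
One way to pull out just those lines (a sketch, assuming the first array member is /dev/sda):
Code:
# -c prints only the capabilities section, which holds the polling times
smartctl -c /dev/sda | grep -A1 'self-test routine'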

-- edit --

I would expect 36h (based on my drive) if you can work all 6 drives concurrently and keep the heads busy.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Sun Mar 10, 2024 7:57 pm

Yeah, my meager 2TB x 3 array takes forever to back up and ... resilver ... which is the danger of these huge disks, even if my 2T disks are "not" (large, that is).
Not sure when I'll have to move to RAID6: forget about speed, just worry about uptime even if a disk fails.

Not sure when I'll get 18T disks... but yeah, it's depressing that they only read at 180MB/sec. I'm stuck with 2T disks (mine do anywhere from 90MB/sec to 180MB/sec) at the moment, just because they were cheap, I don't have much of a data hoard, and I have a pile of them ready as replacements if one goes.
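
If the resync itself is ever the bottleneck, the md speed limits are worth a look (a sketch; the defaults shown are the usual kernel defaults):
Code:
# limits are in KiB/s per device; defaults are 1000 and 200000
sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
# let an otherwise idle box spend more bandwidth on the rebuild
sysctl -w dev.raid.speed_limit_min=50000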
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54254
Location: 56N 3W

Posted: Sun Mar 10, 2024 8:07 pm

eccerr0r,

A few years ago, the limit was about 6TB/drive for raid5, where recalculating the data for a failed drive from the remains of the raid5 was thought to be a bit iffy.
I only discovered that after setting up 4x8TB drives in raid5 ... Oops.

I know I should add another drive for raid6 ...
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
linux_os2
Apprentice


Joined: 29 Aug 2018
Posts: 223
Location: Zedelgem Belgium

Posted: Mon Mar 11, 2024 2:34 pm

I was able to do a backup to the USB RAID 1 on the WD Gold 10TBs.
I will get another six drives of 18 or 20 TB when I win EuroMillions.

here is the log:
Code:
--------------------------------------
2024-03-10 20:29:36 backup of the partition starts
--------------------------------------
Opening User Interface mode.
Partclone v0.3.27 http://partclone.org
Starting to clone device (/dev/md125) to image (-)
Reading Super Block
memory needed: 2748690036 bytes
bitmap 2746592880 bytes, blocks 2*1048576 bytes, checksum 4 bytes
Calculating bitmap... Please wait...
Total Time: 00:12:21, Ave. Rate:   0.00byte/min, 100.00% completed!
done!
File system:  EXTFS
Device size:   90.0 TB = 21972743040 Blocks
Space in use:   7.4 TB = 1808859127 Blocks
Free Space:    82.6 TB = 20163883913 Blocks
Block size:   4096 Byte
Total block 21972743040
Total Time: 18:04:06, Ave. Rate:   6.83GB/min, 100.00% completed!
Syncing... OK!
Partclone successfully cloned the device (/dev/md125) to the image (-)
--------------------------------------
2024-03-11 14:51:56 backup of the partition was succesfull
--------------------------------------
--------------------------------------
2024-03-11 14:51:56 Wrote fdisk of the drive
--------------------------------------


Not so bad, I think.

My backup script pipes the output from partclone to zstd; it runs under LFS.
The 7.4 TB compressed down to 6.9 TB.
The compression ratio is so low because the partition contains mostly FLACs plus the backups of the other partitions, 5.4 TB together.
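
Roughly, the pipeline presumably looks like this (a sketch; the output filename is hypothetical):
Code:
# clone only the used blocks, stream to stdout, compress on all cores
partclone.extfs -c -s /dev/md125 -o - | zstd -T0 > /mntbackup/backup_partition/home.img.zst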

Neddy, tests with smartctl will be done later, when the system can be spared for some time (and when there is enough sun; my system consumes about 250 watts, so 36 hours means 9 kWh) :)

Marc.
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54254
Location: 56N 3W

Posted: Mon Mar 11, 2024 2:49 pm

linux_os2,

I just intended you to read the smartctl output. For an 8TB drive I get:

Code:
# smartctl -a /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.7.6-gentoo] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba N300/MN NAS HDD
Device Model:     TOSHIBA HDWG480
Serial Number:    71R0A0NWFA3H
LU WWN Device Id: 5 000039 b08e0f122
Firmware Version: 0601
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5577
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Mar 11 14:41:25 2024 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
               was completed without error.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (  120) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   2) minutes.
Extended self-test routine
recommended polling time:     ( 690) minutes.


The
Code:
Extended self-test routine
recommended polling time:     ( 690) minutes
is what I wanted to see, and reading it only takes seconds :)
That's the time it takes the drive to do a surface scan with no data passing over the external data interface.

Your 6.83GB/min is about 114MB/sec, which is not too bad.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
linux_os2
Apprentice


Joined: 29 Aug 2018
Posts: 223
Location: Zedelgem Belgium

Posted: Mon Mar 11, 2024 3:32 pm

Here it is:
Code:
# smartctl -a /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.13-gentoo-x86_64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba MG09ACA... Enterprise Capacity HDD
Device Model:     TOSHIBA MG09ACA18TE
Serial Number:    Y1A0A031FJDH
LU WWN Device Id: 5 000039 b48d08a0c
Firmware Version: 0104
User Capacity:    18,000,207,937,536 bytes [18.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Mar 11 16:28:08 2024 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)   Offline data collection activity
               was suspended by an interrupting command from host.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (  120) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   2) minutes.
Extended self-test routine
recommended polling time:     (1484) minutes.
SCT capabilities:           (0x003d)   SCT Status supported.
               SCT Error Recovery Control supported.
               SCT Feature Control supported.
               SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       8765
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       2673
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       3945
 10 Spin_Retry_Count        0x0033   100   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       2551
 23 Helium_Condition_Lower  0x0023   100   100   075    Pre-fail  Always       -       0
 24 Helium_Condition_Upper  0x0023   100   100   075    Pre-fail  Always       -       0
 27 MAMR_Health_Monitor     0x0023   100   100   030    Pre-fail  Always       -       331287
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       21
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       282
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       3084
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       27 (Min/Max 13/33)
196 Reallocated_Event_Count 0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       69337098
222 Loaded_Hours            0x0032   091   091   000    Old_age   Always       -       3760
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       679
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       12189486532
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       91336331000

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more
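
Since the self-test log is empty: starting the extended test and checking on it later would look something like this (a sketch; the drive stays usable while the test runs, just slower):
Code:
# start the ~25 h extended self-test; it runs inside the drive
smartctl -t long /dev/sda
# later: progress and results
smartctl -l selftest /dev/sda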
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Mon Mar 11, 2024 3:52 pm

BTW, I use rsync to back up, and risk a bit of bit rot (though scrubbing the arrays helps).

It was kind of funny: I chanced upon some 10Gb Ethernet cards for cheap and was thinking I should start upgrading to 10GbE, but with the HDD bottleneck it probably doesn't make a whole lot of sense... On random reads even my RAID5 can't saturate 1GbE, and of course the single-disk machines (other than the machines with SSDs) can't saturate 1GbE either.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54254
Location: 56N 3W

Posted: Mon Mar 11, 2024 3:56 pm

linux_os2,

Code:
Extended self-test routine
recommended polling time:     (1484) minutes.


That's just under 25 hours for a surface scan. You won't be able to read the entire drive any faster than that.
The good news is that raid sets are accessed in parallel.

That's an average speed of 18,000,207,937,536 bytes / 1484 minutes, or about 202MB/sec.
The data sheet says

Quote:
Data Transfer Speed
(Sustained)(Typ.) 268MiB/s

which hints at reading data from several heads concurrently. I've not seen that since magnetic drum storage.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
linux_os2
Apprentice


Joined: 29 Aug 2018
Posts: 223
Location: Zedelgem Belgium

Posted: Mon Mar 11, 2024 4:06 pm

Speaking of drum storage, remember the days when we painted magnetic material back on after a crash?
I also remember magnetic ring storage; it was nonvolatile too.
We had units of 16KB in the 360-115, wow.
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54254
Location: 56N 3W

Posted: Mon Mar 11, 2024 4:26 pm

linux_os2,

I know
Code:
magnetic ring storage
as core store.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.