Gentoo Forums » Board index » Assistance » Kernel & Hardware
How come EXT4 slows my ssd so much?

22 posts • Page 1 of 1
RayDude
Advocate
Posts: 2195
Joined: Sat May 29, 2004 6:11 am
Location: San Jose, CA

How come EXT4 slows my ssd so much?

Post by RayDude » Thu May 30, 2019 4:44 pm

Code:

server /mnt/backup/root # hdparm -tT /dev/nvme0n1

/dev/nvme0n1:
 Timing cached reads:   22260 MB in  2.00 seconds = 11144.34 MB/sec
 Timing buffered disk reads: 8146 MB in  3.00 seconds = 2715.05 MB/sec
server /mnt/backup/root # hdparm -tT /dev/nvme0n1p4

/dev/nvme0n1p4:
 Timing cached reads:   20114 MB in  2.00 seconds = 10068.90 MB/sec
 Timing buffered disk reads: 3356 MB in  3.00 seconds = 1118.66 MB/sec
This bugs me. I don't really notice the performance difference in practice, but it seems wrong for ext4 to create such an incredible overhead.

Is this normal? Is this expected?
Some day there will only be free software.
Ant P.
Watchman
Posts: 6920
Joined: Sat Apr 18, 2009 7:18 pm

Post by Ant P. » Thu May 30, 2019 4:46 pm

Is the partition correctly aligned?
tmcca
Tux's lil' helper
Posts: 120
Joined: Fri May 24, 2019 11:30 pm

Post by tmcca » Thu May 30, 2019 5:30 pm

I was going to say the same thing: make sure it is aligned. Also, use fstrim instead of the discard mount option on root; you can use discard on /boot, I think that is the correct approach.

How did you partition the drive? Did you use parted?
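For reference, the distinction here is between continuous TRIM (the discard mount option in /etc/fstab) and periodic TRIM (running fstrim on a schedule). A minimal sketch, with hypothetical device names and mount points:

```shell
# /etc/fstab: continuous TRIM via the discard mount option
# (device names here are illustrative)
/dev/nvme0n1p4   /    ext4   noatime,discard   0 1

# Alternative: periodic TRIM instead of discard, e.g. a weekly cron entry:
# 0 3 * * 6  /sbin/fstrim /
```

Either approach keeps the SSD's free-block accounting in sync; neither affects read benchmarks, as noted further down in the thread.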
mike155
Advocate
Posts: 4438
Joined: Fri Sep 17, 2010 11:33 pm
Location: Frankfurt, Germany

Post by mike155 » Thu May 30, 2019 5:33 pm

Is ext4's lazy inode table zeroing still running? See: 'man mkfs.ext4', option 'lazy_itable_init'.
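If lazy initialization does turn out to be the cause, the zeroing can be done up front at mkfs time instead of in the background. A hedged sketch (these are real mkfs.ext4 extended options, but re-creating the filesystem destroys its contents, so this is for illustration only):

```shell
# Zero the inode tables and journal immediately at mkfs time instead of
# lazily in the background after the first mount.
# WARNING: mkfs destroys existing data on the target partition.
mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/nvme0n1p4
```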
NeddySeagoon
Administrator
Posts: 56085
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Thu May 30, 2019 6:23 pm

RayDude,

Code:

# hdparm -tT /dev/nvme0n1
does raw sequential reads from the block device.
The contents of the blocks read are ignored. That is, the read speed returned by

Code:

hdparm -tT
does not depend on the filesystem, if any.
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
RayDude
Advocate
Posts: 2195
Joined: Sat May 29, 2004 6:11 am
Location: San Jose, CA

Post by RayDude » Thu May 30, 2019 8:30 pm

Thanks for the quick replies.

I used gparted to partition the disk so the alignment should be correct.

I'll put fstrim on root and see if that makes a difference.

I'll check the lazy itable feature as well.
Some day there will only be free software.
NeddySeagoon
Administrator
Posts: 56085
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Thu May 30, 2019 8:51 pm

RayDude,

fstrim is about erasing used but free space in good time before you want to reuse it.
It will make no difference to the read speed.

Boot from a liveCD and rerun the tests when you are sure the partitions are not in use.
Don't even mount them.
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
RayDude
Advocate
Posts: 2195
Joined: Sat May 29, 2004 6:11 am
Location: San Jose, CA

Post by RayDude » Thu May 30, 2019 9:42 pm

NeddySeagoon wrote:RayDude,

fstrim is about erasing used but free space in good time before you want to reuse it.
It will make no difference to the read speed.

Boot from a liveCD and rerun the tests when you are sure the partitions are not in use.
Don't even mount them.
Thanks Neddy, I'll try that.
Some day there will only be free software.
Naib
Watchman
Posts: 6101
Joined: Fri May 21, 2004 9:42 pm
Location: Removed by Neddy

Post by Naib » Fri May 31, 2019 1:13 pm

What is the IO scheduler being used?
#define HelloWorld int
#define Int main()
#define Return printf
#define Print return
#include <stdio>
HelloWorld Int {
Return("Hello, world!\n");
Print 0;
Zucca
Administrator
Posts: 4692
Joined: Thu Jun 14, 2007 10:31 pm
Location: Rasi, Finland

Post by Zucca » Fri May 31, 2019 4:16 pm

If you want to test filesystem performance, use some other tool, like fio for example.

As Neddy said, hdparm "skips" the filesystem. With hdparm you can test disk performance or (apparently) partition performance. As to why the partition performance is that much slower on an SSD, I have no clue. It would make sense if it were an HDD you were testing...
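A minimal fio job file for a sequential read test through the mounted filesystem might look like the following. The job parameters and the test directory are illustrative, not tuned:

```shell
# seqread.fio: sequential 1 GiB read against a file on the mounted filesystem.
# Run with: fio seqread.fio
# "directory" is an assumption; point it at a path on the partition under test.
[seqread]
rw=read
bs=1M
size=1g
directory=/mnt/test
ioengine=libaio
direct=1
```

Unlike hdparm, this actually exercises the filesystem's read path, so it measures what the ext4 layer adds (or doesn't).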

Maybe it's about the IO scheduler as Naib was questioning.

I want to see how this ends up...
..: Zucca :..

Code:

init=/sbin/openrc-init
-systemd -logind -elogind seatd
I am NaN! I am a man!
Naib
Watchman
Posts: 6101
Joined: Fri May 21, 2004 9:42 pm
Location: Removed by Neddy

Post by Naib » Fri May 31, 2019 4:20 pm

Also note that hdparm expects PATA/SATA-type devices; NVMe is not that, so it might mis-report. nvme-tools provides a means to do block reads.
Pearlseattle
Apprentice
Posts: 165
Joined: Thu Oct 04, 2007 11:07 am
Location: Switzerland

Re: How come EXT4 slows my ssd so much?

Post by Pearlseattle » Fri May 31, 2019 10:44 pm

RayDude wrote:

Code:

server /mnt/backup/root # hdparm -tT /dev/nvme0n1

/dev/nvme0n1:
 Timing cached reads:   22260 MB in  2.00 seconds = 11144.34 MB/sec
 Timing buffered disk reads: 8146 MB in  3.00 seconds = 2715.05 MB/sec
server /mnt/backup/root # hdparm -tT /dev/nvme0n1p4

/dev/nvme0n1p4:
 Timing cached reads:   20114 MB in  2.00 seconds = 10068.90 MB/sec
 Timing buffered disk reads: 3356 MB in  3.00 seconds = 1118.66 MB/sec
This bugs me. I don't really notice the performance difference in practice, but it seems wrong for ext4 to create such an incredible overhead.

Is this normal? Is this expected?
I thought that the tests done by hdparm did not involve the specific filesystem on the partition at all?
Hu
Administrator
Posts: 24389
Joined: Tue Mar 06, 2007 5:38 am

Post by Hu » Sat Jun 01, 2019 12:39 am

That is what NeddySeagoon and Zucca both said, yes. The hdparm tests should be usable even on a device with no filesystem at all.

RayDude: please post the actual alignment so we can review whether the alignment is correct. The smartctl -a output could also be interesting. Hide any identifying data (such as serial numbers). We only need general model information.
RayDude
Advocate
Posts: 2195
Joined: Sat May 29, 2004 6:11 am
Location: San Jose, CA

Post by RayDude » Sat Jun 01, 2019 12:42 am

Update: I ran hdparm from a system-restore boot flash drive on an unmounted /dev/nvme0n1p4 and got the same results.

Thanks for telling me about fio, I'll try it.

I just checked, and my kernel is configured for no IO scheduler. How is that possible?

There are three choices: mq-deadline, Kyber, and BFQ. Which should I select?

What does it use if none is selected? I seriously wonder how I did this...

Update: "none" is apparently good for NVMe: https://wiki.ubuntu.com/Kernel/Reference/IOSchedulers

Edit: since I'm also using a RAID6 array, it looks like I should use deadline...
Some day there will only be free software.
mike155
Advocate
Posts: 4438
Joined: Fri Sep 17, 2010 11:33 pm
Location: Frankfurt, Germany

Post by mike155 » Sat Jun 01, 2019 12:49 am

RayDude wrote:I just checked and my kernel is configured for no IO Scheduler. How is that possible?
"none" (aka "noop") is the correct scheduler to use for NVMe disks.

See: https://stackoverflow.com/questions/276 ... h-nvme-ssd
Pearlseattle
Apprentice
Posts: 165
Joined: Thu Oct 04, 2007 11:07 am
Location: Switzerland

Post by Pearlseattle » Sat Jun 01, 2019 9:36 pm

Edit: since I'm using a raid6 array, it looks like I should use deadline...
What do you mean, RayDude? I think the tests you previously posted were run directly against an NVMe device, not against a RAID array...
RayDude
Advocate
Posts: 2195
Joined: Sat May 29, 2004 6:11 am
Location: San Jose, CA

Post by RayDude » Sun Jun 02, 2019 5:03 pm

Pearlseattle wrote:
Edit: since I'm using a raid6 array, it looks like I should use deadline...
What do you mean, RayDude? I think the tests you previously posted were run directly against an NVMe device, not against a RAID array...
The system boots off an NVMe drive, but also has a RAID6 array. To optimize the kernel for both the NVMe and the RAID6 array, it's best for me to use the deadline I/O scheduler: deadline doesn't slow the NVMe down much, but it improves the performance of the HD array.
Some day there will only be free software.
RayDude
Advocate
Posts: 2195
Joined: Sat May 29, 2004 6:11 am
Location: San Jose, CA

Post by RayDude » Sun Jun 02, 2019 5:09 pm

Here's the partition table, according to parted:

Code:

server ~ # parted /dev/nvme0n1
GNU Parted 3.2
Using /dev/nvme0n1
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p                                                                
Model: Unknown (unknown)
Disk /dev/nvme0n1: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system     Name      Flags
 1      1049kB  3146kB  2097kB                  BIOSBOOT  bios_grub
 2      3146kB  213MB   210MB   fat16           EFI       msftdata
 3      213MB   8803MB  8590MB  linux-swap(v1)  SWAP
 4      8803MB  1000GB  991GB   ext4            SERVER
Here's smartctl -a:

Code:

server ~ # smartctl -a /dev/nvme0n1
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-5.1.5-gentoo] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       CT1000P1SSD8
Serial Number:                      XXXXXXXXXXX
Firmware Version:                   P3CR010
PCI Vendor/Subsystem ID:            0xc0a9
IEEE OUI Identifier:                0x000000
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,000,204,886,016 [1.00 TB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Sun Jun  2 10:06:43 2019 PDT
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0016):   Format Frmw_DL Self_Test
Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        5       5
 1 +     4.60W       -        -    1  1  1  1       30      30
 2 +     3.80W       -        -    2  2  2  2       30      30
 3 -   0.0500W       -        -    3  3  3  3     1000    1000
 4 -   0.0040W       -        -    4  4  4  4     6000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        40 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    2,925,784 [1.49 TB]
Data Units Written:                 3,735,578 [1.91 TB]
Host Read Commands:                 16,841,519
Host Write Commands:                25,212,969
Controller Busy Time:               844
Power Cycles:                       12
Power On Hours:                     198
Unsafe Shutdowns:                   2
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               41 Celsius
Temperature Sensor 2:               39 Celsius
Temperature Sensor 5:               59 Celsius

Error Information (NVMe Log 0x01, max 256 entries)
No Errors Logged
Thanks for your help, everyone!
Some day there will only be free software.
molletts
Tux's lil' helper
Posts: 137
Joined: Sat Feb 16, 2013 12:08 pm

Post by molletts » Mon Jun 03, 2019 12:58 pm

RayDude wrote: The system boots off an NVMe drive, but also has a RAID6 array. To optimize the kernel for both the NVMe and the RAID6 array, it's best for me to use the deadline I/O scheduler: deadline doesn't slow the NVMe down much, but it improves the performance of the HD array.
You can use different schedulers on different devices if you like.

Put a line like this into /etc/udev/rules.d/10-ioscheduler.rules:

Code:

ACTION=="add|change", KERNEL=="nvme*", ATTR{queue/scheduler}="none"
and the system should automatically use the "none" (noop) scheduler for all NVMe devices and whatever you select as the default scheduler (e.g. deadline) for all other devices.

You can check which is being used for each device with something like:

Code:

cat /sys/block/nvme0n1/queue/scheduler
substituting the device name as appropriate. It will show a list of available schedulers with the selected one bracketed.

(If you want to try out different schedulers, you can also echo the name of a scheduler that is available in your kernel to the file to change it on the fly.)

Hope this helps,
Stephen
Anon-E-moose
Watchman
Posts: 6566
Joined: Fri May 23, 2008 7:31 pm
Location: Dallas area

Post by Anon-E-moose » Mon Jun 03, 2019 3:59 pm

hdparm works on devices, not partitions, and (I don't think) arrays.

If you want filesystem performance, then something like iozone would be more what you need.

Edit to add: I'm not sure why there's a performance difference in your first post. It should make no difference whether you point to the whole device or a partition of it; it still uses the whole device, because it talks to the controller (if I'm not mistaken).

https://ssd.userbenchmark.com/SpeedTest ... 1000P1SSD8

If running in an NVMe/PCIe Gen3 x4 slot, the device is supposed to hit ~2000 MB/s for reads and ~1700 MB/s for writes.
If it's not a Gen3 slot it will be slower, especially if that slot is shared with other cards, which is common on many motherboards.
UM780 xtx, 6.18 zen kernel, gcc 15, openrc, wayland
minixforum m1-s1 max -- same software as above but used for ai learning


Zealots are gonna be zealots, just like haters are gonna be haters
Hu
Administrator
Posts: 24389
Joined: Tue Mar 06, 2007 5:38 am

Post by Hu » Tue Jun 04, 2019 1:03 am

Please post the partition table without rounding; sgdisk --print can do this. The parted output does not make it clear whether the partitions are aligned to any of the commonly important boundaries.
RayDude
Advocate
Posts: 2195
Joined: Sat May 29, 2004 6:11 am
Location: San Jose, CA

Post by RayDude » Sat Jun 08, 2019 3:51 pm

Hu wrote: Please post the partition table without rounding; sgdisk --print can do this. The parted output does not make it clear whether the partitions are aligned to any of the commonly important boundaries.
Update: found it:

Code:

server ~ # sgdisk --print /dev/nvme0n1
Disk /dev/nvme0n1: 1953525168 sectors, 931.5 GiB
Model: CT1000P1SSD8                            
Sector size (logical/physical): 512/512 bytes
Disk identifier (GUID): 1A547616-F8A0-485F-B15F-B6723E76FF7C
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 3437 sectors (1.7 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            6143   2.0 MiB     EF02  BIOSBOOT
   2            6144          415743   200.0 MiB   0700  EFI
   3          415744        17192959   8.0 GiB     8200  SWAP
   4        17192960      1953523711   923.3 GiB   8300  SERVER


I can't find sgdisk...

How about this output from fdisk:

Code:

server ~ # fdisk /dev/nvme0n1

Welcome to fdisk (util-linux 2.33.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): p
Disk /dev/nvme0n1: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: CT1000P1SSD8                            
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 1A547616-F8A0-485F-B15F-B6723E76FF7C

Device            Start        End    Sectors   Size Type
/dev/nvme0n1p1     2048       6143       4096     2M BIOS boot
/dev/nvme0n1p2     6144     415743     409600   200M Microsoft basic data
/dev/nvme0n1p3   415744   17192959   16777216     8G Linux swap
/dev/nvme0n1p4 17192960 1953523711 1936330752 923.3G Linux filesystem
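For what it's worth, the start sectors in the fdisk output above can be checked for 2048-sector (1 MiB) alignment with plain shell arithmetic; a remainder of zero means the partition is aligned:

```shell
# Check each partition start sector (taken from the fdisk output above)
# for 2048-sector (1 MiB) alignment: remainder 0 means aligned.
for start in 2048 6144 415744 17192960; do
    if [ $(( start % 2048 )) -eq 0 ]; then
        echo "sector $start: aligned"
    else
        echo "sector $start: NOT aligned"
    fi
done
```

All four starts divide evenly by 2048, so alignment is not the cause of the slowdown here.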
Some day there will only be free software.