Gentoo Forums
Gentoo RAM performance very low compared to Ubuntu
tnt
Veteran
Joined: 27 Feb 2004
Posts: 1222

Posted: Wed Mar 27, 2024 9:06 pm    Post subject: Gentoo RAM performance very low compared to Ubuntu

While plotting Chia compressed plots using MadMax's GigaHorse plotter, I noticed a huge performance gap between my custom Gentoo setup and an out-of-the-box Ubuntu 23.10.
Both tests were run on the same hardware:
Asus H570 board
Intel i5-11500
4x 32GB DDR4 3600MHz running at 3200MHz
Sabrent 1TB PCIe 4.0 NVMe
GeForce RTX 3060

The difference in time per plotting phase was most obvious in the RAM-bound phases, 1 and 3. Phase 2 should be NVMe-bound, but it also took a performance hit.

Booted into Gentoo running kernel 6.6.21, I got the following results:
Code:
[P1] Setup took 0.323 sec
[P1] Table 1 took 5.627 sec, 4294967296 entries, 16787746 max, 66754 tmp, 0 GB/s up, 5.86462 GB/s down
[P1] Table 2 took 9.856 sec, 4294880415 entries, 16788826 max, 66552 tmp, 3.24675 GB/s up, 5.44087 GB/s down
[P1] Table 3 took 16.615 sec, 4294750448 entries, 16788375 max, 66615 tmp, 3.12964 GB/s up, 5.46195 GB/s down
[P1] Table 4 took 23.009 sec, 4294484364 entries, 16786628 max, 66626 tmp, 3.8244 GB/s up, 5.01979 GB/s down
[P1] Table 5 took 25.984 sec, 4293956223 entries, 16785222 max, 66726 tmp, 4.30986 GB/s up, 4.44505 GB/s down
[P1] Table 6 took 20.051 sec, 4292877198 entries, 16784067 max, 66568 tmp, 3.98889 GB/s up, 4.93742 GB/s down
[P1] Table 7 took 13.704 sec, 4290648161 entries, 16769769 max, 66575 tmp, 4.6679 GB/s up, 3.31109 GB/s down
Phase 1 took 115.563 sec
[P2] Setup took 0.103 sec
[P2] Table 7 took 8.163 sec, 3.91619 GB/s up, 0.0622128 GB/s down
[P2] Table 6 took 8.206 sec, 3.89769 GB/s up, 0.0618868 GB/s down
Phase 2 took 16.575 sec
[P3] Setup took 0.196 sec
[P3] Table 5 LPSK took 9.859 sec, 3531613004 entries, 13869026 max, 55054 tmp, 3.29651 GB/s up, 4.60242 GB/s down
[P3] Table 5 NSK took 11.031 sec, 3531613004 entries, 13813151 max, 55054 tmp, 3.27983 GB/s up, 5.34285 GB/s down
[P3] Table 6 PDSK took 9.624 sec, 3711285104 entries, 14525420 max, 57641 tmp, 3.37617 GB/s up, 4.7148 GB/s down
[P3] Table 6 LPSK took 13.024 sec, 3711285104 entries, 15084740 max, 60282 tmp, 4.93957 GB/s up, 3.80069 GB/s down
[P3] Table 6 NSK took 11.398 sec, 3711285104 entries, 14513425 max, 59903 tmp, 3.63896 GB/s up, 5.17082 GB/s down
[P3] Table 7 PDSK took 10.379 sec, 4290648161 entries, 16790096 max, 66575 tmp, 4.23507 GB/s up, 4.37183 GB/s down
[P3] Table 7 LPSK took 14.483 sec, 4290648161 entries, 17195398 max, 68992 tmp, 4.94421 GB/s up, 3.735 GB/s down
[P3] Table 7 NSK took 12.304 sec, 4290648161 entries, 16769769 max, 68255 tmp, 3.89725 GB/s up, 5.03897 GB/s down
Phase 3 took 92.423 sec


Booted into Ubuntu running kernel 6.5.0, the results were:
Code:
[P1] Setup took 0.368 sec
[P1] Table 1 took 2.68 sec, 4294967296 entries, 16787300 max, 66613 tmp, 0 GB/s up, 12.3135 GB/s down
[P1] Table 2 took 7.636 sec, 4294886151 entries, 16787262 max, 66647 tmp, 4.19068 GB/s up, 7.02269 GB/s down
[P1] Table 3 took 10.1 sec, 4294702536 entries, 16787988 max, 66709 tmp, 5.14842 GB/s up, 8.98517 GB/s down
[P1] Table 4 took 11.571 sec, 4294354262 entries, 16785013 max, 66581 tmp, 7.60475 GB/s up, 9.98187 GB/s down
[P1] Table 5 took 15.38 sec, 4293648928 entries, 16784858 max, 66567 tmp, 7.28115 GB/s up, 7.50977 GB/s down
[P1] Table 6 took 11.766 sec, 4292330925 entries, 16776858 max, 66581 tmp, 6.79717 GB/s up, 8.4141 GB/s down
[P1] Table 7 took 12.129 sec, 4289584295 entries, 16767908 max, 66574 tmp, 5.27337 GB/s up, 3.74105 GB/s down
Phase 1 took 72.001 sec
[P2] Setup took 0.127 sec
[P2] Table 7 took 5.057 sec, 6.31993 GB/s up, 0.100424 GB/s down
[P2] Table 6 took 5.042 sec, 6.34279 GB/s up, 0.100723 GB/s down
Phase 2 took 10.279 sec
[P3] Setup took 0.203 sec
[P3] Table 5 LPSK took 5.252 sec, 3531259000 entries, 13877916 max, 55015 tmp, 6.18774 GB/s up, 8.63961 GB/s down
[P3] Table 5 NSK took 6.588 sec, 3531259000 entries, 13807884 max, 55015 tmp, 5.49122 GB/s up, 8.94611 GB/s down
[P3] Table 6 PDSK took 5.624 sec, 3710654034 entries, 14513293 max, 57605 tmp, 5.77671 GB/s up, 8.06814 GB/s down
[P3] Table 6 LPSK took 5.096 sec, 3710654034 entries, 15087294 max, 60459 tmp, 12.6224 GB/s up, 9.71355 GB/s down
[P3] Table 6 NSK took 6.18 sec, 3710654034 entries, 14506820 max, 59942 tmp, 6.71032 GB/s up, 9.53673 GB/s down
[P3] Table 7 PDSK took 6.352 sec, 4289584295 entries, 16778633 max, 66574 tmp, 6.91827 GB/s up, 7.14346 GB/s down
[P3] Table 7 LPSK took 5.723 sec, 4289584295 entries, 17198315 max, 68990 tmp, 12.5094 GB/s up, 9.45203 GB/s down
[P3] Table 7 NSK took 7.197 sec, 4289584295 entries, 16767908 max, 68324 tmp, 6.66109 GB/s up, 8.61463 GB/s down
Phase 3 took 48.345 sec


In practice, Gentoo was maxing out at a throughput of roughly 5 GB/s up and down, while Ubuntu was achieving up to 12 GB/s in both directions.
The cumulative difference shows in the number of seconds needed per phase:
Phase 1 - 115 vs 72 sec
Phase 3 - 92 vs 48 sec
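
For anyone who wants to repeat the comparison, the per-phase ratios and peak rates can be pulled straight from the plotter output. This is only a minimal Python sketch, assuming the log format shown above. The function names are made up for illustration, and the two sample strings are abbreviated excerpts of the logs in this post, not the full output.

```python
import re

# "Phase N took X sec" summary lines, anchored at the start of a line
PHASE_RE = re.compile(r"^Phase (\d+) took ([\d.]+) sec", re.MULTILINE)
# "<number> GB/s up" or "<number> GB/s down" anywhere in the log
RATE_RE = re.compile(r"([\d.]+) GB/s (up|down)")


def phase_times(log):
    """Return {phase number: seconds} from the 'Phase N took ...' lines."""
    return {int(n): float(t) for n, t in PHASE_RE.findall(log)}


def peak_rates(log):
    """Return the highest up and down transfer rates (GB/s) seen in the log."""
    peaks = {"up": 0.0, "down": 0.0}
    for value, direction in RATE_RE.findall(log):
        peaks[direction] = max(peaks[direction], float(value))
    return peaks


def slowdown(slow_log, fast_log):
    """Return {phase: slow time / fast time} for phases present in both logs."""
    slow, fast = phase_times(slow_log), phase_times(fast_log)
    return {n: slow[n] / fast[n] for n in slow if n in fast}


# Abbreviated excerpts of the two logs (peak lines and phase totals only)
GENTOO = """\
[P1] Table 1 took 5.627 sec, 4294967296 entries, 16787746 max, 66754 tmp, 0 GB/s up, 5.86462 GB/s down
[P3] Table 7 LPSK took 14.483 sec, 4290648161 entries, 17195398 max, 68992 tmp, 4.94421 GB/s up, 3.735 GB/s down
Phase 1 took 115.563 sec
Phase 2 took 16.575 sec
Phase 3 took 92.423 sec
"""

UBUNTU = """\
[P1] Table 1 took 2.68 sec, 4294967296 entries, 16787300 max, 66613 tmp, 0 GB/s up, 12.3135 GB/s down
[P3] Table 6 LPSK took 5.096 sec, 3710654034 entries, 15087294 max, 60459 tmp, 12.6224 GB/s up, 9.71355 GB/s down
Phase 1 took 72.001 sec
Phase 2 took 10.279 sec
Phase 3 took 48.345 sec
"""

if __name__ == "__main__":
    for phase, ratio in sorted(slowdown(GENTOO, UBUNTU).items()):
        print(f"Phase {phase}: Gentoo took {ratio:.2f}x as long as Ubuntu")
    print("Gentoo peaks:", peak_rates(GENTOO))
    print("Ubuntu peaks:", peak_rates(UBUNTU))
```

With the numbers from the two logs, Gentoo needs roughly 1.6 times as long for phases 1 and 2 and about 1.9 times as long for phase 3.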

So my guess is that I have heavily misconfigured the Gentoo kernel, as I have no idea what else could have such a heavy impact on performance.
My kernel config is here:
https://pastebin.com/PpDgbtDn

Does anybody have an idea which kernel options I should investigate to get performance comparable to Ubuntu's?
Thx.
_________________
gentoo user
CooSee
Veteran
Joined: 20 Nov 2004
Posts: 1441
Location: Earth

Posted: Wed Mar 27, 2024 9:48 pm

you compared 2 different kernel versions ?!

install the same kernel on ubuntu and test again.

version 6.5 is from september last year, and there have been many changes since then - security patches etc.

now, i'm curious.

8)
_________________
" Reality is an illusion that arises from a lack of honest communication "
---
" Man is curious by nature; what remains in the end is greed "


Last edited by CooSee on Wed Mar 27, 2024 9:51 pm; edited 1 time in total
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54253
Location: 56N 3W

Posted: Wed Mar 27, 2024 9:50 pm

tnt,

Your
Code:
emerge --info
and /proc/cpuinfo may be more relevant than your kernel.
That's two pastebins please.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
pietinger
Moderator
Joined: 17 Oct 2006
Posts: 4160
Location: Bavaria

Posted: Wed Mar 27, 2024 11:03 pm

tnt,

you have a really fine kernel .config ... HUGEPAGE, CONFIG_LRU_GEN_ENABLED=y and all the Accelerated Cryptographic Algorithms you use enabled statically ... I have seldom seen such a good configuration. IMHO there are only three options that could be set better:
Code:
1)
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
2)
# CONFIG_MCORE2 is not set
3)
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL=y

1) Setting this to None gives you higher latencies but better throughput.
2) You have an Intel i5-11500, so MCORE2 is the minimum you should select.
3) Setting this to "Performance" gives you a little more speed (at the cost of higher power consumption).

You have enabled some security mitigations ... but I think Ubuntu has them enabled as well ... so there should be no difference ... but you could gain some performance here ... at the price of security ... your decision.

Code:
CONFIG_SPECULATION_MITIGATIONS=y
CONFIG_PAGE_TABLE_ISOLATION=y
CONFIG_RETPOLINE=y
CONFIG_RETHUNK=y
CONFIG_CALL_DEPTH_TRACKING=y
CONFIG_CPU_IBRS_ENTRY=y
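
Whether Ubuntu really runs with the same mitigations can be checked at runtime instead of guessed from the config. The following is a sketch in Python, assuming the standard sysfs directory /sys/devices/system/cpu/vulnerabilities is present on both systems. The helper names are illustrative only and not part of any existing tool.

```python
from pathlib import Path

# Standard sysfs location where the kernel reports per-vulnerability status
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigations(vuln_dir=VULN_DIR):
    """Return {vulnerability name: status line}, or {} if the directory is missing."""
    if not vuln_dir.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(vuln_dir.iterdir())
        if entry.is_file()
    }


def diff_status(first, second):
    """Return {name: (status in first, status in second)} for entries that differ."""
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in names
        if first.get(name) != second.get(name)
    }


if __name__ == "__main__":
    # Run this once under each OS and compare the two listings
    for name, status in read_mitigations().items():
        print(f"{name}: {status}")
```

Run it once under Gentoo and once under Ubuntu and compare the two listings (or save both as dictionaries and pass them to diff_status). If the lists match, the mitigations are unlikely to explain the gap. Booting once with mitigations=off on the kernel command line would be another way to test this, for a benchmark run only.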


BTW:

Code:
4)
# CONFIG_X86_INTEL_LPSS is not set
5)
CONFIG_X86_INTEL_TSX_MODE_AUTO=y
6)
# CONFIG_MFD_INTEL_LPSS_PCI is not set
7)
# CONFIG_SOUND is not set
8)
# CONFIG_INTEL_IDMA64 is not set
CONFIG_INTEL_IOATDMA=y
9)
# CONFIG_X86_PLATFORM_DEVICES is not set
10)
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=m

4) Some Intel systems want this - EVEN if it is not a notebook. Enabling it costs nothing.
5) I would switch this off.
6) You could enable it as a module and check whether it gets loaded (needs 4).
7) No sound ... that is rare ... I see you know what you are doing ;-)
8) Are you sure you need this one and not IDMA64? If so, all is fine.
9) Because this is disabled you have no Intel PMC Core driver, but I guess you don't need it (*)
10) You can disable EXT2 because it can be handled by EXT4 as well (EXT2 gets deprecated with 6.9 and will be removed later).

*) What I always recommend - and maybe you already did - is to check all loaded modules with "lsmod" (yes, under Ubuntu as well) to see which of them are actually loaded and in use.
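
The lsmod comparison can be done by hand, but a few lines of Python make the differences easier to spot. This is a rough sketch that assumes the usual lsmod layout (a "Module Size Used by" header followed by one module per line). The function names and the two tiny sample listings are hypothetical and only there to show the idea.

```python
def module_names(lsmod_output):
    """Return the set of module names found in the text printed by lsmod."""
    names = set()
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def compare_modules(gentoo_lsmod, ubuntu_lsmod):
    """Return (modules only on Ubuntu, modules only on Gentoo), both sorted."""
    gentoo = module_names(gentoo_lsmod)
    ubuntu = module_names(ubuntu_lsmod)
    return sorted(ubuntu - gentoo), sorted(gentoo - ubuntu)


# Hypothetical sample listings, NOT taken from the systems in this thread
SAMPLE_GENTOO = """\
Module                  Size  Used by
nvidia              1234567  0
"""

SAMPLE_UBUNTU = """\
Module                  Size  Used by
nvidia              1234567  0
intel_lpss_pci        16384  0
"""

if __name__ == "__main__":
    only_ubuntu, only_gentoo = compare_modules(SAMPLE_GENTOO, SAMPLE_UBUNTU)
    print("Loaded only on Ubuntu:", only_ubuntu)
    print("Loaded only on Gentoo:", only_gentoo)
```

Keep in mind that drivers built statically into the kernel (=y) never show up in lsmod, so a module that is "missing" on Gentoo may simply be built in.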


P.S.: You might also check with "cpuid | grep x2APIC" whether your CPU supports it, and if so, enable it as well.
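
If the cpuid tool is not installed, the same information should also be visible in the flags line of /proc/cpuinfo. Here is a small Python sketch, assuming an x86 system where that line lists the feature as "x2apic". The function names are made up for this example.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_x2apic(cpuinfo_text):
    """True if the 'x2apic' flag is advertised in the given cpuinfo text."""
    return "x2apic" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.is_file():
        print("x2apic supported:", has_x2apic(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found")
```

If the flag is there, the matching kernel option is CONFIG_X86_X2APIC.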
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
tnt
Veteran
Joined: 27 Feb 2004
Posts: 1222

Posted: Thu Mar 28, 2024 9:28 am

@CooSee
Yes, different kernels, but they are just one version apart, and I really do not expect 6.6.x to contain enough regressions to cause this kind of performance difference.
I opted for 6.6 on Gentoo as it is the 24th LTS release and should be supported longer than a regular 6.5 and other non-LTS versions.

@NeddySeagoon
Here's my emerge --info
https://pastebin.com/pWRwkXgJ

and here's my /proc/cpuinfo
https://pastebin.com/rBCqSjxe

I just realized I have an i5-11400 and not an 11500, but the difference between those two is mainly in clock speeds.

@pietinger
Thx for the kind words. I'm no kernel expert and my config has slowly evolved over the years, but I try to be reasonable with it.
I will read up on your suggestions and try them. Let's hope that makes a difference.
_________________
gentoo user
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54253
Location: 56N 3W

Posted: Thu Mar 28, 2024 7:36 pm

tnt,

Code:
CFLAGS="-march=native -O2 -pipe"

CXXFLAGS="-march=native -O2 -pipe"

CPU_FLAGS_X86="aes avx avx2 avx512_bitalg avx512_vbmi2 avx512_vnni avx512_vpopcntdq avx512bw avx512cd avx512dq avx512f avx512ifma avx512vbmi avx512vl f16c fma3 mmx mmxext pclmul popcnt rdrand sha sse sse2 sse3 sse4_1 sse4_2 ssse3 vpclmulqdq"


Are all the right settings.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
e8root
n00b
Joined: 09 Feb 2024
Posts: 71

Posted: Thu Mar 28, 2024 9:57 pm

Try emerging the default kernel - read: without any custom changes.
Then, if you really need to make changes, make one change at a time and note the results.
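
To see which options differ between two kernels in the first place, the two config files can be compared option by option. This is a minimal Python sketch, assuming plain-text kernel configs in the usual "CONFIG_X=value" / "# CONFIG_X is not set" format. The function names are illustrative, and the file locations mentioned afterwards are typical defaults that may differ on your system.

```python
import re
import sys
from pathlib import Path

# Matches the "# CONFIG_FOO is not set" form used for disabled options
NOT_SET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = NOT_SET_RE.match(line)
        if match:
            options[match.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, _, value = line.partition("=")
            options[name] = value
    return options


def diff_kconfig(first, second):
    """Return {option: (value in first, value in second)} where they differ.

    An option missing from one file is treated as 'n' (disabled).
    """
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name, "n"), second.get(name, "n"))
        for name in names
        if first.get(name, "n") != second.get(name, "n")
    }


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: kconfig_diff.py CONFIG_A CONFIG_B")
    config_a = parse_kconfig(Path(sys.argv[1]).read_text())
    config_b = parse_kconfig(Path(sys.argv[2]).read_text())
    for name, (a, b) in diff_kconfig(config_a, config_b).items():
        print(f"{name}: {a} -> {b}")
```

On Gentoo the config usually lives in /usr/src/linux/.config, and Ubuntu ships its config as /boot/config-<kernel release>. Expect a long list, since a distribution kernel enables far more drivers, so it helps to filter the output for scheduler, CPU frequency and memory-related options first.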

Generally, such huge performance differences indicate that the CPU frequency might not be correct.
Is this performance difference only visible in memory bandwidth, or in general CPU performance as well?
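
A quick way to check the frequency side is to read the cpufreq values from sysfs while the plotter is running. This is a Python sketch, assuming the standard cpufreq sysfs layout under /sys/devices/system/cpu. The function name is invented for the example, and some systems may not expose every file.

```python
from pathlib import Path

# Standard sysfs root for per-CPU information
CPU_ROOT = Path("/sys/devices/system/cpu")
FIELDS = ("scaling_governor", "scaling_cur_freq", "scaling_max_freq")


def read_cpufreq(cpu_root=CPU_ROOT):
    """Return {cpu name: {field: value}} for every CPU with a cpufreq directory."""
    report = {}
    for policy_dir in sorted(cpu_root.glob("cpu[0-9]*/cpufreq")):
        values = {}
        for field in FIELDS:
            path = policy_dir / field
            if path.is_file():
                values[field] = path.read_text().strip()
        report[policy_dir.parent.name] = values
    return report


if __name__ == "__main__":
    # Frequencies are reported by the kernel in kHz
    for cpu, values in read_cpufreq().items():
        print(cpu, values)
```

Run it under load on both systems. If Gentoo reports a much lower scaling_cur_freq than Ubuntu, or a different governor, that points at frequency scaling and not at memory settings.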
_________________
Unix Wars - Episode V: AT&T Strikes Back
All times are GMT