Previous references on this forum:
- viewtopic-t-1140476-start-0.html "[SOLVED] qtwebengine fails to build" (2021)
- viewtopic-t-1165113-start-0.html "badmem static allocation of memory hole" (2023)
I had compiler segfaults when building big packages such as dev-qt/qtwebengine-6.8.2 and other Qt or KDE applications. With --keep-going in the options, emerge would carry on to the end; when I relaunched it, the failed packages would often build successfully, though very large packages like qtwebengine or chromium would fail again.
Memtest86+
I installed sys-apps/memtest86+-7.20 and rebooted into it. Memtest86+ runs 10 different tests (e.g. simple read/write, block move, modulo 20). On my machine it ran at about 1 GB/min, meaning roughly 2 hours for the 128 GB I have.
At any time during the test, press <F1><F4><F4> to switch to badram mode, whose output is the most usable for this solution. Instead of listing individual failed bytes (which could be thousands), memtest86+ tries to summarize them into ranges using a mask. The output is limited to 10 lines.
Here is the output in my case, copied from a mobile phone picture:
Code:
badram=0x00000010168001b8,0xfffffffffffdb8,
0x0000001016800538,0xfffffffffffd38,
0x00000010168009f8,0xfffffffffffff8,
0x0000001016801178,0xfffffffffff578,
0x00000010168040b8,0xfffffffffff3f8,
0x0000001016804638,0xfffffffffffff8,
0x0000001016804cb8,0xfffffffffffdb8,
0x0000001016805038,0xfffffffffff978,
0x00000011af842960,0xfffffffffffff8,
0x00000011af847ea0,0xfffffffffffff8,
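It can be seen that the addresses are very close together. A mask of 0xffffffffffff0000 ignores the low 16 bits of an address, so each address/mask pair covers 64 KiB, and all ten entries can be summarized by two 64 KiB blocks:
Code: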
0x0000001016800000,0xffffffffffff0000
0x00000011af840000,0xffffffffffff0000
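The grouping above is a bitwise AND of each address with the 64 KiB mask. Here is a minimal bash sketch of that step, assuming 64-bit shell arithmetic. The names block_base, LOW_BITS, addresses and blocks are only illustrative; they are not part of memtest86+ or GRUB.

```shell
# Low 16 bits to ignore: 0xffff + 1 = 65536 bytes = 64 KiB per block.
# ~LOW_BITS is the mask 0xffffffffffff0000 used in the ranges above.
LOW_BITS=0xffff

# Print the 64 KiB-aligned base address of one faulty address.
block_base() {
    printf '0x%016x\n' "$(( $1 & ~LOW_BITS ))"
}

# Addresses copied from the memtest86+ badram output above.
addresses=(
    0x00000010168001b8 0x0000001016800538 0x00000010168009f8
    0x0000001016801178 0x00000010168040b8 0x0000001016804638
    0x0000001016804cb8 0x0000001016805038
    0x00000011af842960 0x00000011af847ea0
)

# Map every address to its block and keep each block only once.
blocks=$(for a in "${addresses[@]}"; do block_base "$a"; done | sort -u)
echo "$blocks"
```

With the ten addresses from my output, this prints the two base addresses 0x0000001016800000 and 0x00000011af840000. A different block size only needs a different LOW_BITS value (for example 0xfff for 4 KiB).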
GRUB2
We add the ranges to /etc/default/grub, then regenerate the GRUB configuration:
Code:
# /etc/default/grub
GRUB_BADRAM=0x0000001016800000,0xffffffffffff0000,0x00000011af840000,0xffffffffffff0000
# then regenerate the configuration (as root)
grub-mkconfig -o /boot/grub/grub.cfg
The generated /boot/grub/grub.cfg then contains a badram command at the end of the 00_header section:
Code:
…
play 60 800 1
badram 0x0000001016800000,0xffffffffffff0000,0x00000011af840000,0xffffffffffff0000
### END /etc/grub.d/00_header ###
### BEGIN /etc/grub.d/10_linux ###
menuentry 'Gentoo GNU/Linux, with Linux 6.13.2-gentoo-x86_64' --class gentoo --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-6.13.2-gentoo-x86_64-advanced-36f0c021-f626-4ac4-b1f1-b3dc23d8f85d' {
The badram parameter is not a command-line parameter of the Linux kernel; it is a command executed by GRUB itself. Nothing special is visible in the GRUB menu, even when using "e" to edit an entry.
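Since nothing shows up in the menu, one way to confirm that grub-mkconfig picked up the setting is to search the generated file for the badram command. This is a small sketch; the function name grub_badram_lines is made up for this post, and the path in the usage comment assumes the default location used above.

```shell
# Print every "badram 0x..." command found in the GRUB configuration
# file given as $1. The exit status is non-zero when none is found.
grub_badram_lines() {
    grep -E '^[[:space:]]*badram[[:space:]]+0x' "$1"
}

# Usage (as root, if the file is not world-readable):
#   grub_badram_lines /boot/grub/grub.cfg
```

If the function prints nothing, the GRUB_BADRAM line in /etc/default/grub was probably not taken into account, or grub-mkconfig wrote to a different file.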
After reboot, we check that the kernel has excluded the ranges. We grep for "11af", which is part of the second address (we could also check the other range, which contains "10168").
Code:
dmesg | grep 11af # to be adapted to each address range
[ 0.000000] BIOS-e820: [mem 0x0000001016810000-0x00000011af83ffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000011af840000-0x00000011af84ffff] unusable
[ 0.000000] BIOS-e820: [mem 0x00000011af850000-0x000000201f2fffff] usable
[ 0.000000] reserve setup_data: [mem 0x0000001016810000-0x00000011af83ffff] usable
[ 0.000000] reserve setup_data: [mem 0x00000011af840000-0x00000011af84ffff] unusable
[ 0.000000] reserve setup_data: [mem 0x00000011af850000-0x000000201f2fffff] usable
[ 0.167868] node 0: [mem 0x0000001016810000-0x00000011af83ffff]
[ 0.167869] node 0: [mem 0x00000011af850000-0x000000201f2fffff]
[ 0.173324] PM: hibernation: Registered nosave memory: [mem 0x11af840000-0x11af84ffff]
[ 0.679391] e820: reserve RAM buffer [mem 0x11af840000-0x11afffffff]
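Instead of reading the ranges by eye, the check can be scripted. The bash sketch below reads kernel log lines on standard input and tests whether a given address falls inside a BIOS-e820 range marked unusable. The function name in_unusable and the sample_log variable are only illustrative, and the parsing assumes the log format shown above.

```shell
# Succeed if address $1 lies inside a BIOS-e820 range marked "unusable".
# The kernel log is read from standard input.
in_unusable() {
    local addr=$(( $1 )) ranges start end
    # Keep only "start end" of the e820 ranges flagged unusable.
    ranges=$(sed -n 's/.*BIOS-e820: \[mem \(0x[0-9a-f]*\)-\(0x[0-9a-f]*\)] unusable.*/\1 \2/p')
    while read -r start end; do
        [ -n "$start" ] || continue
        if (( addr >= start && addr <= end )); then
            return 0
        fi
    done <<< "$ranges"
    return 1
}

# Two lines taken from the dmesg output above, used as sample input.
sample_log='[ 0.000000] BIOS-e820: [mem 0x0000001016810000-0x00000011af83ffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000011af840000-0x00000011af84ffff] unusable'

if in_unusable 0x00000011af842960 <<< "$sample_log"; then
    echo "0x00000011af842960 is excluded"
else
    echo "0x00000011af842960 is NOT excluded"
fi
```

On a live system the same function can be fed from the real log, for example `dmesg | in_unusable 0x00000011af842960`, once for each address reported by memtest86+.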
Another, or complementary, solution is to ask the kernel to run a memory test at boot, using the memtest kernel parameter.
According to the kernel documentation (https://www.kernel.org/doc/html/latest/ ... eters.html):
"Memtest fills the memory with this pattern, validates memory contents and reserves bad memory regions that are detected."
It will write different patterns and read them back, finding memory elements which are not functional and reserving them.
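The parameter can be added for a single boot by editing the kernel line in GRUB. It can also be made permanent in the GRUB configuration /etc/default/grub; memtest=4 requests four passes, each with a different pattern:
Code:
# /etc/default/grub (append memtest=4 to any options already present)
GRUB_CMDLINE_LINUX_DEFAULT="memtest=4"
# then regenerate the configuration (as root)
grub-mkconfig -o /boot/grub/grub.cfg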
After boot is complete, the result can be seen within the first lines of dmesg:
Code:
dmesg | less
[ 0.000000] early_memtest: # of tests: 4
[ 0.000000] 0x0000000000100000 - 0x0000000001000000 pattern aaaaaaaaaaaaaaaa
...
[ 0.000000] 0x0000000100000000 - 0x0000001016800000 pattern 5555555555555555
[ 0.000000] 5555555555555555 bad mem addr 0x00000001d386fee8 - 0x00000001d386fef0 reserved
[ 0.000000] 0x0000001016810000 - 0x00000011af840000 pattern 5555555555555555
...
The user can then exclude a block of any chosen size around the reported address in the GRUB configuration; for example, to exclude the 64 KiB block containing the bad bytes: GRUB_BADRAM=0x00000001d3860000,0xffffffffffff0000
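The conversion from the kernel's report to a GRUB_BADRAM value can be sketched in bash as follows. The function name badram_from_memtest is made up for this post. It assumes the "bad mem addr" log format shown above and only looks at the start address of each reported range, so a bad range that crosses a 64 KiB boundary would need a second entry added by hand.

```shell
# Read kernel log lines on standard input and print a comma-separated
# list of "base,mask" pairs, one 64 KiB block per reported bad address.
badram_from_memtest() {
    sed -n 's/.*bad mem addr \(0x[0-9a-f]*\) - .*/\1/p' |
    while read -r addr; do
        # Clear the low 16 bits to get the 64 KiB-aligned base.
        printf '0x%016x,0xffffffffffff0000\n' "$(( addr & ~0xffff ))"
    done | sort -u | paste -sd, -
}

# The line reported by early_memtest above, used as sample input.
sample='[ 0.000000] 5555555555555555 bad mem addr 0x00000001d386fee8 - 0x00000001d386fef0 reserved'
echo "GRUB_BADRAM=$(badram_from_memtest <<< "$sample")"
```

For the sample line this prints GRUB_BADRAM=0x00000001d3860000,0xffffffffffff0000. On a live system the input would come from `dmesg | badram_from_memtest`, and the result can be appended to any ranges already present in GRUB_BADRAM.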


