Author |
Message |
truekaiser l33t
Joined: 05 Mar 2004 Posts: 801
|
Posted: Sun Apr 02, 2017 9:55 pm Post subject: Easy way to tell if a seg fault is from ram? |
|
|
My new build is being stubborn. I have been trying to rebuild @world now that I am on the actual hardware and not the live CD.
I am getting some semi-uncommon segmentation faults, and I am trying to determine whether the cause is the RAM or building with -j16.
They look something like this:
Code: | sh[800]: segfault at 32 ip 0000000000419be sp 00007ffe6d3b49f0 error 6 in bash[400000+a6000] |
The part after bash is the same in the latest two segfaults since this reboot.
These are two sticks of Corsair Vengeance RAM set to 16-18-18-36 but running at 2133 rather than the 2666 they are spec'ed at (because Ryzen is being a bitch), at what I think is their rated 1.20 V.
Just trying to determine whether this is a RAM or a software issue. I have already spent more money on this build than I wanted to. |
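For anyone reading a line like the one quoted above, here is a minimal Python sketch that splits it into fields and decodes the x86 page-fault error code. The regular expression and the function names `parse_segfault` and `decode_error` are illustrative assumptions, not part of any existing tool. The bit meanings follow the x86 page-fault error code (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode).

```python
import re

# Assumed layout of the x86 "segfault at ..." kernel log line, as quoted in the post.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_error(code):
    """Translate the low bits of the x86 page-fault error code into words."""
    flags = [
        "protection violation" if code & 0x1 else "page not present",
        "write access" if code & 0x2 else "read access",
        "user mode" if code & 0x4 else "kernel mode",
    ]
    if code & 0x8:
        flags.append("reserved bit set in page table")
    if code & 0x10:
        flags.append("instruction fetch")
    return flags


def parse_segfault(line):
    """Return a dict of decoded fields, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),     # faulting address (hex in the log)
        "error": decode_error(int(m.group("err"), 16)),
        "object": m.group("obj"),             # mapped file, if the kernel printed one
    }


if __name__ == "__main__":
    line = ("sh[800]: segfault at 32 ip 0000000000419be sp 00007ffe6d3b49f0 "
            "error 6 in bash[400000+a6000]")
    print(parse_segfault(line))
```

For the quoted line this reads as a user-mode write to an unmapped address very close to zero. That pattern alone does not separate bad RAM from a software bug; what helps more is whether the same fields repeat across crashes or change from one crash to the next.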
|
|
|
Naib Watchman
Joined: 21 May 2004 Posts: 6051 Location: Removed by Neddy
|
Posted: Sun Apr 02, 2017 10:12 pm Post subject: |
|
|
Have you checked the voltage?
I ran into an annoying issue when I did a Core2 build years ago. It turned out the mobo was conservatively setting the voltage on the low side, while the GeIL sticks I was using needed a bit more.
I found one of my threads from around that time: https://forums.gentoo.org/viewtopic-t-515645-highlight-ram.html
In that case the RAM clock was also causing issues, BUT I swear voltage was involved as well... |
|
|
Naib Watchman
Joined: 21 May 2004 Posts: 6051 Location: Removed by Neddy
|
Posted: Sun Apr 02, 2017 11:08 pm Post subject: |
|
|
Sort of...
The CAS timings are what need to be honoured, otherwise the data coming out of the DRAM becomes ambiguous. If you drop the operating frequency, the CAS settings (which are counted in clock cycles) need to be juggled to keep the same absolute timings. |
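To make the cycles-versus-time point concrete, here is a small Python sketch of the arithmetic. It assumes the usual DDR convention that the advertised speed (2133, 2666) is a transfer rate in MT/s and the memory clock is half of that; the function names are made up for illustration.

```python
import math


def cycle_time_ns(data_rate_mts):
    """Clock period in ns. DDR moves two transfers per clock, so clock MHz = rate / 2."""
    return 2000.0 / data_rate_mts


def latency_ns(cycles, data_rate_mts):
    """Absolute latency for a timing given in clock cycles (e.g. CL16)."""
    return cycles * cycle_time_ns(data_rate_mts)


def cycles_for(latency, data_rate_mts):
    """Smallest whole number of cycles that still covers `latency` ns."""
    # The small epsilon guards against float noise pushing an exact value up a cycle.
    return math.ceil(latency / cycle_time_ns(data_rate_mts) - 1e-9)


if __name__ == "__main__":
    rated = latency_ns(16, 2666)        # CL16 at the rated 2666
    print(f"CL16 @ 2666: {rated:.2f} ns")
    print(f"CL16 @ 2133: {latency_ns(16, 2133):.2f} ns")
    print(f"cycles needed @ 2133 for the rated latency: {cycles_for(rated, 2133)}")
```

The same cycle count means a longer absolute time at a lower clock, so a profile written for one frequency cannot simply be copied to another.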
|
|
truekaiser l33t
Joined: 05 Mar 2004 Posts: 801
|
Posted: Mon Apr 03, 2017 3:02 am Post subject: |
|
|
I forgot to mention that those are the default specs of the sticks.
But the ASRock Taichi board wouldn't POST unless I let it auto-detect them at 15-15-15 or so at 2133; I had to risk bricking the board by updating the BIOS to get the XMP profile to work.
I am currently testing by compiling KDE (I want to try it out again). If the compile segfaults again as in my first post, I am just going to get another set of RAM.
I already had to drop 220 to get a replacement board. |
|
|
|
Syl20 l33t
Joined: 04 Aug 2005 Posts: 619 Location: France
|
Posted: Mon Apr 03, 2017 12:38 pm Post subject: Re: Easy way to tell if a seg fault is from ram? |
|
|
truekaiser wrote: | Just trying to determine if this is a ram or software issue. I already spent more money on this build than i wanted to. |
You could run memtest86+ for several hours. |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54220 Location: 56N 3W
|
Posted: Mon Apr 03, 2017 3:28 pm Post subject: |
|
|
Syl20,
memtest86+ exercises the RAM, CPU and several voltage regulators.
Problems reported by memtest86+ are not always RAM. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
truekaiser l33t
Joined: 05 Mar 2004 Posts: 801
|
Posted: Tue Apr 04, 2017 1:37 am Post subject: |
|
|
Well, the sticks passed a couple of full passes of memtest86 on the SystemRescueCd, which has the latest version.
I am thinking this may be because I foolishly set the number of jobs in make.conf to 16. I think I may be hitting some bad race conditions because of this, but the RAM is still suspect. I will see whether a full system re-emerge draws out the issue. |
|
|
|
Syl20 l33t
Joined: 04 Aug 2005 Posts: 619 Location: France
|
Posted: Tue Apr 04, 2017 8:26 am Post subject: |
|
|
NeddySeagoon wrote: | memtest86+ exercises the RAM, CPU and several voltage regulators.
Problems reported by memtest86+ are not always RAM. |
Indeed. But if memtest86+ reports no errors, we can suppose the problem isn't hardware-related. |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54220 Location: 56N 3W
|
Posted: Tue Apr 04, 2017 9:23 am Post subject: |
|
|
Syl20,
memtest86+ hardly exercises the CPU at all, therefore there is no stress on the motherboard Vcore regulator either.
It does do a good job of pushing the RAM subsystem fairly hard.
Maybe Prime95 or cpuburn would find something?
PSU issues normally arise around transient regulation .. for example when the CPU goes from sleep to 95W in 300ps.
I've not seen this for a long time but I used to work around it in older systems by setting the performance CPU governor in the kernel and adding nohalt to the kernel command line, so the CPU ran flat out all the time and the transient regulation demand on the PSU was much reduced.
nohalt is no longer documented in /usr/src/linux/Documentation/admin-guide/kernel-parameters.txt for x86. |
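For anyone wanting to try the performance-governor workaround described above, a quick way to see what each core is currently using is to read the cpufreq sysfs files. This is only a sketch: it assumes the kernel exposes `cpuN/cpufreq/scaling_governor` under `/sys/devices/system/cpu`, and the helper names are made up for illustration.

```python
from pathlib import Path


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (cpu0, cpu1, ...) to its current cpufreq governor."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(sysfs_root).glob(pattern)):
        cpu = gov_file.parent.parent.name
        try:
            governors[cpu] = gov_file.read_text().strip()
        except OSError:
            # Offline CPUs or restricted permissions: skip rather than fail.
            continue
    return governors


def all_performance(governors):
    """True only if at least one CPU was found and every one uses 'performance'."""
    return bool(governors) and all(g == "performance" for g in governors.values())


if __name__ == "__main__":
    found = read_governors()
    for cpu, gov in found.items():
        print(f"{cpu}: {gov}")
    print("all cores on performance:", all_performance(found))
```

If nothing is printed per core, cpufreq is probably not enabled in the kernel, and the governor is not what is varying the load on the regulators.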
|
|
|
Roman_Gruber Advocate
Joined: 03 Oct 2006 Posts: 3846 Location: Austro Bavaria
|
Posted: Tue Apr 04, 2017 4:51 pm Post subject: |
|
|
Syl20 wrote: | NeddySeagoon wrote: | memtest86+ exercises the RAM, CPU and several voltage regulators.
Problems reported by memtest86+ are not always RAM. |
Indeed. But if memtest86+ reports no errors, we can suppose the problem isn't hardware-related. |
Nope.
Memtest does not cover all cases. For example, the row hammer method was only discovered later and then added. I am quite sure there are other cases which are not really covered.
Also, these days CPU microcode and other firmware (basically anything has hidden firmware, in silicon or updateable) can cause issues.
You can see on new processors how they patch for RAM issues and other issues.
It could be UEFI, microcode, firmware, hardware, software issues, hidden bugs...
Updating the BIOS may also fix issues sometimes... |
|