Gentoo Forums
Easy way to tell if a seg fault is from ram?
truekaiser
l33t

Joined: 05 Mar 2004
Posts: 801

Posted: Sun Apr 02, 2017 9:55 pm    Post subject: Easy way to tell if a seg fault is from ram?

My new build is being stubborn. I have been trying to rebuild @world now that I am on the actual hardware and not the live CD.
I am getting some fairly uncommon segmentation faults, and I am trying to determine whether the cause is the RAM or building with -j16.
They go something like this:
Code:
sh[800]: segfault at 32 ip 0000000000419be sp 00007ffe6d3b49f0 error 6 in bash[400000+a6000]

The part after bash is the same in the latest two faults since this reboot.
The RAM is two sticks of Corsair Vengeance set to 16 18 18 36, but running at 2133 rather than the 2666 they are spec'ed at (because Ryzen is being a bitch), at what I think is their rated 1.20 V.

Just trying to determine if this is a RAM or software issue. I already spent more money on this build than I wanted to.
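
For reading a line like the one quoted above, a small Python sketch can split out the fields and decode the error value. It assumes the x86 page-fault error-code bit layout (bit 0: protection violation rather than a missing page, bit 1: write, bit 2: user mode) and that every number in the line is hexadecimal; the function names are made up for illustration. It cannot tell bad RAM from a software bug on its own. It only makes faults easier to compare, for example to see whether they keep hitting the same offset in the same binary.

```python
import re

# Matches the kernel log format shown in the post above (x86).
# The trailing "in object[base+size]" part is optional.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\[\s]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+)"
    r" ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_error(code):
    """Decode the x86 page-fault error code bits (assumed layout)."""
    return {
        "protection_violation": bool(code & 1),  # False: page was not present
        "write": bool(code & 2),                 # False: the access was a read
        "user_mode": bool(code & 4),             # False: fault in kernel mode
        "reserved_bit": bool(code & 8),
        "instruction_fetch": bool(code & 16),
    }


def parse_segfault(line):
    """Return the parsed fields of a segfault log line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "sp": int(m.group("sp"), 16),
        "error": decode_error(int(m.group("err"), 16)),
        "object": m.group("obj"),
        "offset": None,
    }
    if m.group("obj") is not None:
        base = int(m.group("base"), 16)
        size = int(m.group("size"), 16)
        # Only report an offset when ip really lies inside the mapping.
        if base <= info["ip"] < base + size:
            info["offset"] = info["ip"] - base
    return info


LINE = ("sh[800]: segfault at 32 ip 0000000000419be sp 00007ffe6d3b49f0 "
        "error 6 in bash[400000+a6000]")
print(parse_segfault(LINE))
```

For the line in the post, this reports a user-mode write to an unmapped page at address 0x32. The ip value as posted falls outside the bash mapping, so it may have lost a digit when it was copied; the sketch leaves the offset as None rather than guessing.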
Naib
Watchman

Joined: 21 May 2004
Posts: 6051
Location: Removed by Neddy

Posted: Sun Apr 02, 2017 10:12 pm

Have you checked the voltage?

I ran into an annoying issue when I did a Core2 build years ago. It turned out the mobo was conservatively setting the voltage on the low side, while the GeIL sticks I was using needed a bit more.

I found one of my threads from around that time: https://forums.gentoo.org/viewtopic-t-515645-highlight-ram.html

In that case the RAM clock was also causing issues, BUT I swear voltage was involved as well...
_________________
Quote:
Removed by Chiitoo
Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Sun Apr 02, 2017 10:36 pm

I'm no hardware guru, but I believe that theoretically you can get RAM errors if you downclock it, because of less frequent refresh.
_________________
My Gentoo installation notes.
Please learn how to denote units correctly!
Naib
Watchman

Joined: 21 May 2004
Posts: 6051
Location: Removed by Neddy

Posted: Sun Apr 02, 2017 10:50 pm

Jaglover wrote:
I'm no hardware guru, but I believe that theoretically you can get RAM errors if you downclock it, because of less frequent refresh.
You are sort of right, but the key info is the timing and the voltage.
The OP has set it to: 16 18 18 36

Looking around for DDR4 2666 Corsair brings up: https://www.overclockers.co.uk/corsair-vengeance-lpx-16gb-2x8gb-ddr4-pc4-21300c16-2666mhz-dual-channel-kit-black-cmk16gx4m2a26-my-441-cs.html which has CAS timings of 16-18-18-35 and a voltage of 1.20-1.35 V.

I would be tempted to increase the voltage ever so slightly (1.22 V). Also check the mobo slots.
_________________
Quote:
Removed by Chiitoo
Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Sun Apr 02, 2017 11:02 pm

Those numbers are derived from the actual clock frequency, right? Meaning if you lower the clock you also slow down the refresh, and may theoretically have a bit flipped every now and then because the refresh comes too late.
_________________
My Gentoo installation notes.
Please learn how to denote units correctly!
Naib
Watchman

Joined: 21 May 2004
Posts: 6051
Location: Removed by Neddy

Posted: Sun Apr 02, 2017 11:08 pm

Sort of...
The CAS timings are what need to be honoured, otherwise what the DRAM returns becomes ambiguous. If you drop the operating frequency, then the CAS settings need to be juggled to preserve the same timings.
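
The timing numbers are counts of memory clock cycles, so the same numbers at a lower clock mean longer delays in nanoseconds. As a rough sketch of the juggling described above, the Python below converts cycles to nanoseconds and finds the smallest cycle count at a new speed that is at least as long as the original delay. It assumes the usual DDR convention that the I/O clock is half the data rate (2666 MT/s means a 1333 MHz clock), covers only the primary timings, and says nothing about refresh intervals or voltage; the function names are illustrative.

```python
def cycles_to_ns(cycles, rate_mts):
    """Absolute delay in nanoseconds for a timing given in clock cycles."""
    # DDR transfers twice per clock, so one clock period is 2000 / rate ns.
    return cycles * 2000.0 / rate_mts


def rescale_cycles(cycles, old_rate_mts, new_rate_mts):
    """Smallest cycle count at the new rate that is no shorter in time."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-cycles * new_rate_mts // old_rate_mts)


# Timings as set by the OP, taken here as the values meant for 2666.
rated = (16, 18, 18, 36)
at_2133 = tuple(rescale_cycles(t, 2666, 2133) for t in rated)

print([round(cycles_to_ns(t, 2666), 2) for t in rated])
print(at_2133)
```

With these inputs, 16-18-18-36 at 2666 works out to 13-15-15-29 at 2133 for roughly the same delays, so keeping 16-18-18-36 at 2133 gives longer (looser) delays than at the rated speed, not tighter ones.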
_________________
Quote:
Removed by Chiitoo
Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Sun Apr 02, 2017 11:13 pm

Of course, raising the voltage should help in a case like this: it takes longer for a 1 to drop below the threshold and become a 0 when the starting level is higher.
_________________
My Gentoo installation notes.
Please learn how to denote units correctly!
truekaiser
l33t

Joined: 05 Mar 2004
Posts: 801

Posted: Mon Apr 03, 2017 3:02 am

I forgot to mention that those are the default specs of the sticks.
But the ASRock Taichi board wouldn't POST unless I let it auto-detect the RAM at 15 15 15 or so at 2133; I had to risk bricking the board by updating the BIOS to get the XMP profile to work.
I am currently testing by compiling KDE (I want to try it out again). If the compile segfaults again as in my first post, I am just going to get another set of RAM.

I already had to drop 220 to get a replacement board...
Syl20
l33t

Joined: 04 Aug 2005
Posts: 619
Location: France

Posted: Mon Apr 03, 2017 12:38 pm    Post subject: Re: Easy way to tell if a seg fault is from ram?

truekaiser wrote:
Just trying to determine if this is a RAM or software issue. I already spent more money on this build than I wanted to.

You could run memtest86+ for several hours.
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54220
Location: 56N 3W

Posted: Mon Apr 03, 2017 3:28 pm

Syl20,

memtest86+ exercises the RAM, CPU and several voltage regulators.
Problems reported by memtest86+ are not always RAM.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
truekaiser
l33t

Joined: 05 Mar 2004
Posts: 801

Posted: Tue Apr 04, 2017 1:37 am

Well, the sticks passed a couple of full passes of memtest86 on the SystemRescueCd, which has the latest version.
I am thinking this may be because I foolishly set the number of jobs in make.conf to 16. I think I may be getting some bad race conditions because of this, but the RAM is still suspect. I will see if a full system re-emerge draws out the issue...
Syl20
l33t

Joined: 04 Aug 2005
Posts: 619
Location: France

Posted: Tue Apr 04, 2017 8:26 am

NeddySeagoon wrote:
memtest86+ exercises the RAM, CPU and several voltage regulators.
Problems reported by memtest86+ are not always RAM.

Indeed. But if memtest86+ reports no errors, we can suppose the problem isn't hardware-related.
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54220
Location: 56N 3W

Posted: Tue Apr 04, 2017 9:23 am

Syl20,

memtest86+ hardly exercises the CPU at all, therefore there is no stress on the motherboard Vcore regulator either.
It does do a good job of pushing the RAM subsystem fairly hard.
Maybe Prime95 or cpuburn would find something?

PSU issues normally arise around transient regulation .. for example when the CPU goes from sleep to 95W in 300ps.
I've not seen this for a long time but I used to work around it in older systems by setting the performance CPU governor in the kernel and adding nohalt to the kernel command line, so the CPU ran flat out all the time and the transient regulation demand on the PSU was much reduced.

nohalt is no longer documented in /usr/src/linux/Documentation/admin-guide/kernel-parameters.txt for x86.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Roman_Gruber
Advocate

Joined: 03 Oct 2006
Posts: 3846
Location: Austro Bavaria

Posted: Tue Apr 04, 2017 4:51 pm

Syl20 wrote:
NeddySeagoon wrote:
memtest86+ exercises the RAM, CPU and several voltage regulators.
Problems reported by memtest86+ are not always RAM.

Indeed. But if memtest86+ reports no errors, we can suppose the problem isn't hardware-related.


Nope.

Memtest does not cover all cases. For example, the row hammer method was only discovered later and then added. I am quite sure there are other cases which are not really covered.

Also, these days the CPU's microcode and other firmware (basically anything has hidden firmware, in silicon or updateable) can cause issues.

You can see on new processors how they patch for RAM issues and other issues.

It could be UEFI, microcode, firmware, hardware, or software issues, hidden bugs...

Updating the BIOS may also fix issues sometimes...
All times are GMT