Segfaults during compilation on AMD Ryzen.

NeddySeagoon · Posted: Fri Apr 28, 2017 10:04 am Post subject:

It would be interesting to see if this problem correlates to particular motherboards or even motherboard vendors.

I've not seen issues like this for a long time. K6-2, old P2 long time ...
That proved to be failing Vcore power supplies, which allowed the Vcore and RAM voltages to 'brown out' (go transiently out of spec) when the CPU switched from a low-power to a high-power state.
I didn't have the test equipment to make measurements to confirm that, but replacing all the capacitors in the motherboard regulator fixed the problem.

Designing for the required transient response in the Vcore regulator is difficult. The CPU can go from almost nothing to 100A in one clock cycle and the voltage must be held within a few millivolts.

Some correlation across motherboards or motherboard vendors could indicate that the Vcore regulators aren't quite up to the job.
There have been one or two reports that switching to the performance governor mitigated the issue.
That supports the above speculation, as the 'almost nothing' starts from a higher value, so the transient to full power step is smaller.

Note that while a positive correlation would be interesting, it does not establish cause and effect.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Naib · Posted: Fri Apr 28, 2017 12:11 pm Post subject:

Exactly. While correlation does not imply causation, statistical information that narrows down any common aggravating factors is of interest.

I had a RAM voltage issue when I built a Core2 system years ago (the thread is still in the amd64 section).

daemon32 · n00b Joined: 28 Apr 2017 Posts: 2

I tried what I had said in my previous post and turned off the 'OP Cache' setting in the EFI...
Then I ran `emerge mesa` in a loop that terminates on a non-zero exit status. It ran for two and a half hours without a single failure.
I then went back and turned the 'OP Cache' back on and the loop failed upon the first build.
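
For anyone wanting to repeat this kind of test, a loop like the one described could be sketched roughly as below. This is only an illustration, not the script actually used above: the function name `run_until_failure` and the `max_runs` parameter are made up for the example, and the only command taken from the post is `emerge mesa`.

```python
import subprocess


def run_until_failure(cmd, max_runs=None):
    """Run cmd repeatedly and stop at the first non-zero exit status.

    Returns (successful_runs, exit_status). exit_status is 0 when the
    loop stopped only because max_runs was reached.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        status = subprocess.run(cmd).returncode
        if status != 0:
            # First failing build: report how many clean runs preceded it.
            return runs, status
        runs += 1
    return runs, 0


# Hypothetical usage (needs root on a Gentoo box):
#   good_runs, status = run_until_failure(["emerge", "mesa"])
#   print(f"{good_runs} clean builds before exit status {status}")
```

Counting the clean runs before the first failure gives a rough figure that can be compared between BIOS settings, although a handful of runs is not much of a statistical sample.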

I really should've learned my lesson from the last time I was an early adopter

drizzt · Guru Joined: 21 Jul 2002 Posts: 428

NeddySeagoon · Posted: Sun May 07, 2017 10:32 am Post subject:

drizzt,

Try the performance governor. The high power level is the same, the CPU is running flat out, but the low power level is higher.
The power transients switching from one to the other are thus smaller.

I'm reluctant to suggest the powersave governor, as it runs the CPU at its lowest clock frequency, although the power transients will be smaller still.
Reducing the CPU clock like this brings in so many other variables too, so it's not worth the test.
Nobody buys a Ryzen expecting to run it at its minimum clock speed for its useful life.

drizzt · Guru Joined: 21 Jul 2002 Posts: 428

NeddySeagoon · Posted: Sun May 07, 2017 11:08 am Post subject:

drizzt,

Raising the core voltage might help but I'm very reluctant to suggest that.

If CPU Vcore brownouts are the issue, they can come from several sources: the Vcore regulator on the motherboard or the upstream 12 V supply that feeds it.
Changing Vcore may help if the problem is due to the Vcore regulator, but not if it's from further upstream.

It's all still speculation.

drizzt · Guru Joined: 21 Jul 2002 Posts: 428

Is it possible that the BIOS is blocking any power governor controls?
I tested with three different governors (ondemand, conservative, performance) and the output that atop gives me (avgf and avgscal) looks nearly identical over emerge time on every run.
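
One way to cross-check what atop reports is to read the kernel's cpufreq files directly. The sketch below assumes the standard sysfs layout under /sys/devices/system/cpu, where each cpuN/cpufreq directory has `scaling_governor` and `scaling_cur_freq` files; the function name `read_cpufreq` is made up for the example.

```python
from pathlib import Path


def read_cpufreq(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: (governor, current_kHz)} for each CPU that
    exposes a cpufreq directory under sysfs_root."""
    result = {}
    for cpufreq in sorted(Path(sysfs_root).glob("cpu[0-9]*/cpufreq")):
        governor = (cpufreq / "scaling_governor").read_text().strip()
        cur_khz = int((cpufreq / "scaling_cur_freq").read_text())
        result[cpufreq.parent.name] = (governor, cur_khz)
    return result


# Hypothetical usage while an emerge is running:
#   for cpu, (gov, khz) in read_cpufreq().items():
#       print(cpu, gov, khz // 1000, "MHz")
```

If this returns an empty dict, no cpufreq driver is exposing those files, in which case switching governors would not be expected to change anything.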
_________________
People don't have to earn my respect. I offer my respect to them, but be careful to lose my respect...

Tony0945 · Posted: Sun May 07, 2017 2:46 pm Post subject:

Is this your board? https://www.newegg.com/Product/Product.aspx?Item=N82E16813132965

I see it gets terrible reviews. The review comparing two Linices on the board was very interesting.

drizzt · Guru Joined: 21 Jul 2002 Posts: 428

Tony0945 · Posted: Sun May 07, 2017 3:56 pm Post subject:

Been thinking of this https://www.newegg.com/Product/Product.aspx?Item=9SIA2F85F29679&cm_re=b350_tomahawk-_-13-144-028-_-Product and its not-so-glitzy (but rarely available) cousin, https://www.newegg.com/Product/Product.aspx?Item=N82E16813144018

I usually buy Gigabyte, but it seems like MSI is more aggressive for Zen, at least in providing timely BIOS updates. Gigabyte emphasizes Windows-based tools that are useless on a Linux-only system. Also, they seem to emphasize their Intel products.

I have only had one Biostar board in my life but it was surprisingly reliable.

Naib · Posted: Sun May 07, 2017 4:37 pm Post subject:

drizzt · Guru Joined: 21 Jul 2002 Posts: 428

Naib · Posted: Wed May 10, 2017 3:31 pm Post subject:

There is a new wave of MSI BIOS updates:
MSI carbon: 7A32v15
- Improved memory compatibility.
- Fixed PCIe Hot-plug function issue.

Microcode is the same (0x0800111c)
AGESA version? ???
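
For comparing microcode revisions before and after a BIOS flash, something like the sketch below could be used. It assumes the `microcode` field that x86 Linux kernels print in /proc/cpuinfo; the function name `microcode_revisions` is made up for the example, and it says nothing about the AGESA version, which is not reported there.

```python
import re


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions (as ints) found in the
    'microcode' lines of /proc/cpuinfo-style text."""
    pattern = r"^microcode\s*:\s*(0x[0-9a-fA-F]+)"
    return {int(rev, 16) for rev in re.findall(pattern, cpuinfo_text, re.MULTILINE)}


# Hypothetical usage:
#   with open("/proc/cpuinfo") as f:
#       print([hex(r) for r in sorted(microcode_revisions(f.read()))])
```

Returning a set makes it obvious if different cores ever report different revisions.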

Slight update with respect to compilers:

drizzt · Guru Joined: 21 Jul 2002 Posts: 428

roarinelk · Guru Joined: 04 Mar 2004 Posts: 520

mblnx · n00b Joined: 04 Mar 2008 Posts: 16

asan · n00b Joined: 14 May 2017 Posts: 1

I have the exact same segfaults in sh (especially when building mesa with -j16) on my own system:
CROSSHAIR VI HERO, BIOS 1107 04/28/2017
AMD Ryzen 7 1800X

Everything stock, no overclocking.
Unfortunately I have not found any "OP Cache" option in the BIOS.