Gentoo Forums
What to do after removing bad ram?
treffer
Apprentice


Joined: 14 Dec 2004
Posts: 150

Posted: Fri Jan 24, 2014 10:37 pm    Post subject: What to do after removing bad ram?

Hi,

I recently noticed that my Gentoo laptop was extremely unstable (emerge would randomly fail to build a package, the Android source tree was nearly impossible to build, random graphics glitches under load and full system crashes)....

It turns out that one memory module had some problems. Just some cells out of 16GB of RAM, small enough to go undetected for at least a month.
I do not trust binaries/data produced on the system. (Obviously)
Is there a recommended way to gain at least some confidence about the state of my system?

I'm currently recompiling the kernel as bitflips and errors in that binary could kill hardware. I'm also thinking about a reemerge of @system. Anything else I could try? Apart from a heavy @world recompile (my @world is HUGE: kde, gnome, mate, all tex stuff, java + IDEs, ...)
_________________
root@localhost# whois POEM-RIPE55-SONG
root@localhost# : ( ) { : | : & } ; :
PaulBredbury
Watchman


Joined: 14 Jul 2005
Posts: 7310

Posted: Fri Jan 24, 2014 10:53 pm

Running memtest overnight can give some confidence.

It's not a comprehensive test, because other system components aren't being stressed at the same time.
treffer
Apprentice


Joined: 14 Dec 2004
Posts: 150

Posted: Fri Jan 24, 2014 11:35 pm

PaulBredbury wrote:
Running memtest overnight can give some confidence.


That's how I found the broken RAM. The problem is that I know it corrupted builds. What I don't know is whether a broken binary got through a build (e.g. a .o file was damaged within a large function and the resulting .so will crash any caller).
_________________
root@localhost# whois POEM-RIPE55-SONG
root@localhost# : ( ) { : | : & } ; :
Logicien
Veteran


Joined: 16 Sep 2005
Posts: 1555
Location: Montréal

Posted: Sat Jan 25, 2014 1:17 am

Now that you have removed the bad RAM, if you no longer see the behavior you described while it was in service
treffer wrote:
(emerge would randomly fail to build a package, the android source tree was near impossible to build, random graphics glitches under load and full system crashes)

that's a good sign that your emerge builds are sound. If you are not completely sure, you can use the emerge option
Code:
--emptytree (-e)
              Reinstalls  target  atoms  and  their  entire  deep dependency tree, as though no packages are currently installed. You should run this with --pretend first to make sure the
              result is what you expect.

But some doubt remains even with this option: if the tools used for the rebuild (gcc, glibc, ld and so on) are themselves broken, the resulting binaries can be broken too. So at this point I see no other solution than to use another Gentoo host to recompile your entire world, if possible, or to reinstall Gentoo from scratch.
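
As a rough sketch of how the --emptytree option quoted above might be used in practice: preview first, then rebuild. The function names and the EMERGE variable are made up for illustration (EMERGE only exists so the command lines can be printed instead of executed); the emerge flags themselves are standard Portage options. It assumes a root shell on the affected machine.

```shell
# Hypothetical override: set EMERGE=echo to print the command lines
# instead of running Portage.
EMERGE="${EMERGE:-emerge}"

# Show what an --emptytree rebuild of the target would reinstall.
# Nothing is built or changed by --pretend.
preview_emptytree() {
    $EMERGE --pretend --emptytree "${1:-@world}"
}

# Perform the rebuild. --keep-going lets emerge continue with the
# remaining packages when one of them fails to build.
run_emptytree() {
    $EMERGE --emptytree --keep-going "${1:-@world}"
}
```

Calling `preview_emptytree @system` and reading the list before `run_emptytree @system` keeps the first pass small. The caveat above still applies: a successful rebuild only tells you something if the compiler doing the rebuilding is itself intact.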
_________________
Paul
shazeal
Apprentice


Joined: 03 May 2006
Posts: 206
Location: New Zealand

Posted: Sat Jan 25, 2014 8:13 am

treffer wrote:
PaulBredbury wrote:
Running memtest overnight can give some confidence.


That's how I found the broken RAM. The problem is that I know it corrupted builds. What I don't know is whether a broken binary got through a build (e.g. a .o file was damaged within a large function and the resulting .so will crash any caller).


You can 'emerge -e world' as above. I had a similar issue some time ago. I ran emerge -e, but gcc and glibc were themselves corrupted, so things did not build correctly. I ended up just reinstalling the system using the old world file, as I didn't trust my backups either.
If a full -e run completes, your system should be fine.
_________________
CFLAGS="-OmgWTFR1CE --fun-lol-loops --march=asmx86go"
PaulBredbury
Watchman


Joined: 14 Jul 2005
Posts: 7310

Posted: Sat Jan 25, 2014 10:25 am

You should run memtest again, like I said, to gain confidence that your system is now reliable.

Then recompile everything, from the bottom up.
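
One possible reading of "from the bottom up", as a sketch only: rebuild the toolchain first, then @system, then @world. The package order, the function name and the EMERGE variable are assumptions for illustration and do not come from this thread; the package atoms and emerge flags are standard ones.

```shell
# Hypothetical override: set EMERGE=echo to print the command lines
# instead of running Portage.
EMERGE="${EMERGE:-emerge}"

rebuild_bottom_up() {
    # 1. Toolchain first, so later steps are built with fresh tools.
    #    --oneshot keeps these packages out of the world file.
    $EMERGE --oneshot sys-devel/binutils sys-devel/gcc sys-libs/glibc || return 1
    # 2. The base system and its full dependency tree.
    $EMERGE --emptytree @system || return 1
    # 3. Everything else; continue past individual build failures.
    $EMERGE --emptytree --keep-going @world
}
```

As far as I know, the @world pass rebuilds the @system packages again, so step 2 mostly serves as an early checkpoint: if the toolchain or base system fails to rebuild cleanly, a reinstall is probably the faster route.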
All times are GMT