Gentoo Forums
Disk activity goes crazy and keyboard stops responding.
fert
n00b
Joined: 19 Sep 2007
Posts: 40

Posted: Sun Oct 28, 2018 1:02 am    Post subject: Disk activity goes crazy and keyboard stops responding.

I recently had a hard drive die, taking /var with it. Of course I didn't have a backup of /var, so a complete reinstall was performed, as I quickly figured out rebuilding a gentoo system without a world file was just torture. (Lesson learned.)

Now, with gentoo-sources 4.18.16, after anywhere from 10 minutes to maybe an hour, disk activity skyrockets (you can hear at least one of the hard drives rapidly, constantly seeking, for hours if you let it run) and the keyboard stops responding (mouse works fine).

Any troubleshooting gurus want to give me any hints/suggestions how to troubleshoot this?

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9646
Location: almost Mile High in the USA

Posted: Sun Oct 28, 2018 1:31 am

Sounds typical of running out of RAM and starting to swap.
Linux will do text page swapping even if you don't have anonymous swap enabled. Text page swapping is deadly slow, much slower than anonymous swap.

What were you doing during this time? Monitor your RAM with 'top' in a terminal window while using the machine and see if you're running out of memory; chances are you are. At the very least, the last thing you see on the screen should be a snapshot of what's using CPU at the time.

You should add anonymous (regular) swap if you don't have any. Else you likely need more RAM.
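
As a rough way to put numbers on the advice above, the following sketch summarises RAM and swap headroom from /proc/meminfo. The function name mem_summary and its output format are made up for illustration, and it assumes a kernel recent enough to report MemAvailable (3.14 or later).

```shell
# Summarise RAM and swap headroom from a meminfo-style file.
# Usage: mem_summary [FILE]   (FILE defaults to /proc/meminfo)
mem_summary() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { stotal = $2 }
        $1 == "SwapFree:"     { sfree = $2 }
        END {
            # Percentage of RAM the kernel estimates is still available.
            printf "ram_available_pct=%d\n", ((total > 0) ? avail * 100 / total : 0)
            # Swap currently in use, in kB.
            printf "swap_used_kb=%d\n", stotal - sfree
        }
    ' "${1:-/proc/meminfo}"
}
```

Running `mem_summary` every few seconds (for example under `watch`) while reproducing the problem should show whether available RAM is collapsing or swap use is climbing just before the lockup.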
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

fert
n00b
Joined: 19 Sep 2007
Posts: 40

Posted: Sun Oct 28, 2018 1:55 am

Thanks for the reply.

32G of RAM, though, and a 64G swap partition. Should be overkill for a basic system, no? And this exact same system has been running Gentoo for years.

I was thinking that the problem seemed more obvious when running mythbackend, but I had it stopped and when I first tried to reply to your post I ran into the same problem.

This time it was as if the "a" key was being held down. There was no response from the keyboard other than the continuous input of "aaaaaaaaaaaaaaaaaaaaaaaaa...", but the mouse still worked.

I dunno. Did I fubar my install by jumping to gcc 8.2.0-r3, maybe?

ETA: and the constant crashes are starting to wreak havoc on /home (btrfs) -- losing email passwords in claws-mail, etc. (I have good backups of /home, though :) )

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9646
Location: almost Mile High in the USA

Posted: Sun Oct 28, 2018 5:43 am

Well, the same plan still applies: find out what's accessing the disk. Do you have lots of programs locking memory, like virtual machines? A hard drive suddenly doing tons of accesses is usually software triggered. The keyboard problem may be due to interrupt overload, though that should generally affect the mouse too...

It might end up being a kernel bug too; trying a different kernel might be worth a shot.
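
One way to see what is writing to disk without extra tools is to read the per-process I/O counters under /proc. This is a minimal sketch, assuming the kernel was built with task I/O accounting (CONFIG_TASK_IO_ACCOUNTING) and that it runs as root so other users' /proc/PID/io files are readable. top_writers is a hypothetical helper, not an existing command.

```shell
# List the N processes with the largest cumulative write_bytes.
# Usage: top_writers [N] [PROC_ROOT]   (defaults: 5, /proc)
# PROC_ROOT is only overridable so the function can be tried on sample data.
top_writers() {
    n="${1:-5}"
    root="${2:-/proc}"
    for io in "$root"/[0-9]*/io; do
        [ -r "$io" ] || continue
        pid=${io%/io}
        pid=${pid##*/}
        # Exact match on the field name skips cancelled_write_bytes.
        bytes=$(awk '$1 == "write_bytes:" { print $2 }' "$io" 2>/dev/null)
        name=$(cat "$root/$pid/comm" 2>/dev/null)
        [ -n "$bytes" ] && printf '%s %s %s\n' "$bytes" "$pid" "${name:-?}"
    done | sort -rn | head -n "$n"
}
```

The counters are totals since each process started, so the useful signal is the difference between two runs taken a few seconds apart while the disk is busy.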
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

fert
n00b
Joined: 19 Sep 2007
Posts: 40

Posted: Sun Oct 28, 2018 11:07 pm

Well, it is definitely firefox related. The system will stay up all night if firefox is never opened. Run firefox, and in a short time the drive thrashes and the keyboard is stuck.

Tried with firefox and firefox-bin 63.0, even removing .mozilla to start with a new profile and no add-ons. Nothing helped.

Downgraded to 62.0.3 and it seems better. No disk thrashing and stuck keyboard YET. Time will tell, but 63.0 is garbage on my system.

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9646
Location: almost Mile High in the USA

Posted: Mon Oct 29, 2018 4:28 am

Was it leaking memory?
It would have had to leak pretty badly with the amount of memory you have, unless you had a lot of locked memory?
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

fert
n00b
Joined: 19 Sep 2007
Posts: 40

Posted: Mon Oct 29, 2018 11:40 pm

It was a premature diagnosis.

Firing up mythfrontend also brings about the exact same issue. Shortly after video playback begins, keyboard input sticks, like the last key used is being held down when the problem is triggered, and the disk starts thrashing.

Tried pf-sources 4.19-pf3 and it is no better.

Shotgunning at this point. Guess I'll jump back to kernel 4.14.78 to see if it helps.

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9646
Location: almost Mile High in the USA

Posted: Tue Oct 30, 2018 2:03 am

Hmm... perhaps something happened and your logfiles are filling up rapidly (though that shouldn't be enough to keep the disk light solidly on). Is there anything in there that could be interesting?
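
A quick way to test the runaway-log theory is to list which files under /var/log changed in the last few minutes and how big they are. The sketch below assumes GNU find (for -mmin); recent_growth is a made-up name, and the 10-minute window and 10-line limit are arbitrary defaults.

```shell
# List files under DIR modified within the last MIN minutes, largest first.
# Usage: recent_growth [DIR] [MIN]   (defaults: /var/log, 10)
recent_growth() {
    dir="${1:-/var/log}"
    min="${2:-10}"
    # du -k prints size in kB followed by the path; sort puts the biggest on top.
    find "$dir" -type f -mmin "-$min" -exec du -k {} + 2>/dev/null |
        sort -rn | head -n 10
}
```

If the same file tops the list on repeated runs and keeps growing, its tail is the first place to look.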
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

AJM
Apprentice
Joined: 25 Sep 2002
Posts: 189
Location: Aberdeen, Scotland

Posted: Tue Oct 30, 2018 10:38 pm

fert wrote:
32G of RAM, though, and a 64G swap partition.


In my experience ridiculous amounts of swap like that are much more problematic than having no swap at all. The ancient advice of having swap = 2x RAM hasn't been valid for years - even Windows has stopped that nonsense these days!

Why not try disabling your swap completely? What are you running that would require you to have any swap at all - loads of virtual machines? My own desktop "only" has 16GB RAM and no swap and absolutely never requires any - I never see the OOM killer pounce.

You will quite probably find your system responsiveness improved too... either way, if something is gobbling memory it should show up more quickly with less "memory" to play with.

For monitoring system behaviour in real time I really like nmon

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9646
Location: almost Mile High in the USA

Posted: Tue Oct 30, 2018 11:33 pm

It's your discretion to disable swap, but if your machine ever needs to swap, I'm telling you, anonymous swap is a LOT faster than text page swapping, which is the default when you have no anonymous swap space.

With today's bigger and bigger GUIs I'd say swap is more important than ever, because it allows the OS to page out less important stuff first. Without it, the OS can only page out what it can "safely" drop, even pages that are in use. You will notice that slowly increasing memory use (leaks are usually a good example) as well as fragmentation will make the machine unusable faster without swap than with it. Keyboard/GUI lockouts are typical of an OOM situation where the OS has no choice of what to swap; with anonymous swap the OS has a choice, and the lockouts come on more gracefully.

All in all, the advice to not use swap is a dangerous one. While dedicating 64GB of swap space is a waste of hard drive space if you never use it, there is no problematic issue unless you're short of hard drive space as well. At the very least for laptop users, swap space is also used for hibernation, which gives another out for low battery situations.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

AJM
Apprentice
Joined: 25 Sep 2002
Posts: 189
Location: Aberdeen, Scotland

Posted: Wed Oct 31, 2018 12:07 am

eccerr0r wrote:
All in all, the advice to not use swap is a dangerous one.


With 32GB of RAM and normal desktop use I'd suggest it's not even remotely risky for the vast majority of users (and it's the way I work on my own work desktop PC every day - I've never once seen anything killed from lack of memory on that machine.)

A laptop requiring suspend/resume is a different matter of course as you say.

fert
n00b
Joined: 19 Sep 2007
Posts: 40

Posted: Wed Oct 31, 2018 12:09 am

My swap never seems to be touched, that I can see. I can spare 64G, just to make sure, even if it is outdated (2x RAM). And, it was set up that way prior to my hard drive's demise and worked fine.

4.14.78 doesn't fare any better than 4.19-pf3, 4.19.0, or 4.18.16. Firefox or mythfrontend fairly quickly triggers the issue (although ff 63.0 seemed to cause the problem much quicker than 62.0.3 does).

I'm thinking another complete reinstall, foregoing gcc-8.2.0, may be in my future.

Gonna give nmon a go. See if maybe I can tell what the heck is going on...

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9646
Location: almost Mile High in the USA

Posted: Wed Oct 31, 2018 1:12 am

AJM wrote:
With 32GB of RAM and normal desktop use I'd suggest it's not even remotely risky for the vast majority of users (and it's the way I work on my own work desktop PC every day - I've never once seen anything killed from lack of memory on that machine.)

# PORTAGE_TMPDIR=/tmpfs MAKEOPTS="-j32" gnome-terminal emerge chromium
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

ct85711
Veteran
Joined: 27 Sep 2005
Posts: 1791

Posted: Wed Oct 31, 2018 2:45 am

Quote:
The ancient advice of having swap = 2x RAM hasn't been valid for years - even Windows has stopped that nonsense these days!


Well, I can say this for Windows machines: they still have a swap file by default. These days Windows uses a scaling page file that grows as needed, to a maximum of 3x RAM or 4GB, whichever is larger. On my Windows machine (16GB of RAM), the swap file is currently only 256MB, while on some of the laptops I work on the swap files easily get up to 4GB+. I admit that on my main Linux machine I don't have swap, since it also has 16GB of RAM; on my other machine (8GB of RAM) I keep a swap partition standing by on the off chance I need it. Even then, that machine is headless and I dropped the jobs to 4 (it used to have 16GB until one memory stick died).

https://support.microsoft.com/en-us/help/2860880/how-to-determine-the-appropriate-page-file-size-for-64-bit-versions-of

AJM
Apprentice
Joined: 25 Sep 2002
Posts: 189
Location: Aberdeen, Scotland

Posted: Wed Oct 31, 2018 3:08 pm

fert wrote:
4.14.78 doesn't fare any better than 4.19-pf3, 4.19.0, or 4.18.16. Firefox or mythfrontend fairly quickly triggers the issue (although ff 63.0 seemed to cause the problem much quicker than 62.0.3 does).

Gonna give nmon a go. See if maybe I can tell what the heck is going on...


It's not graphics related is it? Sometimes graphics drivers can cause weird lockups - though I'm not sure I can tie in disk thrashing with that... do you have any other graphics intensive software to try provoking it?

The Main Man
Veteran
Joined: 27 Nov 2014
Posts: 1164
Location: /run/user/1000

Posted: Wed Oct 31, 2018 5:20 pm

If you're using an Intel GPU, check if you have intel-microcode installed.
I had similar problems a few years back; it turned out I had forgotten to install it, and after that everything was fine.

fert
n00b
Joined: 19 Sep 2007
Posts: 40

Posted: Fri Nov 02, 2018 10:33 pm

Been busy and haven't had much time to dig into this. Should have some time this weekend to work on one heck of a frustrating issue.

Running top and nmon when the problem occurs has been unable to point to a specific cause, although I'm not 100% sure of what I should be looking for. The problem remains easy to trigger, though. Run firefox for a bit or try to play a recording via mythfrontend and you start to hear disk activity. Then the keyboard stops responding and sometimes acts like a key is being held down, but the system is not frozen. Mythfrontend will continue to play the video, although with quite a bit of choppiness, and the mouse still works.

I'm using nvidia-drivers. May try to revert to an older version to see if that makes a difference.

Originally, the problem occurred after reinstalling due to the hard drive containing /var giving up the ghost. The replacement /var drive is not new, but smartctl says it is OK. Is it possible for a funky drive to cause issues without logging any errors?

ETA: swapped out the /var drive (nmon seemed to indicate that is where the heavy disk activity was occurring) but it made no difference.

ct85711
Veteran
Joined: 27 Sep 2005
Posts: 1791

Posted: Sat Nov 03, 2018 12:29 am

Just wondering, but are any of your drives almost full? It is known that the system tends to slow down considerably when it is unable to easily find new space. A simple check with df and df -i will tell you whether any mounted partitions are nearly full, in terms of both regular storage space and inodes.

An additional thought: rebalancing drives in a RAID setup can also drag the system to its knees.
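
The df / df -i check can be wrapped in a small filter so that only the nearly full filesystems are printed. This is a sketch only: nearly_full is a hypothetical helper, the 90% default is an arbitrary threshold, and mount points containing spaces are not handled.

```shell
# Print mount points whose Use% (or IUse%) is at or above a threshold.
# Reads `df -P` or `df -Pi` output on stdin.
# Usage: df -P | nearly_full [PERCENT]   (default 90)
nearly_full() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            pct = $5
            sub(/%$/, "", pct)
            # Filesystems that report "-" (no inode limit) are skipped.
            if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0) print $6, pct "%"
        }
    '
}
```

Typical use would be `df -P | nearly_full 90` for space and `df -Pi | nearly_full 90` for inodes; no output means nothing crossed the threshold.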

Goverp
Veteran
Joined: 07 Mar 2007
Posts: 1972

Posted: Sat Nov 03, 2018 10:17 am

fert wrote:
...
Running top and nmon when the problem occurs has been unable to point to a specific cause, although I'm not 100% sure of what I should be looking for. The problem remains easy to trigger, though. Run firefox for a bit or try to play a recording via mythfrontend and you start to hear disk activity. Then the keyboard stops responding and sometimes acts like a key is being held down, but the system is not frozen. Mythfrontend will continue to play the video, although with quite a bit of choppiness, and the mouse still works.
...

Might be worth trying iotop.

One recent performance hit I've had was running both firefox and a big emerge (specifically webkit-gtk). My default emerge environment uses tmpfs for work; webkit-gtk took up so much space, along with firefox, that I think it was paging itself to death. The keyboard and mouse took about 5 minutes to respond. Cure for me was to limit the tmpfs size to something sensible, and change webkit-gtk's environment to use real disk for work.
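
For reference, a setup along the lines described above might look like the fragment below. It is a sketch under assumptions, not Goverp's actual configuration: the 8G cap, the notmpfs.conf file name and the /var/tmp/notmpfs path are illustrative, and the per-package override relies on Portage's package.env mechanism.

```
# /etc/fstab: cap the Portage build tmpfs (8G is an arbitrary example size)
tmpfs  /var/tmp/portage  tmpfs  size=8G,uid=portage,gid=portage,mode=775,noatime  0 0

# /etc/portage/env/notmpfs.conf: build on real disk instead
# (file name and path are hypothetical; the directory must exist and be writable by portage)
PORTAGE_TMPDIR="/var/tmp/notmpfs"

# /etc/portage/package.env: apply that environment to webkit-gtk only
net-libs/webkit-gtk notmpfs.conf
```

With a cap in place, an oversized build fails with "no space left on device" instead of pushing the rest of the system into paging.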
_________________
Greybeard

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54099
Location: 56N 3W

Posted: Sat Nov 03, 2018 10:45 am

fert,

For an HDD to be the cause of this, it would either leave a trail of IO errors in dmesg or it's not properly sector aligned, which plays havoc with the write speed, doing all those misaligned writes.
You don't get dmesg errors for misaligned writes.

You say that the smart data is OK. The summary "pass" is not useful.

Post your
Code:
fdisk -l
so we can check your partition alignment. We need the output in sectors.

Post the output of
Code:
smartctl -a /dev/...
for all your drives.
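
While waiting for the full output, partition alignment can be sanity-checked from sysfs. The sketch below assumes start sectors are reported in 512-byte units (as /sys/block/*/*/start does) and uses the common 1 MiB boundary (2048 sectors) as the test, which is stricter than the 4 KiB alignment a drive strictly needs. Both function names are hypothetical.

```shell
# Read "partition start_sector" pairs on stdin (start in 512-byte sectors)
# and report whether each start falls on a 1 MiB boundary.
check_alignment() {
    awk '{ printf "%s %s\n", $1, (($2 % 2048 == 0) ? "aligned" : "MISALIGNED") }'
}

# Emit "partition start_sector" for every partition the kernel knows about.
list_partition_starts() {
    for p in /sys/block/*/*/start; do
        [ -r "$p" ] || continue
        part=${p%/start}
        printf '%s %s\n' "${part##*/}" "$(cat "$p")"
    done
}
```

`list_partition_starts | check_alignment` prints one line per partition. On the SMART side, the raw values of attributes such as Reallocated_Sector_Ct and Current_Pending_Sector in the smartctl -a output are usually more telling than the overall PASSED line.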
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

fert
n00b
Joined: 19 Sep 2007
Posts: 40

Posted: Thu Nov 08, 2018 4:44 am

Didn't get as much time last weekend to look into this. Just a stab or two here and there...

All of my partitions were done via parted, and parted tells me all partitions are optimally aligned.

After many, many hard resets while trying to figure out what's happening, I decided to reinstall, leaving gcc-7 the default. The system still chokes on playing videos via mpv, running firefox, or mythfrontend. Along with the reinstall, my final spinning drive was replaced by SSD, so I no longer hear the ticking or churning of the hard drive(s), but I suppose I am shortening the life of one or more SSDs.

Next attempt will be to go with a live DVD/CD and see if it also exhibits the same behavior.

fert
n00b
Joined: 19 Sep 2007
Posts: 40

Posted: Sun Dec 02, 2018 10:46 pm

Just FYI...

It appeared to be an issue with vdpau and nvidia-drivers.

After adding the -vdpau USE flag and running 'emerge -avuDN @world', the system no longer freezes when running mpv, firefox, or mythfrontend.

I have no explanation for why it was hitting the disks hard right before it would freeze.
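
For anyone repeating this fix, the flag is typically disabled by adding -vdpau to USE in /etc/portage/make.conf before the world rebuild (an assumption; the post does not say where the flag was set). The helper below is a hypothetical sketch for confirming that the flag is no longer in the effective global USE list, which `portageq envvar USE` prints.

```shell
# Succeeds if FLAG appears in the whitespace-separated USE list on stdin.
# Usage: portageq envvar USE | flag_enabled vdpau
flag_enabled() {
    # One flag per line, then an exact whole-line match on the flag name.
    tr -s ' \t' '\n\n' | grep -qxF -- "$1"
}
```

For example, `portageq envvar USE | flag_enabled vdpau && echo "vdpau is still enabled globally"`. Per-package overrides in package.use are not covered by this check.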

All times are GMT