AMD64 system slow/unresponsive during disk access (Part 2)


Post by tallica » Tue Nov 23, 2010 8:52 am

Looks like there are some problems with patch v4: http://lkml.org/lkml/2010/11/21/41

Post by kernelOfTruth » Fri Dec 03, 2010 6:36 pm

if you're rsyncing regularly and apps are suffering during heavy reads, try the following:

[PATCH v3 0/3] f/madvise(DONTNEED) support
http://marc.info/?l=linux-kernel&m=129104424110018&w=2
http://marc.info/?l=linux-kernel&m=129104424210023&w=2
http://marc.info/?l=linux-kernel&m=129104424210027&w=2

there are several other potential improvements in this area, but most if not all of those applicable to kernels from 2.6.36 up to the current one are already incorporated in the zen-kernel, so give it a try:

git.zen-kernel.org

www.zen-kernel.org

dm crypt: scale to multiple CPUs
might be useful if you need it - in my experience it probably increases latency a little

there also seems to be some potential filesystem corruption being triggered with ext4 and 2.6.37-rc* right now which is under investigation so don't use it with 2.6.37-rc* - yet

What about 64-bit kernels and I/O...

Post by pste » Tue Mar 01, 2011 6:44 am

I think this is a bug, an inconsistency, an unfortunate system design or something similar that has been around for quite a while, which makes me very annoyed, not least because it pokes a hole in the hallmark of linux - speed and stability! - and it seems never to get fixed!?!? Unfortunately I'm in no position (not timewise, nor knowledgewise) to pursue this problem myself, which leaves me with no other option than to provide background, some additional information and experiences and hope that someone clever can get something out of it and accept the challenge!

Quick summary:
Probably since kernel 2.6.18 (googling seems to converge on this kernel) there's been some kind of problem with I/O (disk I/O, ...??) on 64-bit kernels, resulting in high cpu load and a lagging or even hanging system. It seems like the problem begins when I/O reaches a certain amount, like copying many large files (taking a backup...) or doing many things concurrently, like copying files while rsync'ing through a vpn tunnel (high cpu load in general). It seems like it has something to do with the I/O scheduler, but I experience problems even if I use the "simplest" deadline scheduler. To me, zen-sources also works slightly better than gentoo-sources.

It's not a hardware problem, because it works fine with windows (which is kind of extra annoying, isn't it?), and my impression (haven't tested thoroughly though - but my home servers, both with gentoo-sources-32bit with several usb drives on usb hubs, seem to work flawlessly) is that there's no problem with 32-bit kernels. The external usb drives I'm using are of different brands and models, and all show the same behavior. I've tried -a lot- of different kernel settings, including many minimal ones, but I cannot find a pattern.

What happens?
Personally I think it's connected with usb and usb-harddrives (it's at least a failsafe way to make the problem show!). Every time I make a backup, copying goes at 20MB/s plus for a while, but then it starts to slow down. I cannot say exactly when, but either when the copied size reaches - say 8GB (just a guess - sometimes more), or when the file count is big! (no number...). Then transfer speed drops to a few MB/s (2-5MB/s perhaps), the system starts to lag and cpu is at 100%. Often (not always) I get something about "reset usb-device" in the log, sometimes "I/O-error on device", forcing me to shut down drives and computer, start over, and fsck. Occasionally I get a complete system lock-up (my comp. freezes and all leds on keyb shine).

Making backups between two usb-drives on the same usb-hub, seems to create most problems, including total system hang-ups

Hopefully, someone who thinks this needs to be fixed can make something out of this and start digging - I'm cheering loudly in that case!

Good luck!

/pste

Post by idella4 » Tue Mar 01, 2011 9:48 am

pste,

That's very general. Use top and iotop, and set up the conditions under which it occurs. You can at least capture some snapshots of the system state with the tops, then the dmesg content. We need some sort of baseline.
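A minimal sketch of the kind of snapshot capture suggested here, assuming a POSIX shell. The function name `capture_snapshot` and the output layout are made up for illustration; only `top -b -n 1`, `iotop -b -o -n 1` and `dmesg` are real tool invocations, and iotop typically needs root plus kernel I/O accounting.

```shell
#!/bin/sh
# capture_snapshot DIR: write one batch-mode snapshot of each available tool
# into a timestamped subdirectory of DIR and print that subdirectory's path.
# (Hypothetical helper, not something posted in this thread.)
capture_snapshot() {
    outdir="$1/snap-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$outdir" || return 1
    date > "$outdir/stamp.txt"
    # top: -b batch mode, -n 1 a single iteration
    if command -v top >/dev/null 2>&1; then
        top -b -n 1 > "$outdir/top.txt" 2>&1
    fi
    # iotop: -o only tasks actually doing I/O
    if command -v iotop >/dev/null 2>&1; then
        iotop -b -o -n 1 > "$outdir/iotop.txt" 2>&1
    fi
    # last 50 kernel log lines (USB resets and I/O errors would show up here)
    if command -v dmesg >/dev/null 2>&1; then
        dmesg 2>&1 | tail -n 50 > "$outdir/dmesg-tail.txt"
    fi
    echo "$outdir"
}
```

Calling `capture_snapshot /var/tmp/lag` once while the system is idle and again while the copy is running would give a baseline and a comparison point.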

Post by frostschutz » Tue Mar 01, 2011 9:59 am

pste wrote:Making backups between two usb-drives on the same usb-hub, seems to create most problems
That's a total bottleneck on the hardware side though. Even in ideal conditions you shouldn't see more than 10MB/s transfer speeds for usb to usb especially with a hub involved. You're talking to both disks on a line that can ideally transfer 40MB/s, that means 20MB/s for each drive, then in comes the protocol overhead, filesystem overhead, hub overhead, context switches overhead, and you end up with extremely slow speed as is typical for USB... Add unreliable hardware to that (such as an overheating usb hub) and you're in data corruption land...
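To put those rates into perspective, here is a rough back-of-the-envelope sketch in shell. The helper `copy_minutes` is hypothetical; the 8 GB size and the 20 MB/s and 2-5 MB/s rates come from pste's post, the 10 MB/s from the estimate above.

```shell
#!/bin/sh
# copy_minutes SIZE_MB RATE_MB_PER_S: whole minutes needed to move SIZE_MB
# at a sustained rate (integer division, so the result rounds down).
copy_minutes() {
    echo $(( $1 / $2 / 60 ))
}

copy_minutes 8192 20   # 8 GB at the initial 20 MB/s
copy_minutes 8192 10   # 8 GB at the 10 MB/s ceiling estimated above
copy_minutes 8192 3    # 8 GB at the degraded rate of a few MB/s
```

That is roughly 6, 13 and 45 minutes, so a slow backup over a shared hub is expected; the lag and lock-ups are a separate question.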

Performance issues in the kernel, it's a possibility, happens all the time, but if you also get strange stuff in dmesg, it's much more likely that your hardware is the culprit somehow

Post by ppurka » Tue Mar 01, 2011 10:14 am

See http://forums.gentoo.org/viewtopic-t-793263.html

Post by tomk » Tue Mar 01, 2011 10:40 am

Merged from [post=6596553]here[/post].

Post by pste » Tue Mar 01, 2011 11:17 am

ppurka - yes, that (this) thread is one of the sources that made me say that this is a long-standing problem! Google gives you more...

frostschutz - yes I know that the setup with usb-drives is a hardware bottleneck, but I do believe that it should not mean anything else than that the backup takes a long time, it should -not- make the entire system lag, or hang! I do think this is caused by some kind of race condition that occurs in 64-bit kernels... And, NO! it's not the hardware, I wrote above that the same setup works fine in windows and (similar setups) with 32-bit kernels! - but yes, hardware related, meaning (kernel) driver problems, perhaps? I do agree that overheating is a possible explanation for the I/O-error situations, but I find it strange that this differs between OSes (or kernel types). Furthermore, the problem also occurs without the hub (e.g. copying from system drive to usb drive or between usb drives on different usb ports of the comp.); I gave the hub example because my impression is that it's the quickest way to create problems...

idella4 - sure, I'll try to capture something, although not today... (I need to recompile the kernel with a few new flags - the iotop emerge told me, but I need my comp running a while longer...) But a problem is that for the worst case (the most interesting one) I must try to create one of these total lock-ups and then hand-copy (or photograph) the tops and dmesg screens precisely because the system is frozen, and it feels a little risky to recover the filesystem(s) every time it hangs... A concrete example (for anyone to try): try starting a rsync -avh --progress /home /media/your-usb-drive/ (or similar) and wait (of course, /home must be many GB large!). I'm doing precisely this at the moment! For me this command keeps showing about 15-20MB/s for every file. But after a while the system gets laggy (rsync keeps running at the same speed though), then if I for instance try starting a movie in vlc (having to read a big file from the harddrive) - this is (naturally) really slow, but sometimes the movie hangs, and closing vlc takes about 5 minutes! When the sync is finished, the system is back to normal responsiveness...
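For anyone repeating the rsync test, a small sketch that logs how much dirty data is waiting for writeback while the copy runs might help correlate the lag with flushing. It assumes a Linux `/proc/meminfo` with the usual `Dirty:` and `Writeback:` fields; the function names are made up for illustration.

```shell
#!/bin/sh
# meminfo_kb FIELD [FILE]: print the kB value of one field from /proc/meminfo
# (FILE is only there so the parser can be tried on a saved copy).
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# watch_writeback [INTERVAL]: print a timestamped Dirty/Writeback line every
# INTERVAL seconds (default 5) until interrupted with Ctrl-C.
watch_writeback() {
    while :; do
        echo "$(date +%T) Dirty=$(meminfo_kb Dirty) kB Writeback=$(meminfo_kb Writeback) kB"
        sleep "${1:-5}"
    done
}
```

Running `watch_writeback 5 > writeback.log` in a second terminal before starting the rsync gives a log that survives at least up to the point of a freeze.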

Thanks for the response!

/pste

Post by joeklow » Tue Mar 01, 2011 6:29 pm

Reporting 2.6.36 ck-sources running at multicore Phenom II.

Recompiled this kernel, changing deprecated SATA support ("ATA/ATAPI support") over to new (serial ATA/PATA drivers).
I/O scheduler: BFQ (was CFQ)
Profile: Desktop (was Server)
CPU scheduler: CFS+autogroups
Timer: 1000 (was 200)

Also, /etc/init.d/local.start has the following to disable the drive's write cache (stupid XFS loves to flush data once an hour, and it would be stupid to let the flushed data stay in cache).
hdparm -W0 /dev/sda
Now I can emerge -u world on the host and in a virtual machine and run a Windows virtual machine simultaneously, and the remaining resources are sufficient for far better response (can surf/code).
Without those tricks I was unable to do anything while merging something, and the system was almost unresponsive while emerge --sync'ing.

Post by Yamakuzure » Wed Mar 02, 2011 4:32 pm

Huh? I haven't had any lag since gentoo-sources-2.6.36-rsomething and with gentoo-sources-2.6.37 (okay, with cgroups hack) I have no lag even if I do a huge parallel merge (load between 25 and 40 on an i7 Dualcore laptop with HT) and have VMWare with WindowsXP open.

Is it just this cgroups stuff? I am basically using what is described here: http://forums.gentoo.org/viewtopic-t-852922.html
(And no, I do not have any problems with Amarok, DragonPlayer or any other multi media stuff)

Post by devsk » Tue Mar 29, 2011 7:19 am

2.6.38 with AUTOGROUP helps a lot with this issue.

Post by devsk » Tue Apr 19, 2011 6:34 am

Any news on this front? Does AUTOGROUP help people with this issue? Or this is a non-issue now?

Post by kernelOfTruth » Tue Apr 19, 2011 6:26 pm

devsk wrote:Any news on this front? Does AUTOGROUP help people with this issue? Or this is a non-issue now?
autogroup definitely does help

deactivating invalidated pages helps too with heavy rsync jobs (fadvise support)

but there are still hiccups and short interruptions when listening to music while it's heavily flushing to disk

so it's still present but got a lot lighter

Post by TimeManx » Sun Jan 01, 2012 11:02 am

I've configured 3.1.6 with autogroup, cfq, zram (128 MB as swap), zcache, transparent huge pages (madvise), memory compaction, preemptible kernel on a system with 2 GB of RAM. The system is quite responsive in the first half hour after boot but the performance keeps deteriorating.
Also, copying large amounts of data from one drive to another is sporadic and the drives become inaccessible during that time, which causes dolphin to freeze.

Post by TinheadNed » Thu Jan 05, 2012 8:45 pm

From the changelog of the 3.2 kernel: "desktop reponsiveness in presence of heavy writes has been improved"

Post by Holysword » Fri Feb 17, 2012 6:05 pm

Is this problem still up? I'm suffering from unresponsiveness very often; even stupid facebook flash games can bring my i7 with 4GB down - and let's say, I've got an infinite swap.

I've tried to blame the kernel (was zen-sources-something, can't remember which, but I updated it less than 2 weeks ago), but gentoo-sources also freezes/slows down. Was using BFQ+BFS, and now I'm on CFQ+CFS; same. SLUB to SLAB? Same. Noting that BFQ is incompatible with cgroups, one would see that I was not using cgroups initially. Now I am using cgroups+autogroups. Nothing. My main system lives on a ReiserFS3 partition, not ext4 though.

I wouldn't just call this annoying; it is becoming dangerous for me, since at least once a day my system crashes hopelessly and I have to hard-reboot it. Have lost a couple of files so far. I was considering a depclean + emerge -e world but now I am wondering if this is worth trying. 4 months ago I didn't have this problem (was using zen-sources back then) and some of you seem to have been having this for years...

Post by depontius » Fri Feb 17, 2012 6:38 pm

Are you running /home from a network drive?

My performance problems were from /home being mounted on nfsv4, and were related to firefox and its sqlite sync() behavior. A year or two back I moved .mozilla and .thunderbird to local disk, then symlinked the nfs-mounted .mozilla and .thunderbird directories to the local ones. Problem gone.

Some time after moving that system to 3.2.x I saw the notice of improved responsiveness, and tried moving .mozilla back to nfs. My performance problems came back, though they didn't seem quite as bad. The other night I moved .mozilla back to local disk.

Other than that, I'm happy and even with that I wasn't having problems with crashing. Have you tried memtest86+?
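The relocation described above (move the profile directory to local disk, leave a symlink behind on the NFS home) could be sketched like this. `relocate_dir` and the target path in the usage note are hypothetical, and the browser should be closed before trying anything like it.

```shell
#!/bin/sh
# relocate_dir SRC DST: move directory SRC to DST and leave a symlink at SRC
# that points to DST. Refuses to run if SRC is already a symlink or DST exists.
relocate_dir() {
    [ -d "$1" ] && [ ! -L "$1" ] || { echo "not a real directory: $1" >&2; return 1; }
    [ ! -e "$2" ] || { echo "target already exists: $2" >&2; return 1; }
    mkdir -p "$(dirname "$2")" &&
    mv "$1" "$2" &&
    ln -s "$2" "$1"
}
```

For example `relocate_dir "$HOME/.mozilla" "/var/local/$USER/mozilla"`, where `/var/local/$USER` stands in for any directory on a local disk that the user can write to.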

Post by Holysword » Fri Feb 17, 2012 6:55 pm

depontius wrote:Are you running /home from a network drive?

My performance problems were from /home being mounted on nfsv4, and were related to firefox and its sqlite sync() behavior. A year or two back I moved .mozilla and .thunderbird to local disk, then symlinked the nfs-mounted .mozilla and .thunderbird directories to the local ones. Problem gone.

Some time after moving that system to 3.2.x I saw the notice of improved responsiveness, and tried moving .mozilla back to nfs. My performance problems came back, though they didn't seem quite as bad. The other night I moved .mozilla back to local disk.

Other than that, I'm happy and even with that I wasn't having problems with crashing. Have you tried memtest86+?
No, everything is local. I also use chrome, not firefox, but have tested a few times with firefox, it seems to have the same behaviour.

Post by Holysword » Fri Feb 24, 2012 2:16 pm

Okay folks... it seems it was both my kernel (it was zen-sources with BFS+BFQ), as seen in this thread, and a memory leak problem with my wm (as seen here).

Post by xman1 » Thu Apr 26, 2012 4:14 pm

Has this been solved yet? I had these same issues and it turned out my Western Digital hard drive has a bug with APM. Pop into PM-utils default config and set APM to 255 to disable it and all works well now.

You can also do this with hdparm:

Code: Select all

hdparm -B 255 /dev/sda
Maybe this will help someone as the pauses are quite annoying.

-X

PS. I forgot to mention the pauses were affecting things system wide. The whole system would wait on the APM bug. Thanks WD.

Post by smlbstcbr » Wed Jun 27, 2012 4:52 pm

Bump. I still have those issues in 3.3.8-gentoo.

Post by kernelOfTruth » Wed Jul 04, 2012 10:37 am

overall we're trading some throughput for interactivity & responsiveness:

those of you having the problems could give the following tweaks a try:

Code: Select all

echo cfq > /sys/block/sda/queue/scheduler
echo 10000 > /sys/block/sda/queue/iosched/fifo_expire_async
echo 250 > /sys/block/sda/queue/iosched/fifo_expire_sync
echo 80 > /sys/block/sda/queue/iosched/slice_async
echo 1 > /sys/block/sda/queue/iosched/low_latency
echo 6 > /sys/block/sda/queue/iosched/quantum
echo 5 > /sys/block/sda/queue/iosched/slice_async_rq
echo 3 > /sys/block/sda/queue/iosched/slice_idle
echo 100 > /sys/block/sda/queue/iosched/slice_sync
hdparm -q -M 254 /dev/sda
(source: http://unix.stackexchange.com/questions ... em-caching)
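A hedged sketch of how the sysfs writes above could be wrapped so that each one reports the previous value and skips files that do not exist. The helper names are made up; the values are the ones from the list above. It assumes root and a kernel that still ships CFQ (the scheduler was removed in Linux 5.0, in which case the first write fails and nothing else is touched).

```shell
#!/bin/sh
# set_tunable FILE VALUE: write VALUE to FILE if it is writable and report
# "FILE: old -> new"; otherwise print a note to stderr and return non-zero.
set_tunable() {
    if [ ! -w "$1" ]; then
        echo "skip: $1 (missing or not writable)" >&2
        return 1
    fi
    old=$(cat "$1")
    echo "$2" > "$1" && echo "$1: $old -> $2"
}

# apply_cfq_tweaks DEV: select CFQ on /sys/block/DEV and apply the values
# quoted in the post (hypothetical wrapper, e.g. apply_cfq_tweaks sda).
apply_cfq_tweaks() {
    q="/sys/block/$1/queue"
    set_tunable "$q/scheduler" cfq || return 1
    while read -r name value; do
        set_tunable "$q/iosched/$name" "$value"
    done <<EOF
fifo_expire_async 10000
fifo_expire_sync 250
slice_async 80
low_latency 1
quantum 6
slice_async_rq 5
slice_idle 3
slice_sync 100
EOF
}
```

Printing the old value makes it easy to revert a single setting if one of them turns out to hurt throughput too much.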


I'm currently using all except the last one



in addition I'm using ck2 patchset for 3.4* kernel, patched BFS cpu scheduler up to version 424

and added Chen's O(1) tweak:

http://pastebin.com/ixw9PXAw


(thread: http://phoronix.com/forums/showthread.p ... ased/page7 )


this helps A LOT


edit:

some additional stuff

when your system uses swap heavily raise page-cluster:

Code: Select all

echo "12" > /proc/sys/vm/page-cluster
or

Code: Select all

echo "10" > /proc/sys/vm/page-cluster
helps with interactivity issues for me



keep swapping low if possible:

Code: Select all

echo "15" > /proc/sys/vm/swappiness
Con Kolivas afaik recommends 10

Code: Select all

echo "10" > /proc/sys/vm/swappiness


keep dirty_background_ratio and dirty_ratio low

Code: Select all

echo "5" > /proc/sys/vm/dirty_background_ratio
and

Code: Select all

echo "9"   > /proc/sys/vm/dirty_ratio

also make sure that pdflush/bdflush don't write out stuff too seldom

Code: Select all

echo "300"  > /proc/sys/vm/dirty_writeback_centisecs 
500 (5 seconds) is the kernel default, afaik powertop and other tools recommend 1500 (15 seconds)


edit:

added some settings I'm currently playing around with

edit2:

set

Code: Select all

echo "300"  > /proc/sys/vm/dirty_writeback_centisecs
instead of 500 that seems to improve stalls
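The echo commands above are lost on reboot. One way to keep them, sketched here under the assumption that the system reads `/etc/sysctl.conf` at boot (or that `sysctl -p` is run by hand), is to translate each `/proc/sys` path into its sysctl key. `proc_to_sysctl` is a hypothetical helper; the values are the ones from this post, and where the post gives two alternatives (swappiness 15 or 10) the sketch simply picks one.

```shell
#!/bin/sh
# proc_to_sysctl PATH VALUE: turn a /proc/sys path into a sysctl.conf line,
# e.g. /proc/sys/vm/dirty_ratio 9 becomes "vm.dirty_ratio = 9".
proc_to_sysctl() {
    key=$(printf '%s\n' "${1#/proc/sys/}" | tr '/' '.')
    printf '%s = %s\n' "$key" "$2"
}

# Print a fragment that could be appended to /etc/sysctl.conf.
proc_to_sysctl /proc/sys/vm/swappiness 10
proc_to_sysctl /proc/sys/vm/dirty_background_ratio 5
proc_to_sysctl /proc/sys/vm/dirty_ratio 9
proc_to_sysctl /proc/sys/vm/dirty_writeback_centisecs 300
```

The page-cluster setting is left out on purpose, since the post offers two values for it without settling on one.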
Last edited by kernelOfTruth on Sat Jul 07, 2012 11:40 am, edited 1 time in total.

Post by smlbstcbr » Sat Jul 07, 2012 2:09 am

I'll see how that works in my machine. How unfortunate to have such issues in the Gentoo Kernel. It seems to me that it has slowed since the change to 3.XX kernels.

Post by kernelOfTruth » Sat Jul 07, 2012 11:39 am

smlbstcbr wrote:I'll see how that works in my machine. How unfortunate to have such issues in the Gentoo Kernel. It seems to me that it has slowed since the change to 3.XX kernels.
try setting dirty_writeback_centisecs even lower

I just set it to 300 yesterday and it seems to play a very important role

Code: Select all

echo "300"  > /proc/sys/vm/dirty_writeback_centisecs

Post by smlbstcbr » Sun Jul 08, 2012 3:24 pm

Well, I'm trying your solution (thank you for posting them). There's a slight improvement. Not as smooth as it used to be.
EDIT: I have been using a value of 200 for the last parameter and my system has improved significantly, though there's still some lag when swapping windows or opening some documents.