Author |
Message |
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Tue Dec 30, 2003 12:41 pm Post subject: 2.4.22-gentoo-r2 eats memory?! |
|
|
I've been running 2.4.22-gentoo-r2 for over a week and noticed that on both of my boxes, memory gets eaten and is never released. On the Athlon box w/512MB, 100% of physical memory is in use, swap usage is now 44MB, and it is still slowly increasing. On my old Pentium 166 box w/ 128MB, I've had to reboot every two days because memory gets eaten up and the box swaps constantly for hours on end. Has anyone seen similar problems? _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Tue Dec 30, 2003 11:54 pm Post subject: |
|
|
I've had a problem recently with slocate (usually called updatedb, which is run at midnight every night) eating up hundreds of megs of memory and then not releasing them. I'm running a Duron with 768 MB of RAM using 2.6.0 and ext3 filesystem, but it was the same with -test9, -test10, and -test11. My "solution" was to lose the nightly updatedb cron job and just reboot to free the memory. It's worked fine without the nightly process, so I haven't really bothered investigating. My other box (unaffected) is a Pentium MMX on 2.4.19 with 32 MB of RAM and using ReiserFS. Both are up to date with x86, besides the kernels. I'm thinking that maybe my slocate and ext3 don't play well together, since everything was peachy when I used ReiserFS and 2.4.22 on the same system, but I really haven't had a chance to check this out. _________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth |
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Wed Dec 31, 2003 12:35 am Post subject: Yep! |
|
|
Chuffin' hell, the slocate nightly run *was* definitely the culprit. I use ext3fs on both my boxes. Have you tried recompiling slocate? I'm about to try that out now. _________________ Cheers,
Alex.
Linux - the best text adventure game ever.
Last edited by alexbuell on Wed Dec 31, 2003 12:37 am; edited 1 time in total |
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Wed Dec 31, 2003 12:37 am Post subject: |
|
|
Wow, so this is the problem, huh? This bug sucked up like 500MB of RAM from me and it sounds like you too. I'll check the gentoo bug repository for this and see what's up. And yes, I've re-emerged to no avail.
I'm glad I caught your post since I thought I was the only one to have this problem and ignored it! _________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth |
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Wed Dec 31, 2003 12:41 am Post subject: |
|
|
M104 wrote: | Wow, so this is the problem, huh? This bug sucked up like 500MB of RAM from me and it sounds like you too. I'll check the gentoo bug repository for this and see what's up. And yes, I've re-emerged to no avail.
I'm glad I caught your post since I thought I was the only one to have this problem and ignored it! |
Looks like we've stumbled across a nasty kernel bug. I've run updatedb twice now, and will keep trying until the problem happens again - I've got to be quite sure I'm blaming the right miscreant. _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Wed Dec 31, 2003 12:42 am Post subject: |
|
|
Wow, kswapd looks _very_ unhappy now. _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Wed Dec 31, 2003 12:48 am Post subject: |
|
|
Hang on. Show me your tune2fs -l output for your filesystems - most particularly, I want to see what filesystem features you've got enabled. _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Wed Dec 31, 2003 12:54 am Post subject: |
|
|
alexbuell wrote: | Hang on. Show me your tune2fs -l output for your filesystems - most particularly, I want to see what filesystem features you've got enabled. |
Once I've seen your tune2fs -l results, and that confirms my suspicions, the next step will be to download vanilla kernel sources and confirm that it's not caused by a gentoo patch as used with the gentoo-sources. _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Wed Dec 31, 2003 12:56 am Post subject: |
|
|
Alright, I added this as Bug 36855. (My first Gentoo bug!) I'm not at the affected computer right now, but I will be in about two hours. I'll post more information then and hopefully we can narrow it down a bit. _________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth |
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Wed Dec 31, 2003 1:04 am Post subject: |
|
|
alexbuell wrote: | Looks like we've stumbled across a nasty kernel bug. I've run updatedb twice now, and will keep trying until the problem happens again - I've got to be quite sure I'm blaming the right miscreant. |
I ran for over a week with this bug and the nightly updates and the funny thing was that the memory leak seemed to decrease over a few days and then stabilize from a maximum of 600MB down to around 400MB. I really thought that it was some new kernel caching feature because of the weird behaviour. Eventually, though, I just got sick of all that wasted memory so I dropped the cron job and rebooted.
alexbuell wrote: | Once I've seen your tune2fs -l results, and that confirms my suspicions, the next step will be to download vanilla kernel sources and confirm that it's not caused by a gentoo patch as used with the gentoo-sources. |
Well, first of all, I'm running development-sources. Are there any Gentoo patches in there? I'm betting that the problem is slocate, but we'll see. _________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth
Last edited by M104 on Wed Dec 31, 2003 1:11 am; edited 1 time in total |
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Wed Dec 31, 2003 1:10 am Post subject: |
|
|
M104 wrote: | alexbuell wrote: | Looks like we've stumbled across a nasty kernel bug. I've run updatedb twice now, and will keep trying until the problem happens again - I've got to be quite sure I'm blaming the right miscreant. |
I ran for over a week with this bug and the nightly updates and the funny thing was that the memory leak seemed to decrease over a few days and then stabilize from a maximum of 600MB down to around 400MB. I really thought that it was some new kernel caching feature because of the weird behaviour. Eventually, though, I just got sick of all that wasted memory so I dropped the cron job and rebooted. |
Ah, that explains why my 512MB box hasn't died yet. I'm on my fifth consecutive updatedb run on the 128MB box, and it is distinctly slow. I now think it's a combination of rsync and updatedb operations that triggers this bug. _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Wed Dec 31, 2003 1:14 am Post subject: |
|
|
Mods, can we move this to Other Things Gentoo? _________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth |
|
|
|
incubator Guru
Joined: 05 Jun 2003 Posts: 584 Location: Belgium
|
Posted: Wed Dec 31, 2003 1:23 am Post subject: |
|
|
I suffer from it too.
I have 512MB of DDR and currently have 200MB free.
There was 435MB free when I had just booted, and after a few emerges and an updatedb this is what's left.
I'm not running much at the moment: konqueror, xmms, apache, mysql, gkrellm and the regular batch of system processes.
Since I use fluxbox I should have about 380MB free right now, but I don't.
Yesterday (the first time I tested my freshly installed 2.4.22-r2)
I experienced enormous lags when emerging (regular or pretend), X restarted at random and konqueror crashed at random. Today I recompiled the kernel (no vital changes, mind you, just an addition to i2c which I apparently didn't even need :p), recompiled kde and updated xmms (which also crashed at random yesterday). Today I have experienced no more crashes (even after rebooting several times), though as I said, there is currently 200MB left (this is the value shown by gkrellm, not top, since top always says I have only a few megs free, even with other kernels, due to the way Linux manages memory).
My output from tune2fs -l:
Code: |
tune2fs -l /dev/hda1
tune2fs 1.34 (25-Jul-2003)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: fba8d41a-f018-4ac4-a8bd-778d7e2a1430
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal dir_index filetype sparse_super
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 18072
Block count: 72261
Reserved block count: 3613
Free blocks: 63865
Free inodes: 18043
First block: 1
Block size: 1024
Fragment size: 1024
Blocks per group: 8192
Fragments per group: 8192
Inodes per group: 2008
Inode blocks per group: 251
Filesystem created: Fri Jul 18 00:50:08 2003
Last mount time: Tue Dec 30 20:46:38 2003
Last write time: Tue Dec 30 20:47:08 2003
Mount count: 10
Maximum mount count: 24
Last checked: Wed Oct 15 00:43:56 2003
Check interval: 15552000 (6 months)
Next check after: Mon Apr 12 00:43:56 2004
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
Default directory hash: tea
Directory Hash Seed: 9567ca6c-efd1-47d9-8a70-0c1029316425
|
This is from /boot, which is the only ext3 partition; root is reiserfs. |
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Wed Dec 31, 2003 3:01 am Post subject: |
|
|
Here's mine:
Code: |
almond root # tune2fs -l /dev/sda1
tune2fs 1.34 (25-Jul-2003)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 9b04a478-7f67-4d87-b265-7eb3eac679e4
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal filetype needs_recovery sparse_super
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 625248
Block count: 1249045
Reserved block count: 62452
Free blocks: 1055673
Free inodes: 612390
First block: 0
Block size: 4096
Fragment size: 4096
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16032
Inode blocks per group: 501
Filesystem created: Sun Nov 23 19:53:41 2003
Last mount time: Thu Dec 25 02:29:35 2003
Last write time: Thu Dec 25 02:29:35 2003
Mount count: 16
Maximum mount count: 30
Last checked: Sun Nov 23 19:53:41 2003
Check interval: 15552000 (6 months)
Next check after: Fri May 21 20:53:41 2004
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
Default directory hash: tea
Directory Hash Seed: 033736f2-da61-4958-a6af-5c0839b08af6 |
_________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth |
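Since the question was specifically about enabled filesystem features, a minimal sketch for pulling just that line out of `tune2fs -l` output and listing the features one per line, so that two filesystems can be compared. The function name `fs_features` is made up for illustration; it only parses text on stdin and assumes the `Filesystem features:` label seen in the listings above.

```shell
#!/bin/sh
# fs_features: hypothetical helper, not part of e2fsprogs.
# Reads `tune2fs -l` output on stdin and prints each enabled
# filesystem feature on its own line, sorted.
fs_features() {
    awk '$1 == "Filesystem" && $2 == "features:" {
        for (i = 3; i <= NF; i++) print $i
    }' | LC_ALL=C sort
}
```

Usage would be something like `tune2fs -l /dev/hda1 | fs_features`. Going by the two listings posted so far, the feature sets differ only in dir_index (incubator's /boot) and needs_recovery (M104's /dev/sda1).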
|
|
|
wll n00b
Joined: 25 Aug 2003 Posts: 38
|
Posted: Wed Dec 31, 2003 5:09 am Post subject: |
|
|
For what it's worth, I've got the same problem on a UML system (at Linode.com, highly recommended) based on 2.4.23.
This kernel is used by all the Linode.com distros, not just Gentoo (Debian, Fedora, Mandrake, RedHat and Slackware, too).
Tracked it down a couple of days before this thread (damn!) and was wondering whether to bring it up here or at my host. Guess I'll point him here. I wrote a little process monitoring script that found it (live, debug and learn). I'm about to move from UML to a dedicated host, so I guess slocate will stay disabled (/etc/cron.never). |
|
|
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20067
|
Posted: Wed Dec 31, 2003 5:39 am Post subject: |
|
|
Moved from Installing Gentoo. _________________ Quis separabit? Quo animo? |
|
|
|
incubator Guru
Joined: 05 Jun 2003 Posts: 584 Location: Belgium
|
Posted: Wed Dec 31, 2003 12:11 pm Post subject: |
|
|
Update: X crashed again during the night, and even though all apps were shut down, I was left with 235MB of RAM free.
That should be 435MB after a restart. |
|
|
|
incubator Guru
Joined: 05 Jun 2003 Posts: 584 Location: Belgium
|
Posted: Fri Jan 02, 2004 12:55 pm Post subject: |
|
|
And apparently, reiserfs IS affected as well.
I had 450MB of RAM free; after updatedb finished, 100MB was left
(still the case after 30 minutes),
and my entire root (all storage etc.) is reiserfs. Only my /boot is ext3. |
|
|
|
Epyon l33t
Joined: 11 Sep 2003 Posts: 754 Location: NJ, USA
|
Posted: Fri Jan 02, 2004 2:05 pm Post subject: |
|
|
I have the same problem with every version of 2.6.0 I have tried. I'm using reiserfs. The slocate job ran over an hour ago and half my 512mb of ram is still apparently being used. |
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Sat Jan 03, 2004 9:07 am Post subject: Culprit |
|
|
Look at your /proc/slabinfo figures for inode_cache and dentry_cache. Those two are responsible for the problems we've been seeing. If you get hold of Larry McVoy's lmbench programs, running lmdd will force a reclaim and you get all that memory freed. Caveat: Exit X11 before running it. See http://www.ussg.iu.edu/hypermail/linux/kernel/0112.1/0072.html for more details. _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
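A minimal sketch for pulling those figures out of /proc/slabinfo. It assumes the first four columns are cache name, active objects, total objects and object size in bytes (which, as far as I know, holds for both the 2.4 and 2.6 layouts), and it estimates each cache as total objects times object size, ignoring per-slab overhead. The function name `slab_cache_kb` is invented here, and the pattern will also catch any filesystem-specific caches whose names contain inode_cache.

```shell
#!/bin/sh
# slab_cache_kb: hypothetical helper. Reads slabinfo-style text on stdin
# and prints an estimated size in KB for every cache whose name contains
# "inode_cache" or "dentry", followed by a total.
# Assumed columns: $1 = name, $3 = total objects, $4 = object size (bytes).
slab_cache_kb() {
    awk '$1 ~ /inode_cache|dentry/ {
        kb = $3 * $4 / 1024
        printf "%s %d\n", $1, kb
        total += kb
    }
    END { printf "total %d\n", total }'
}
```

Run it as root with `slab_cache_kb < /proc/slabinfo`, once before and once after an updatedb run, and compare the totals.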
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Sat Jan 03, 2004 10:56 am Post subject: Re: Culprit |
|
|
alexbuell wrote: | Look at your /proc/slabinfo figures for inode_cache and dentry_cache. Those two are responsible for the problems we've been seeing. If you get hold of Larry McVoy's lmbench programs, running lmdd will force a reclaim and you get all that memory freed. Caveat: Exit X11 before running it. See http://www.ussg.iu.edu/hypermail/linux/kernel/0112.1/0072.html for more details. |
If it's the same problem we're experiencing, umm wow... That post was from over two years ago! Well I'm off to test lmdd and see what happens. _________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth |
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Sat Jan 03, 2004 11:22 am Post subject: Re: Culprit |
|
|
M104 wrote: | alexbuell wrote: | Look at your /proc/slabinfo figures for inode_cache and dentry_cache. Those two are responsible for the problems we've been seeing. If you get hold of Larry McVoy's lmbench programs, running lmdd will force a reclaim and you get all that memory freed. Caveat: Exit X11 before running it. See http://www.ussg.iu.edu/hypermail/linux/kernel/0112.1/0072.html for more details. |
If it's the same problem we're experiencing, umm wow... That post was from over two years ago! Well I'm off to test lmdd and see what happens. |
Well yes, the VM sucks mightily in not reclaiming memory aggressively enough. _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Sat Jan 03, 2004 12:16 pm Post subject: |
|
|
Alright, results after a fresh reboot:
Code: |
#free -mt
             total       used       free     shared    buffers     cached
Mem:           756         68        688          0          2         41
-/+ buffers/cache:         24        732
Swap:          972          0        972
Total:        1729         68       1660 |
Only 24 MB being used. Now let's activate the bug and see how much we have left:
Code: |
#updatedb
#free -mt
             total       used       free     shared    buffers     cached
Mem:           756        493        263          0        162         45
-/+ buffers/cache:        285        470
Swap:          972          0        972
Total:        1729        493       1235 |
OK, so I've "lost" 260 MB of memory. Now let's see what I can reclaim:
Code: |
#lmdd opat=1 count=1 bs=900m
#free -mt
             total       used       free     shared    buffers     cached
Mem:           756         87        669          0          0          3
-/+ buffers/cache:         84        672
Swap:          972         72        900
Total:        1729        159       1570
|
So, there's still an additional 150MB or so that was not reclaimed in this case. That memory isn't cached or buffered, though, so it's technically unavailable to the system, right? That sounds like a memory leak to me, but I'm new to debugging like this. _________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth |
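The arithmetic behind these three runs can be sketched as follows. This assumes the older `free` layout shown above, with its `-/+ buffers/cache:` line (newer procps releases print a different layout), and the helper name `used_minus_cache_mb` is made up. As far as I understand it, slab caches such as the dentry and inode caches are counted by this version of free as plain "used" memory, not as buffers or cached, which would explain why the growth shows up on that line.

```shell
#!/bin/sh
# used_minus_cache_mb: hypothetical helper. Reads old-style `free -m`
# output on stdin and prints the "used" figure from the
# "-/+ buffers/cache:" line (memory in use, excluding buffers and cache).
used_minus_cache_mb() {
    awk '$1 == "-/+" && $2 == "buffers/cache:" { print $3 }'
}

# Figures taken from the three runs above, in MB.
baseline=24          # after a fresh reboot
after_updatedb=285   # after running updatedb
after_lmdd=84        # after lmdd with bs=900m
swap_after_lmdd=72   # swap in use after lmdd

echo "grown during updatedb: $((after_updatedb - baseline)) MB"
echo "still held in RAM after lmdd: $((after_lmdd - baseline)) MB"
echo "pushed out to swap by lmdd: ${swap_after_lmdd} MB"
```

By this accounting, 261 MB was taken during updatedb, 60 MB was still held in RAM after lmdd, and another 72 MB was sitting in swap. To read the figure live, use `free -m | used_minus_cache_mb` on a system whose free still prints that line.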
|
|
|
alexbuell Guru
Joined: 18 Jul 2002 Posts: 490 Location: "Hemp"shire, UK
|
Posted: Sat Jan 03, 2004 12:56 pm Post subject: |
|
|
M104 wrote: |
So, there's still an additional 150MB or so that was not reclaimed in this case. That memory isn't cached or buffered, though, so it's technically unavailable to the system, right? That sounds like a memory leak to me, but I'm new to debugging like this. |
Try it with bs=1500m (i.e. total amount of memory available (includes swap)) _________________ Cheers,
Alex.
Linux - the best text adventure game ever. |
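A sketch for working out that figure instead of guessing it: add MemTotal and SwapTotal from /proc/meminfo and print the matching lmdd command line, with the same options M104 used earlier. The helper names `total_mem_mb` and `lmdd_command` are invented for illustration, and the command is only echoed, not run.

```shell
#!/bin/sh
# total_mem_mb: hypothetical helper. Reads /proc/meminfo-style text on
# stdin and prints RAM plus swap in whole MB (rounded down).
total_mem_mb() {
    awk '$1 == "MemTotal:" || $1 == "SwapTotal:" { kb += $2 }
         END { printf "%d\n", kb / 1024 }'
}

# lmdd_command: prints (does not run) an lmdd command sized to RAM + swap.
# It reads the same meminfo text on stdin and passes it to total_mem_mb.
lmdd_command() {
    echo "lmdd opat=1 count=1 bs=$(total_mem_mb)m"
}
```

Usage would be `lmdd_command < /proc/meminfo`. For a box with 756 MB of RAM and 972 MB of swap, as in M104's free output, that comes to 1728m, not 1500m.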
|
|
|
M104 Tux's lil' helper
Joined: 13 Jan 2003 Posts: 132 Location: Riverside, CA
|
Posted: Sat Jan 03, 2004 10:11 pm Post subject: |
|
|
alexbuell wrote: | M104 wrote: |
So, there's still an additional 150MB or so that was not reclaimed in this case. That memory isn't cached or buffered, though, so it's technically unavailable to the system, right? That sounds like a memory leak to me, but I'm new to debugging like this. |
Try it with bs=1500m (i.e. total amount of memory available (includes swap)) |
I tried it with the maximum bs=???m that it would let me (1600m, for the first run) and then increased it as more memory was reclaimed. I got to about 1720m before it wouldn't reclaim any more. At that point though, I was still out about 30 MB in "lost" memory and it took a while to get to that point. So I guess it is possible to reclaim most of the lost memory, but it's still pretty gross.
EDIT: I went looking through the kernel source for any hints as to where this lost memory might be going and the first thing I saw is that I don't know anything about the kernel source... _________________ "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions."
Terry Pratchett, The Truth
Last edited by M104 on Sat Jan 03, 2004 10:15 pm; edited 1 time in total |
|
|
|
|