I initially tracked it down to FAM (File Alteration Monitor), but it has turned out to be the calls that FAM makes: opendir, readdir and closedir.
Now (and here is the weird thing), it only happens on 1 of 5 machines that are pretty much identical in configuration, and only in certain directories. I am using ext3 and don't have any other filesystem types to try. I am including some code that demonstrates the problem.
WARNING: If you have the problem, this program will crash your machine or force you to reboot it. Do not run it on critical machines or if you are not prepared to reboot.
First the vital statistics:
Kernel: gentoo-sources-2.4.22-r3
Config:
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_XATTR is not set
# CONFIG_EXT3_FS_XATTR_SHARING is not set
# CONFIG_EXT3_FS_XATTR_USER is not set
# CONFIG_EXT3_FS_XATTR_TRUSTED is not set
# CONFIG_EXT3_FS_POSIX_ACL is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
Test program:
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <dirent.h>
#include <stdio.h>

// WARNING: May cause your machine to crash!
int main(void)
{
    // Open the current directory, read one entry and close it, forever.
    for (;;)
    {
        DIR *dir = opendir(".");
        if (dir == NULL)
        {
            perror("opendir");
            continue;
        }
        readdir(dir);
        closedir(dir);
    }
}
Available memory falls quickly, and cannot be recovered even by stopping the program. Change the loop to run a specific number of times to see the loss without killing the machine.
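For anyone who wants to try the bounded version, here is a rough sketch of what I mean. It assumes Linux and uses sysinfo(2) to read free RAM before and after a fixed number of cycles. The function names (free_kb, run_bounded, report_loss) are just ones I made up for this example, and the free figure excludes page cache, so small differences are noise rather than evidence of the leak.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/sysinfo.h>

/* Free RAM in kB as reported by the Linux-specific sysinfo(2); -1 on error. */
long free_kb(void)
{
    struct sysinfo si;
    if (sysinfo(&si) != 0)
        return -1;
    return (long)((unsigned long long)si.freeram * si.mem_unit / 1024);
}

/* Run the opendir/readdir/closedir cycle a fixed number of times.
   Returns how many cycles managed to open the directory. */
long run_bounded(const char *path, long iterations)
{
    long ok = 0;
    for (long i = 0; i < iterations; i++) {
        DIR *dir = opendir(path);
        if (dir == NULL) {
            perror("opendir");
            continue;
        }
        readdir(dir);
        closedir(dir);
        ok++;
    }
    return ok;
}

/* Print free memory before and after a bounded run (illustrative helper). */
void report_loss(const char *path, long iterations)
{
    long before = free_kb();
    long done = run_bounded(path, iterations);
    long after = free_kb();
    printf("%ld cycles in %s: free %ld kB -> %ld kB (delta %ld kB)\n",
           done, path, before, after, before - after);
}
```

Calling report_loss(".", 10000) from main in an affected directory, and again in an unaffected one, should make the difference visible without running the machine out of memory. Start with a small count and increase it.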
If anyone can work out what the common factor in all this is, I would be grateful.
Thanks.

