Gentoo Forums
[TIP] Find the biggest files on your HD
bLUEbYTE84
Guru


Joined: 21 Jul 2006
Posts: 566
Location: universe.tar.gz, src/earth.h, struct homo_sapiens_table

Posted: Mon Jan 01, 2007 12:47 am    Post subject: [TIP] Find the biggest files on your HD

Hi,
Here is a script you can use to find the 20 biggest files on your file system (the number can be adjusted, of course). I wrote it so that I can check whether I have left big files such as .mpeg or .iso files lying around before doing a full system backup with dar. I don't want those ending up in the compressed backup archive; I always burn them to DVDs instead. Enjoy.

Code:
#!/bin/sh
# Show the NUM biggest files, skipping /proc, /dev, /sys and /mnt.
NUM=20
echo "Note: It is advised to run this script as root"
echo "Please Wait... Creating the list."
echo "-----------------------------------------------------------------------------------------"
# Print "<size> bytes <path>" for every entry, then delete the excluded trees.
# Each pattern is anchored to the start of the path, so a file such as
# /home/user/dev/big.iso is not dropped by accident.
find / -printf "%s bytes %p\n" \
| sed -e '/^[0-9]* bytes \/proc/d' \
| sed -e '/^[0-9]* bytes \/dev/d' \
| sed -e '/^[0-9]* bytes \/sys/d' \
| sed -e '/^[0-9]* bytes \/mnt/d' > ~/listtemp.txt
echo "-----------------------------------------------------------------------------------------"
echo "Filtered List Created. Now sorting..."
# Sort numerically on the size column, biggest first.
sort -rn ~/listtemp.txt > ~/list.txt
rm ~/listtemp.txt
echo "-----------------------------------------------------------------------------------------"
echo "Here is the list of top $NUM biggest files excluding those in /proc, /sys, /dev and /mnt:"
head -n "$NUM" ~/list.txt
rm ~/list.txt
 


Last edited by bLUEbYTE84 on Wed Jan 10, 2007 4:10 pm; edited 1 time in total
SkyeAdun
n00b


Joined: 24 Oct 2006
Posts: 56
Location: France

Posted: Mon Jan 01, 2007 11:18 am

Hi.

Sorry to correct you, but this could be more efficient. When scripting, it is important to avoid unnecessary pipes and temporary files: they cost a lot of time, and a fixed file name prevents concurrent use of the script (~/list.txt does not depend on the process, so two runs would overwrite each other's files). Also, if you do use temporary files, don't forget to trap interruptions so that the files are removed.

Thank you for posting your tip, it could be useful. Here is the script I propose (not tested):

Code:

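#!/bin/sh
echo "Note: It is advised to run this script as root"
echo "Please Wait... Creating the list."
# Collect the find output in a variable instead of a temporary file.
# A single sed process applies all four filters; each pattern is anchored
# to the start of the path.
buffer=$(find / -printf "%s bytes %p\n" \
           | sed -e '/^[0-9]* bytes \/proc/d' \
                 -e '/^[0-9]* bytes \/dev/d' \
                 -e '/^[0-9]* bytes \/sys/d' \
                 -e '/^[0-9]* bytes \/mnt/d')
echo "Filtered List Created. Now sorting..."
NUM=20
# Keep only the NUM biggest entries.
buffer=$(printf '%s\n' "$buffer" | sort -rn | head -n "$NUM")
echo "Here is the list of top $NUM biggest files excluding those in /proc, /sys, /dev and /mnt:"
printf '%s\n' "$buffer"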


We can certainly avoid the sed filtering by using find options, but I can't test that right now. It may also be useful to stop find from descending into other mounted filesystems with find / -xdev; that seems to be what you wanted to achieve by filtering /mnt.
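
For anyone who prefers to keep the temporary file, the advice above about trapping interruptions could be sketched roughly as follows. This is only a sketch under assumptions: biggest_files is a made-up helper name, mktemp is assumed to be available, and GNU find is assumed for -printf.

```shell
#!/bin/sh
# biggest_files is a hypothetical helper, not part of the original scripts.
# Usage: biggest_files [DIR] [NUM]
biggest_files() {
    dir=${1:-/}      # where to search (defaults to /)
    num=${2:-20}     # how many entries to show (defaults to 20)
    (
        # mktemp gives every run its own file, so concurrent runs do not clash.
        tmp=$(mktemp) || exit 1
        # Remove the file when the subshell exits...
        trap 'rm -f "$tmp"' EXIT
        # ...and turn Ctrl-C / kill into a normal exit so the EXIT trap runs.
        trap 'exit 1' INT TERM
        find "$dir" -xdev -type f -printf '%s bytes %p\n' 2>/dev/null > "$tmp"
        sort -rn "$tmp" | head -n "$num"
    )
}
```

Calling biggest_files / 20 as root would give roughly the same list as the scripts above, while leaving no stray file behind if the run is interrupted.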
SkyeAdun
n00b


Joined: 24 Oct 2006
Posts: 56
Location: France

Posted: Tue Jan 02, 2007 9:50 am

Hello

Here is the best version:

Code:

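#!/bin/sh
echo "Note: It is advised to run this script as root"
echo "Please Wait... Creating the list."
# -xdev keeps find on the root filesystem, so other mount points are skipped.
# The -prune branch skips /proc, /sys and /dev even if they are not separate
# mounts; -type f limits the output to regular files.
buffer=$(find / -xdev \( -path /proc -o -path /sys -o -path /dev \) -prune \
           -o -type f -printf "%s bytes %p\n" 2>/dev/null)
echo "Filtered List Created. Now sorting..."
NUM=20
# Keep only the NUM biggest entries.
buffer=$(printf '%s\n' "$buffer" | sort -rn | head -n "$NUM")
echo "Here is the list of top $NUM biggest files excluding those in /proc, /sys, /dev and mount points:"
printf '%s\n' "$buffer"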


Hope this helps. It is always important to pay attention to optimization. The next step would be to make $NUM a parameter...
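
Turning $NUM into a parameter might look something like this. It is a minimal sketch: parse_num is a hypothetical helper name, and the default of 20 is simply carried over from the scripts above.

```shell
#!/bin/sh
# parse_num is a hypothetical helper, not part of the original scripts.
# It prints the requested count, falling back to 20 when no argument is
# given, and fails if the argument is not a plain non-negative integer.
parse_num() {
    num=${1:-20}
    case $num in
        *[!0-9]*)
            echo "usage: biggest [NUM]" >&2
            return 1
            ;;
    esac
    echo "$num"
}
```

In the script, the line NUM=20 would then become NUM=$(parse_num "$1") || exit 1, so that running the script with the argument 50 lists the 50 biggest files.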
Dralnu
Veteran


Joined: 24 May 2006
Posts: 1919

Posted: Wed Jan 03, 2007 3:58 am

Maybe you could comment your script, so that those of us who want to can read through it without spending a lot of time hunting down the hows and whys of it?

It's good coding practice to do so anyway. It makes the script easier to maintain.
_________________
The day Microsoft makes a product that doesn't suck, is the day they make a vacuum cleaner.
nanafunk
n00b


Joined: 29 Jun 2005
Posts: 36

Posted: Sun Jan 07, 2007 11:48 pm

Here is an alias I use to do this:
Code:
find / \( -type d -regex '^/\(dev\|lost\+found\|mnt\|proc\)' -prune -o -type f -printf "%-12s%p\n" \) | sort -rn -k1,1 | less

The shell may appear blank for a few seconds, depending on your system, while less fills up with the results.
Conan
Guru


Joined: 02 Nov 2004
Posts: 360

Posted: Mon Jan 08, 2007 2:20 am

filelight is a nice GUI program that also displays file sizes, for those who would rather have a graphical representation :)

Not a bad script though: fairly self-explanatory and logical, if a bit odd. I'd suggest giving find a -size +50M (or some other threshold) to reduce the number of paths stored.
likewhoa
l33t


Joined: 04 Oct 2006
Posts: 778
Location: Brooklyn, New York

Posted: Thu Feb 01, 2007 12:31 pm

Another way is to add the -size flag to find so that it only reports files above a certain size. This speeds up the whole process a bit, since far fewer paths have to be printed and sorted.

example:

find / -regex '^/\(dev\|sys\|lost\+found\|mnt\|proc\)' -prune -o -type f -size +200M -printf "%-12s%p\n"

This searches from the root (/), skipping the /dev, /sys, /lost+found, /mnt and /proc directories, and prints the files larger than 200 MiB.

Read the find man page for more info on the -size flag.
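
Putting the -size idea together with the sort and head steps from the earlier posts might look roughly like this. It is only a sketch: big_files is a hypothetical name, GNU find is assumed for -printf, and -xdev is used here in place of the regex prune, which is a substitution and not what the command above does.

```shell
#!/bin/sh
# big_files is a hypothetical helper, not part of the original posts.
# Usage: big_files [DIR] [MIN_SIZE] [NUM]
#   MIN_SIZE uses find's -size syntax, e.g. +200M or +50M.
big_files() {
    dir=${1:-/}        # where to search
    min=${2:-+200M}    # size threshold passed straight to find -size
    num=${3:-20}       # how many entries to show
    # Only files above the threshold are printed, so sort has little to do.
    find "$dir" -xdev -type f -size "$min" -printf '%s %p\n' 2>/dev/null \
        | sort -rn | head -n "$num"
}
```

For example, big_files / +50M 10 would list at most ten files larger than 50 MiB on the root filesystem. Note that find rounds sizes up to whole units before comparing, so the man page is worth a look for the exact boundary behaviour.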
All times are GMT