Gentoo Forums
Script to cut rsync file list by upto 95%

doppelganger
Tux's lil' helper
Joined: 30 Jun 2004
Posts: 84

Posted: Fri Aug 26, 2005 4:42 pm    Post subject: update

Well, I re-ran emerge --sync and everything seemed to work fine that time. I guess the first run creates the files it uses for the second --sync. <SOLVED>

ZoeF
n00b
Joined: 02 Jul 2005
Posts: 17

Posted: Sun Aug 28, 2005 4:24 pm

Thank you!

I love this script. My rsyncs are less than a tenth of what they used to be.

gentop
l33t
Joined: 29 Nov 2004
Posts: 639

Posted: Tue Oct 18, 2005 9:53 am

Hi,

qpkg is deprecated now. So if you run prlock.py and qpkg is not found, edit the script and replace
Code:
pkgl = os.popen('/bin/bash /usr/bin/qpkg -nc -I','r').readlines()
with
Code:
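pkgl=[]
for i in os.listdir('/var/db/pkg'):
        if not os.path.isdir('/var/db/pkg/%s' % (i)):
                continue
        for j in os.listdir('/var/db/pkg/%s' % (i)):
                # Strip the version: 'category/name-1.2.3' becomes 'category/name'
                m = re.search(r'^(.*/.*)-[0-9].*$',"%s/%s" % (i,j))
                # Skip entries without a version suffix instead of crashing on None
                if m and m.group(1) not in pkgl:
                        pkgl.append(m.group(1))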
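
For readers on a newer Python, the same scan of /var/db/pkg can be sketched as a standalone function. This is only an illustration, not part of prlock.py: the name installed_packages and the vdb_root parameter are made up here so the function can be pointed at a test directory.

```python
import os
import re

# Greedy match keeps hyphenated names whole:
# 'font-adobe-100dpi-1.0.0' yields 'font-adobe-100dpi'.
_VERSIONED = re.compile(r'^(.+)-[0-9].*$')


def installed_packages(vdb_root='/var/db/pkg'):
    """Return sorted 'category/name' strings for the entries under vdb_root."""
    found = set()
    for category in os.listdir(vdb_root):
        cat_dir = os.path.join(vdb_root, category)
        if not os.path.isdir(cat_dir):
            continue  # ignore stray files at the top level
        for entry in os.listdir(cat_dir):
            match = _VERSIONED.match(entry)
            if match:  # skip anything that does not look like name-version
                found.add('%s/%s' % (category, match.group(1)))
    return sorted(found)
```

Using a set removes the need for the index/except dance, and several installed versions of one package collapse into a single entry.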


If you still want to use qpkg, look at /usr/lib/gentoolkit/bin/qpkg. So you can also change
Code:
pkgl = os.popen('/bin/bash /usr/bin/qpkg -nc -I','r').readlines()
to
Code:
pkgl = os.popen('/bin/bash /usr/lib/gentoolkit/qpkg -nc -I','r').readlines()
But this is deprecated!

//gentop

_hephaistos_
Advocate
Joined: 07 Apr 2004
Posts: 2694
Location: salzburg, austria

Posted: Tue Oct 18, 2005 5:33 pm

@gentop: the path should be: /usr/lib/gentoolkit/bin/qpkg

but thanks for the replace code!

cheers
_________________
-l: signature: command not found

Yonathan
l33t
Joined: 05 Jan 2005
Posts: 662

Posted: Thu Dec 29, 2005 3:27 pm

When I add two packages with:
prlock A
prlock B
I can't find A in rsync_excludes, so I have to re-add A manually to rsync_...

Why? Does anyone have a solution for this problem?

yona
_________________
Athlon XP+ 2400 Thunderbird,
Abit NF7
1536MB DDR (266),
Radeon 9200 (256mb)
gentoo 2.6.19-r5

indanet
n00b
Joined: 05 Sep 2004
Posts: 54

Posted: Fri Dec 30, 2005 8:47 pm

Hi!
Yonathan wrote:
When I add two packages with:
prlock A
prlock B
I can't find A in rsync_excludes, so I have to re-add A manually to rsync_...

I can't reproduce your problem. Here's my version of the script:
[removed, see below]
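
One possible explanation, judging only from the version of prlock.py posted later in this thread: the script rewrites the excludes file from scratch on every run, using the installed packages plus the arguments of that run, so a package named on an earlier command line is dropped unless it has been installed in the meantime. A minimal sketch of remembering such packages between runs follows. The helper name remember_extras and the idea of a separate extras file are assumptions for illustration, not something the script does.

```python
import os


def remember_extras(extras_path, new_packages):
    """Merge new 'category/package' names into a persistent list and return it."""
    kept = []
    if os.path.isfile(extras_path):
        with open(extras_path) as handle:
            kept = [line.strip() for line in handle if line.strip()]
    for name in new_packages:
        if name not in kept:
            kept.append(name)
    # Write the merged list back so the next run sees it.
    with open(extras_path, 'w') as handle:
        handle.write(''.join(name + '\n' for name in kept))
    return kept
```

The returned list could then be appended to the package list in place of the raw command-line arguments.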


Last edited by indanet on Sat Dec 31, 2005 11:52 am; edited 2 times in total

Gentree
Watchman
Joined: 01 Jul 2003
Posts: 5350
Location: France, Old Europe

Posted: Sat Dec 31, 2005 12:39 am

Nice idea. I've been wanting to sort out some form of rsync_excludes for a long time, but I could never get it to work even with a trivial exclude file. Never sussed why not, and never got any help. :(

One fault is that it assumes default portage directories. You should probably read $PORTDIR etc. from make.conf and use those values instead.

Equally, you check for the presence of rsync_excludes and abort if it is not at the standard location. It would be better to read the value that is configured and use it (heck, you've read this line in already, just parse out the filename).

Great idea anyway; it should seriously lighten the load on the mirrors. 8)
_________________
Linux, because I'd rather own a free OS than steal one that's not worth paying for.
Gentoo because I'm a masochist
AthlonXP-M on A7N8X. Portage ~x86

indanet
n00b
Joined: 05 Sep 2004
Posts: 54

Posted: Sat Dec 31, 2005 11:57 am

Gentree wrote:
One fault is that it assumes default portage directories. You should probably read $PORTDIR etc. from make.conf and use those values instead.

Equally, you check for the presence of rsync_excludes and abort if it is not at the standard location. It would be better to read the value that is configured and use it (heck, you've read this line in already, just parse out the filename).


This version of the script searches make.conf as requested and checks if the environment variables $PORTDIR or $RSYNC_EXCLUDES are set.

prlock.py
Code:
#!/usr/bin/python
#
# prlock.py (Portage Rsync Lockdown)
# This script creates an exclude list for portage's rsync. The exclude list
# excludes all but the installed packages (found in /var/db/pkg).
# Specify 'branch-name/package-name' arguments to unlock additional packages
# that are not yet installed.
# Remember to edit make.conf and make the dir /etc/portage
#
# *WARNING*
# If you use this script and find yourself wanting to add a new package...
# You MUST either add it to the exclude list (By passing it as an arg to
# prlock) or comment out the exclude list entirely in make.conf.
# Then do a sync
# THEN add the package
# If you don't you will add a package that may be out of date!
#
# Eg.....
#     > prlock.py dev-python/wxPython
# This will restrict the rsync to only the installed packages AND wxPython
# All other packages will not be updated.
#
# Nick Fisher prlock@nickdafish.com
#
import os, re, sys

# Quit if an arg that doesn't look like a package is encountered.
for arg in sys.argv[1:]:
    if not re.match('[-\w]+/[-\w]+',arg):
        print 'Line doesn\'t look like it is a package....\n',arg,'\n','...aborting.'
        sys.exit(1)

# Trying to get portage dir and exclude file from environment
excl = os.environ.get('RSYNC_EXCLUDES', '')
portdir = os.environ.get('PORTDIR', '')

# Extract path of exclude file from make.conf
re_excl = re.compile('^\s*RSYNC_EXCLUDEFROM="?(?P<excl>(/[-\w]+)+)"?\s*$')
re_portdir = re.compile('^\s*PORTDIR="?(?P<portdir>(/[-\w]+)+)"?\s*$')
for line in open('/etc/make.conf','r').readlines():
    if not len(excl):
        res = re_excl.match(line)
        if res: excl = res.group('excl')
    if not len(portdir):
        res = re_portdir.match(line)
        if res: portdir = res.group('portdir')

if not len(excl):
    print 'WARNING! Could not get value of RSYNC_EXCLUDEFROM! Using default...'
    excl = '/etc/portage/rsync_excludes'
    print 'See make.conf(5) for help on setting the variable'
    print 'Portage will not exclude without it!'

if not len(portdir):
    portdir = '/usr/portage'

# Check that we will be able to write the file
#print 'Using',portdir,'as portage directory'
if not os.path.isdir(portdir):
    print portdir,"does not exist!\nPlease make the dir and try again."
    sys.exit(1)
#print 'Using',excl,'as exclude file'
if not os.path.isfile(excl):
    print excl,"does not exist!\nPlease make the file and try again."
    sys.exit(1)

# Find all the installed packages
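pkgl=[]
for i in os.listdir('/var/db/pkg'):
    if not os.path.isdir('/var/db/pkg/%s' % (i)):
        continue
    for j in os.listdir('/var/db/pkg/%s' % (i)):
        # Strip the version: 'category/name-1.2.3' becomes 'category/name'
        m = re.search(r'^(.*/.*)-[0-9].*$',"%s/%s" % (i,j))
        # Skip entries without a version suffix instead of crashing on None
        if m and m.group(1) not in pkgl:
            pkgl.append(m.group(1))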

# Add the args
for arg in sys.argv[1:]:
    pkgl.append(arg)
    print 'Adding package from cmd line ',arg

# Add the branches to list and format
fmt = re.compile(r'[-\w]*/')
list=[]
for line in pkgl:
    m = fmt.search(line)
    if not('+ '+m.group()+'\n' in list): list.append('+ '+m.group()+'\n')
    if not('+ metadata/cache/'+m.group()+'\n' in list): list.append('+ metadata/cache/'+m.group()+'\n')

# Add the package dir and all subdirs/files to list
for x in range(len(pkgl)):
    list.append('+ '+pkgl[x].strip()+'/\n')
    list.append('+ '+pkgl[x].strip()+'/**\n')
    list.append('+ metadata/cache/'+pkgl[x].strip()+'*\n')

# Append the directives to allow access to the metadata
list.append('+ metadata/\n')
list.append('+ metadata/*\n')
list.append('+ metadata/cache/\n')
list.append('+ metadata/cache/*\n')

# Allow sync for eclasses
# I don't know how I could exclude them well
list.append('+ eclass/\n')
list.append('+ eclass/**\n')

# Allow sync for files
# These shouldn't be excluded
list.append('+ files/\n')
list.append('+ files/**\n')

# Allow sync for libsidplay
# I haven't a clue what this branch is about
list.append('+ libsidplay/\n')
list.append('+ libsidplay/**\n')

# Allow sync for licenses
# Shouldn't be updated that often
list.append('+ licenses/\n')
list.append('+ licenses/**\n')

# Allow sync for packages
# Dunno what this branch is about
# Include for safety's sake
list.append('+ packages/\n')
list.append('+ packages/**\n')

# Allow sync for profiles
# Should smarten this up at some point
list.append('+ profiles/\n')
list.append('+ profiles/**\n')

# Allow sync for scripts
list.append('+ scripts/\n')
list.append('+ scripts/**\n')

# Sort the list for readability
list.sort()

# Add the include all files and exclude everything else
list.append('+ /*\n- *\n')

# Write the pkg list out to the exclude file
out = open(excl,'w')
out.writelines(list)
out.close()


I started to write a version which uses the gentoolkit and portage packages, which has colorized output. Maybe I'll be able to finish that in the next few days...

Best regards
indanet

Gentree
Watchman
Joined: 01 Jul 2003
Posts: 5350
Location: France, Old Europe

Posted: Sat Dec 31, 2005 12:42 pm

Quote:
This version of the script searches make.conf as requested and checks if the environment variables $PORTDIR or $RSYNC_EXCLUDES are set.


That was fast, nice work.

Now to see if I can get rsync_excludes to work at all here; so far it's rsync=rs-ache. :wink:
_________________
Linux, because I'd rather own a free OS than steal one that's not worth paying for.
Gentoo because I'm a masochist
AthlonXP-M on A7N8X. Portage ~x86

IQgryn
l33t
Joined: 05 Sep 2005
Posts: 764
Location: WI, USA

Posted: Tue Jan 10, 2006 6:16 am    Post subject: Suggestions

I have a couple of suggestions:
  • Allow people to add their own rules. I have a few ideas on how to do this:
    • Add options (perhaps flags) to insert the contents of another file before and/or after the prlock output.
    • Instead of deleting and recreating the existing excludes file, only modify it (and perhaps have a flag to delete and re-create)
    • If there's an existing file, move it to rsync_excludes.old, and output to rsync_excludes.new. After the new file is created, run
      Code:
      sdiff -o rsync_excludes rsync_excludes.old rsync_excludes.new

  • Allow the line that includes all files and folders in the root directory of the portage tree to be excluded (probably another flag). I know that this makes the script less general, but it irks me that there's a bunch of empty folders lying around. :roll: Another option would be to specifically exclude all the top-level folders that aren't used, which would be a little harder, but would still keep the generality.
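
The first suggestion (splicing the contents of another file in before and/or after the generated rules) could look roughly like this. It is a sketch only: read_rules, assemble and the two path parameters are hypothetical names, and nothing like this exists in the posted prlock.py.

```python
import os


def read_rules(path):
    """Return the non-empty lines of an optional rules file, or [] if absent."""
    if not path or not os.path.isfile(path):
        return []
    with open(path) as handle:
        return [line.rstrip('\n') for line in handle if line.strip()]


def assemble(generated, before_path=None, after_path=None):
    """Place user rules around the generated ones and return the file text."""
    rules = read_rules(before_path) + list(generated) + read_rules(after_path)
    return ''.join(rule + '\n' for rule in rules)
```

Because rsync acts on the first rule that matches, user rules in the "before" file take precedence over the generated ones, while anything placed after a final catch-all such as '- *' would never be reached.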

indanet
n00b
Joined: 05 Sep 2004
Posts: 54

Posted: Sat Jan 14, 2006 7:42 pm

This would be great:
If a package is added on the command line, automatically add its dependencies. (This is probably not possible because the ebuild needed to look them up is missing. But maybe I'm wrong.)

kernelcowboy
Guru
Joined: 14 Feb 2004
Posts: 391
Location: New Plymouth, New Zealand

Posted: Fri Mar 10, 2006 7:51 am

Very cool script. As a virtual server user over at Linode.com, I will make good use of it.

There seems to be a bit of a trap. Please follow along, and let me know if there's a way out...

Code:

emerge X
emerge: there are no ebuilds to satisfy "blah/A".
./prlock.py blah/A
emerge sync
emerge X
emerge: there are no ebuilds to satisfy "blah/B".
./prlock.py blah/B
emerge sync

>>>
>>> Timestamps on the server and in the local repository are the same.
>>> Cancelling all further sync action. You are already up to date.
>>>


If I wait a bit, I can sync again, but this message is concerning me:

Code:
Please note: common gentoo-netiquette says you should not sync more
than once a day.  Users who abuse the rsync.gentoo.org rotation
may be added to a temporary ban list.


What is the best way to deal with this?

IQgryn
l33t
Joined: 05 Sep 2005
Posts: 764
Location: WI, USA

Posted: Fri Mar 10, 2006 9:31 am

If you have another machine that keeps the full tree, you could set it up as a portage mirror and sync with it as often as you need (it would only sync with the master servers every 24 hours). Check that out at http://gentoo-wiki.com/HOWTO_Local_Rsync_Mirror. Otherwise, as long as you don't do it too often, you won't be banned (keep it to an average of 7 a week, or maybe 10 in one week if you lay off the next). If you know you'll need to make a lot of changes, download a tarball of the portage tree and extract it somewhere else on your machine. You can use rsync to update your tree from another folder on the same machine (just be sure to specify the excludes file as emerge sync normally would). Let me know if any of that is confusing; I'm running on less sleep than I prefer. :?
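
The last idea, updating the slim tree from a full copy on the same machine, might be sketched as follows. The function name and the example paths are assumptions, and the option list is a minimal guess for illustration; the set of options emerge itself passes to rsync is longer.

```python
def local_sync_command(source_tree, target_tree, excludes_file):
    """Build an rsync argument list that copies source_tree into target_tree."""
    return [
        'rsync', '--recursive', '--links', '--times', '--delete',
        '--exclude-from=' + excludes_file,
        # Trailing slash: copy the directory's contents, not the directory itself.
        source_tree.rstrip('/') + '/',
        target_tree.rstrip('/') + '/',
    ]
```

The resulting list can be handed to subprocess.call() once it looks right, for example with '/usr/portage.full' as the source, '/usr/portage' as the target and '/etc/portage/rsync_excludes' as the excludes file.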

kernelcowboy
Guru
Joined: 14 Feb 2004
Posts: 391
Location: New Plymouth, New Zealand

Posted: Fri Mar 10, 2006 8:15 pm

all good ideas, thanks!

i backed-up my portage tree locally before running rm -rf and prlock, but unfortunately, i didn't sync it beforehand. long story, but i'm out of luck there.

Is it possible to just copy bits of the tree into place? For example, my laptop is up to date. Instead of making it a portage mirror, can I just copy ebuilds over manually and stick them in the tree on the server?

Oyarsa
n00b
Joined: 01 Jul 2002
Posts: 73
Location: Mars

Posted: Sat Apr 01, 2006 10:35 pm    Post subject: How usefull is this script really?

Is anyone out there still using this script and finding it useful? It just seems to create problems for me.
For example: I perform an emerge sync and then try to emerge -u system. I often get error
messages due to missing or ~x86 masked packages which are new dependencies of my existing installed
packages. This requires a re-sync without the rsync_excludes file, which loads all those ebuilds I was trying
to keep off my machine in the first place. How are folks getting around this?
_________________
Dew knot trussed yore spell chequer two fined awl ewer miss steaks.

kernelcowboy
Guru
Joined: 14 Feb 2004
Posts: 391
Location: New Plymouth, New Zealand

Posted: Sat Apr 01, 2006 11:11 pm    Post subject: Re: How usefull is this script really?

Oyarsa wrote:
Is anyone out there still using this script and finding it useful?


I do. Yes, I like it. I use it in a virtual server enviro; there are lots of good reasons to use it.

Oyarsa wrote:
It just seems to create problems for me.
For example: I perform an emerge sync and then try to emerge -u system. I often get error
messages due to missing or ~x86 masked packages which are new dependencies of my existing installed
packages. This requires a re-sync without the rsync_excludes file, which loads all those ebuilds I was trying
to keep off my machine in the first place. How are folks getting around this?


Is the Warning part of the docs at the top of script not addressing this for you?

If you are updating the system, you should probably not exclude any of the dependencies unless you really really know what you're doing. :P

Oyarsa
n00b
Joined: 01 Jul 2002
Posts: 73
Location: Mars

Posted: Sun Apr 02, 2006 1:56 am    Post subject: Re: How usefull is this script really?

kernelcowboy wrote:
If you are updating the system, you should probably not exclude any of the dependencies unless you really really know what you're doing. :P


That's the whole point. If I'm not going to update my system, I have no need to sync in the first place.
In that case I might as well get rid of portage altogether.

My understanding is that the purpose of this script is to cut down on the number of files transferred by only
considering packages that are actually installed on the machine. Where I seem to be running into trouble
is when updates to installed packages list new packages as dependencies and the ebuilds for these new
dependencies are out of date or missing altogether. What I don't understand is how this script can be
useful if this is a common occurrence. I get the impression from your post that I need to manually edit the
rsync_excludes file that is created and remove the excludes that can potentially cause problems. Is this
correct? Maybe I'm missing something: do I need to run the script again after every sync?
_________________
Dew knot trussed yore spell chequer two fined awl ewer miss steaks.

kernelcowboy
Guru
Joined: 14 Feb 2004
Posts: 391
Location: New Plymouth, New Zealand

Posted: Sun Apr 02, 2006 2:21 am    Post subject: Re: How usefull is this script really?

Oyarsa wrote:
kernelcowboy wrote:
If you are updating the system, you should probably not exclude any of the dependencies unless you really really know what you're doing. :P


That's the whole point. If I'm not going to update my system, I have no need to sync in the first place.
In that case I might as well get rid of portage altogether.

My understanding is that the purpose of this script is to cut down on the number of files transferred by only
considering packages that are actually installed on the machine. Where I seem to be running into trouble
is when updates to installed packages list new packages as dependencies and the ebuilds for these new
dependencies are out of date or missing altogether. What I don't understand is how this script can be
useful if this is a common occurrence.


Is the system a server with a specific responsibility? I think the idea is: build up the server and, when you're happy, run this script to reduce the storage requirement. Since it's a server, you probably shouldn't update it too often. If it works, don't fix it. (I'm sure some people will disagree.) Of course, if a feature or bug fix is available, you'll update, but probably just that package. Do it deeply, so you keep things square. I do this in a virtual server environment to keep bandwidth and disk usage down, two things I pay for. The other virtual users on the box probably benefit too. Otherwise, I don't see a need for this script. Clearly, it takes a bit of work, but if the benefit is real dollars, it's likely worth it.

Oyarsa wrote:
I get the impression from your post that I need to manually edit the
rsync_excludes file that is created and remove the excludes that can potentially cause problems. Is this
correct? Maybe I'm missing something: do I need to run the script again after every sync?


I didn't write this script, nor do I really understand Python much at all. But my understanding is this: you run it, it looks at your current install and creates a list that excludes all packages you don't currently need. After following the rest of the script's instructions, emerge sync will only consider the packages you care about.

When you need something new, you have the script remove it from the exclude list.

Code:
prlock.py dev-python/wxPython


is how this is done, easily. (assuming wxPython is the package you needed, but didn't already have.)

Sometimes I find that I have to emerge sync too often with this script. I try to limit this. If you try to emerge a package with dependencies, you may only see the first dependency. You do a prlock.py theDep, then emerge sync, then emerge again. You'll then see the next set of dependencies, and you'll need to run prlock.py again. But when you emerge sync again, you'll get told off for abusing the servers.

I haven't worked out a good way to fix this yet. I've been copying ebuilds from another system, and just emerge sync'ing later to get it all proper. Seems to work. Perhaps there's a way to get portage to tell you all the dependencies you'll need to get, and if we cross-reference that through prlock.py somehow (perhaps I'll learn Python :)), we can get a list of stuff to emerge. That way, we only need to emerge sync once. (Provided you don't think up something else you want too soon.)
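
As a small step in that direction, the missing atoms can at least be pulled out of emerge's output automatically. This sketch relies only on the message format quoted earlier in the thread ('there are no ebuilds to satisfy "..."'); the function name missing_atoms is made up, and it does not remove the need for repeated syncs, since emerge only reports what it can see at each step.

```python
import re

_MISSING = re.compile(r'there are no ebuilds to satisfy "([^"]+)"')


def missing_atoms(emerge_output):
    """Return the atoms named in 'no ebuilds to satisfy' lines, without repeats."""
    atoms = []
    for atom in _MISSING.findall(emerge_output):
        if atom not in atoms:
            atoms.append(atom)
    return atoms
```

The returned names could be passed to prlock.py in one go. An atom that carries a version operator may need that stripped first, because the script's argument check expects a plain category/package name.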

lost+found
Guru
Joined: 15 Nov 2004
Posts: 509
Location: North~Sea~Coa~s~~t~~~

Posted: Thu Jun 01, 2006 6:17 pm    Post subject: Script that creates rsync excludes for Portage. (REPOST)

Hi, I made this shell script to use the rsync_excludes feature of Portage some time ago. It still works for me. But be warned: Emerge will complain sooner rather than later about missing ebuilds!
CHEEEEEEEEEERS.

Code:
#!/bin/sh

# WARNING: Use rsync_excludes at your own risk. Make backups. :-)

# This simple script creates an /etc/portage/rsync_excludes file,
# for only syncing Portage (i.e. emerge --sync) to packages you
# have been interested in. It checks the contents of /var/db/pkg/
# for that. You will get a speed gain and can free some space,
# at the cost of Emerge complaining sooner or later about missing
# ebuilds. Duh, add them manually to your existing rsync_excludes
# file. To make it work, empty your /usr/portage directory (your
# distfiles and packages can stay) or create another dir, and
# use PORTDIR="/your-dir" in /etc/make.conf. Always add
# RSYNC_EXCLUDEFROM="/etc/portage/rsync_excludes" to /etc/make.conf,
# and set your profile and architecture below:
#
# Like in: /usr/portage/profiles/$PROFILE/$ARCH/
PROFILE="default-linux"
ARCH="x86"
#
# Good luck!


# Portage default location for the rsync_excludes file.
OUTPUT="/etc/portage/rsync_excludes"


# Backup existing rsync_excludes.
if [ -e $OUTPUT ]; then
mv $OUTPUT $OUTPUT.old || exit
fi

# Create an empty rsync_excludes.
touch $OUTPUT || exit

# Add the categories/packages in /var/db/pkg/*/* (ebuilds etc.)
for i in /var/db/pkg/*/*; do
CAT="`echo $i|sed 's:/var/db/pkg/\([^ ]*\)/.*:\1:'`"
PKG="`echo $i|sed 's:/var/db/pkg/.*/\([^ ]*\)-[0123456789].*:\1:'`"
grep -qxF "+ $CAT/" $OUTPUT || echo "+ $CAT/" >> $OUTPUT
printf '+ %s/\n+ %s/**\n' "$CAT/$PKG" "$CAT/$PKG" >> $OUTPUT
done

# I think we need these...
echo "+ eclass/" >> $OUTPUT
echo "+ eclass/**" >> $OUTPUT
echo "+ metadata/" >> $OUTPUT
echo "+ metadata/*/" >> $OUTPUT

# Add metadata for the categories/packages in /var/db/pkg/*/*
for i in /var/db/pkg/*/*; do
CAT="`echo $i|sed 's:/var/db/pkg/\([^ ]*\)/.*:\1:'`"
PKG="`echo $i|sed 's:/var/db/pkg/.*/\([^ ]*\)-[0123456789].*:\1:'`"
grep -qxF "+ metadata/cache/$CAT/" $OUTPUT || echo "+ metadata/cache/$CAT/" >> $OUTPUT
echo "+ metadata/cache/$CAT/$PKG-*" >> $OUTPUT
done

# ...and these too.
echo "- metadata/cache/**" >> $OUTPUT
echo "+ metadata/dtd/**" >> $OUTPUT
echo "+ metadata/glsa/**" >> $OUTPUT
echo "+ metadata/*" >> $OUTPUT
echo "+ profiles/" >> $OUTPUT
echo "+ profiles/base/" >> $OUTPUT
echo "+ profiles/base/**" >> $OUTPUT
echo "+ profiles/desc/" >> $OUTPUT
echo "+ profiles/desc/**" >> $OUTPUT
echo "+ profiles/$PROFILE/" >> $OUTPUT
echo "+ profiles/$PROFILE/$ARCH/" >> $OUTPUT
echo "+ profiles/$PROFILE/$ARCH/**" >> $OUTPUT
echo "- profiles/$PROFILE/*/" >> $OUTPUT
echo "+ profiles/$PROFILE/*" >> $OUTPUT
echo "+ profiles/updates/" >> $OUTPUT
echo "+ profiles/updates/**" >> $OUTPUT
echo "- profiles/*/" >> $OUTPUT
echo "+ profiles/*" >> $OUTPUT
echo "+ scripts/" >> $OUTPUT
echo "+ scripts/**" >> $OUTPUT
echo "- **/" >> $OUTPUT
echo "+ *" >> $OUTPUT


Notes:
Do a second emerge --sync, if you get this error:
Code:
>>> Updating Portage cache:
 Traceback (most recent call last):
   File "/usr/bin/emerge", line 2705, in ?
     oldcat = portage.catsplit(cp_list[0])[0]
 IndexError: list index out of range

Just correct your symlink if you get this error:
Code:
!!! ARCH is not set... Are you missing the /etc/make.profile symlink?
 !!! Is the symlink correct? Is your portage tree complete?

You can even delete the contents of /var/cache/edb, as current Portage versions will recreate the cache from your smaller Portage tree ("emerge --metadata"). Never touch /var/db/pkg b.t.w.!!!

Since Portage 2.1:
Code:
WARNING: usage of RSYNC_EXCLUDEFROM is deprecated, use PORTAGE_RSYNC_EXTRA_OPTS instead
/etc/make.conf change:
Code:
PORTAGE_RSYNC_EXTRA_OPTS="--exclude-from=/etc/portage/rsync_excludes"


Script doesn't work with overlays.


Last edited by lost+found on Sat Nov 18, 2006 3:31 pm; edited 2 times in total

bur
Apprentice
Joined: 20 Feb 2004
Posts: 229

Posted: Wed Jun 28, 2006 12:12 am    Post subject: Keeping the portage tree small by using a whitelist

When syncing the portage tree, all available packages are updated, even though you will probably never use most of them. That wastes time and disk space on your box, and resources and traffic on the rsync servers. I found this thread on the German board, which inspired me to try making a whitelist that defines which parts of the portage tree are updated when syncing and which aren't. In the end I managed to shrink my /usr/portage/ directory from 121,224 files (396 MB) to 71,385 files (136 MB).

First you need to tell emerge that you want to use an exclude list for rsync:
/etc/make.conf:

PORTAGE_RSYNC_EXTRA_OPTS="--exclude-from=/etc/portage/rsync_excludes"



Now for the important part, we need to define what should be synced. To do this first find out what packages are installed on your PC.
Code:

buren ~ # ls /var/db/pkg/
app-admin       dev-java     media-libs    perl-core    x11-base
app-arch        dev-lang     media-sound   sys-apps     x11-drivers
[...]


This prints a list of all parts of the portage tree from which you emerged packages. Some of the directories might be empty; you should delete them before proceeding. Now, for each non-empty directory in /var/db/pkg, add an entry to /etc/portage/rsync_excludes like this:
/etc/portage/rsync_excludes:

+ app-admin**
+ app-arch**

The '+' tells rsync to include these parts of the portage tree when syncing. The '**' is a wildcard that matches everything, including '/'. A single '*' would only match up to the next '/', resulting in a broken tree.
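
The difference between '*' and '**' can be modeled in a few lines. This is a simplified sketch, not rsync's actual matcher: it ignores '?', character classes, anchoring with a leading '/', and the trailing-'/' directory rule. The function name rule_to_regex is made up for illustration.

```python
import re


def rule_to_regex(pattern):
    """Translate a rule body to a regex: '**' crosses '/', a single '*' does not."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith('**', i):
            parts.append('.*')        # any characters, slashes included
            i += 2
        elif pattern[i] == '*':
            parts.append('[^/]*')     # stops at the next slash
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile('^' + ''.join(parts) + '$')
```

In this model 'app-admin**' matches both 'app-admin' and everything below it, while 'app-admin*' matches the directory name but nothing inside it.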


Now that we have whitelisted all packages we want to be updated, we need to blacklist everything else. This is done by adding every part of the tree to the blacklist. The '+' entries overrule the blacklist, but only if they come before the blacklist entries, because rsync acts on the first rule that matches.
/etc/portage/rsync_excludes:

app-**
dev-**
games-**
gnome-**
gnustep-**
kde-**
mail-**
media-**
net-**
perl-**
rox-**
sci-**
sec-**
sys-**
www-**
x11-**
xfce-**


The blacklist must never swallow 'eclass' and the other vital parts of portage, or the tree would be corrupted. To be safe, whitelist those system-specific parts explicitly, too:
/etc/portage/rsync_excludes:

+ eclass**
+ licenses**
+ profiles**
+ scripts**
+ virtual**

Important: Like before, these '+'-entries need to be placed above the blacklisting. Best to put it at the top of the file. So your rsync_excludes should look like this:
/etc/portage/rsync_excludes:

#important parts of portage
+ eclass**
+ licenses**
+ profiles**
+ scripts**
+ virtual**

#whitelist all used parts of the tree
+ app-admin**
+ app-arch**
+ app-benchmarks**
+ app-crypt**
+ app-editors**
+ app-i18n**
+ app-misc**
+ app-portage**
+ app-shells**
+ app-text**
+ dev-db**
+ dev-java**
+ dev-lang**
+ dev-libs**
+ dev-perl**
+ dev-python**
+ dev-tex**
+ dev-util**
+ kde-base**
+ media-fonts**
+ media-gfx**
+ media-libs**
+ media-sound**
+ media-video**
+ net-analyzer**
+ net-dns**
+ net-firewall**
+ net-ftp**
+ net-libs**
+ net-misc**
+ net-nds**
+ net-print**
+ perl-core**
+ sys-apps**
+ sys-boot**
+ sys-devel**
+ sys-fs**
+ sys-kernel**
+ sys-libs**
+ sys-process**
+ www-client**
+ x11-apps**
+ x11-base**
+ x11-drivers**
+ x11-libs**
+ x11-misc**
+ x11-proto**
+ x11-terms**
+ x11-wm**

#blacklist everything
app-**
dev-**
games-**
gnome-**
gnustep-**
kde-**
mail-**
media-**
net-**
perl-**
rox-**
sci-**
sec-**
sys-**
www-**
x11-**
xfce-**



Now each time you want to emerge a package from a part of the tree you haven't used before, simply add it to the file. For example, if I were to emerge net-im/gaim, I would add '+ net-im**' to rsync_excludes and sync the tree. If you unmerge a package and it was the only one in its category, you can remove that category from the whitelist. Example: I only have ethereal in net-analyzer, so if I unmerge ethereal, I can remove the '+ net-analyzer**' line. You can also delete the specific directory from /usr/portage - in this case 'rm -r /usr/portage/net-analyzer'.

It can happen that a new version of a package requires a dependency that resides in a blacklisted part of the tree. In that case portage will complain, specifying what package it misses. Just add it to the whitelist-section, resync and start the emerge again.

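If you want an even smaller tree, instead of using the rather generic whitelist entries I use, you can specify the exact packages you use. For example, instead of '+ www-client**' you would use '+ www-client/mozilla**' and '+ www-client/opera**' if you had Firefox and Opera installed. In that case the category directory itself also needs a '+ www-client/' entry above the blacklist, because rsync does not descend into a directory that has been excluded.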
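
Generating such per-package entries by hand gets tedious, so here is a rough sketch of producing them from a list of category/package names. The function name package_whitelist is hypothetical; the output follows the '+ name**' style used in this post, plus one '+ category/' line per category so that rsync still enters the category directory.

```python
def package_whitelist(packages):
    """Return '+' rules for each package, preceded by its category directory."""
    lines = []
    for name in sorted(set(packages)):
        category = name.split('/', 1)[0]
        cat_rule = '+ %s/' % category
        if cat_rule not in lines:
            lines.append(cat_rule)    # let rsync descend into the category
        lines.append('+ %s**' % name)
    return lines
```

These lines would go above the blacklist section, in place of the broader '+ www-client**' style entries.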


The method described works well for me, but if you're unsure you should back up your current portage tree and delete the backup once everything turns out to work well:
Code:

mv /usr/portage /usr/portage.full

If the slim tree for whatever reason is broken, you can turn back to the old "full" tree:
Code:

rm -r /usr/portage
mv /usr/portage.full /usr/portage



Any comments or ideas how to improve this are welcome. :)

think4urs11
Bodhisattva
Joined: 25 Jun 2003
Posts: 6659
Location: above the cloud

Posted: Wed Jun 28, 2006 6:00 am

in addition to the above: https://forums.gentoo.org/viewtopic-t-55031.html

partly dupes it.
_________________
Nothing is secure / Security is always a trade-off with usability / Do not assume anything / Trust no-one, nothing / Paranoia is your friend / Think for yourself

bur
Apprentice
Joined: 20 Feb 2004
Posts: 229

Posted: Thu Jun 29, 2006 1:27 am

I was also thinking about using a script to set up the rsync_excludes file. This would make keeping it up-to-date much easier, also generating it as a whitelist that specifically whitelists single packages instead of whole "branches" (do you call it that? i mean things like kde-base, app-arch, sys-apps,...).

curtis119
Bodhisattva
Joined: 10 Mar 2003
Posts: 2160
Location: Toledo, Ohio,USA, North America, Earth, SOL System, Milky Way, The Universe, The Cosmos, and Beyond.

Posted: Thu Jun 29, 2006 2:25 am

I merged the above three posts from a duplicate.
_________________
Gentoo: it's like wiping your ass with silk.

lost+found
Guru
Joined: 15 Nov 2004
Posts: 509
Location: North~Sea~Coa~s~~t~~~

Posted: Thu Jun 29, 2006 9:25 am    Post subject: Re: Keeping the portage tree small by using a whitelist

bur wrote:
... Example: I only have ethereal in net-analyzer, so if I unmerge ethereal, I can remove the '+ net-analyzer**' line. You can also delete the specific directory from /usr/portage - in this case 'rm -r /usr/portage/net-analyzer'...


If you remove parts from the rsync_excludes file, you should *always* remove them from your Portage tree manually as well (ebuilds and metadata). Keeping stale parts in the Portage tree is not wise: Portage may reuse outdated ebuilds without warning this way.
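
Finding the leftovers can be partly automated. The sketch below only lists candidates and deletes nothing; stale_directories and its keep argument (the top-level directory names you still whitelist, plus things like eclass, profiles and metadata) are assumptions for illustration.

```python
import os


def stale_directories(portdir, keep):
    """Return top-level directories under portdir whose names are not in keep."""
    keep = set(keep)
    return sorted(
        name for name in os.listdir(portdir)
        if os.path.isdir(os.path.join(portdir, name)) and name not in keep
    )
```

Review the result before removing anything by hand, and remember that the matching metadata entries need the same treatment.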

polyacryl
n00b
Joined: 14 Sep 2003
Posts: 50

Posted: Mon Jul 10, 2006 8:38 am    Post subject: script to generate simple /etc/portage/rsync_excludes

Hello.

I wrote a simple script to generate the rsync_excludes file. As long as you don't want to mask single packages but whole categories (the stuff in /var/db/pkg/) it should be adequate.
Code:

#!/bin/zsh
#
# generates /etc/portage/rsync_excludes
# comments to asdf@uni-koblenz.de

ex=/etc/portage/rsync_excludes

# remove empty directories in /var/db/pkg/ before listing the categories
rmdir /var/db/pkg/* 2>/dev/null
db=(/var/db/pkg/*)

# whitelist system stuff
cat << EOF > $ex
+ eclass**
+ licenses**
+ profiles**
+ scripts**
+ virtual**

EOF

# whitelist used categories
for i in $db
do
        echo `basename $i` | awk '{print "+ "$NF"**"}' >> $ex
done

# blacklist everything else
cat << EOF >> $ex

**
EOF