Minimalistic Portage Tree for low-diskspace machines

Post by rauar » Sat Aug 16, 2003 12:26 pm

Hey ppl,

As far as I know, the Portage tree is always replicated completely on each Gentoo box; on my box it takes up at least 250 MB.

For most boxes this is no problem, since they have enough disk space available. On the other hand, there are a lot of boxes where disk space is scarce (my inet router: P120, 90 MB RAM, 900 MB HDD). Other tasks get pretty complicated there: I had to upgrade my gcc, and 250 MB of free disk space is not enough, so the build fails :(

So I came up with this idea:

Why replicate the complete Portage tree on each machine? Let's set up a central read-only Portage tree with all the ebuild scripts and stuff. Data that needs to be written into the tree could go into a stripped-down local version of it instead. Perhaps a dumb (XML?) file with all installed packages and versions would be enough. It looks to me like the Portage tree currently holds 57000 files and takes up 250 MB. In a compressed format (tar cfvj) it takes only 11 MB. Looks like there's room for optimization here.

Probably this would give several MBs back. But this needs some more investigation. I dunno....


Any comments?

PS: Don't tell me to buy a larger disk and that disks are getting cheaper and cheaper. Cheap hardware is no excuse for inefficiencies.


Cheerz Al
Last edited by rauar on Sat Aug 16, 2003 7:06 pm, edited 2 times in total.

Post by Wedge_ » Sat Aug 16, 2003 1:01 pm

This thread may be helpful :)
Per Ardua Ad Astra
The Earth is the cradle of the mind, but we cannot live forever in a cradle - Konstantin E. Tsiolkovsky
Gentoo Radeon FAQ

Post by rauar » Sat Aug 16, 2003 6:52 pm

Wedge_ wrote:This thread may be helpful :)
Hmm. Caching on a gateway doesn't solve the local space issue, as there's still a complete Portage tree on each box.

It looks to me like using a filesystem structure as the Portage tree "database" is simple but mostly a waste of space. Currently the 57000 files in the tree take up 250 MB. That is really a lot. Remember the compressed tar at 11 MB. I bet an XML version would take much less than those 250 MB.

Cheerz Al

Post by R0B_IX » Sat Aug 16, 2003 7:24 pm

I just had two thoughts on this, not sure if any will interest you, but, couldn't you...

1. Compress the entire tree, and then make some scripts to uncompress the tree when running an emerge operation, and then recompress it afterwards? Assuming you had those 250 megs free, it should work just fine.

2. You could put the tree on a CD-RW and then, similar to the previous script, have a script that "tricks" emerge into thinking that it is writing to the CD when doing an emerge sync, when you are actually redirecting the files somewhere else. Then you can update the CD-RW at your convenience with the latest additions/subtractions. A little more difficult than the first option, but with this option you would never need to have the 250 megs free after the initial burn (assuming the Portage tree isn't 100% redone, which I am sure would at least slightly annoy a bunch of people ;). Anyway, post back if you think of any more ideas.
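
A rough sketch of what option 1 could look like. Everything here is an assumption for illustration: the function name `with_unpacked_tree`, the archive location, and the idea that the archive contains a single top-level directory called `portage`. It is not how emerge works out of the box, and the unpacked tree still needs its full space while the command runs.

```sh
# with_unpacked_tree ARCHIVE PARENT_DIR COMMAND [ARGS...]
# Unpacks a bzip2-compressed tree below PARENT_DIR, runs COMMAND, then
# packs the (possibly changed) tree again and removes the unpacked copy.
# Assumes the archive holds one top-level directory named "portage".
# Note: plain sh has no local variables, so these names are global.
with_unpacked_tree() {
    archive=$1
    parent=$2
    shift 2

    tar -xjf "$archive" -C "$parent" || return 1

    "$@"            # e.g. emerge -u gcc
    status=$?

    # Write to a temporary name first so a failed tar never clobbers the
    # only good copy of the archive.
    tar -cjf "$archive.new" -C "$parent" portage &&
        mv "$archive.new" "$archive" &&
        rm -rf "$parent/portage"

    return "$status"
}

# Example (hypothetical paths):
#   with_unpacked_tree /var/portage.tar.bz2 /usr emerge -pv gcc
```

The wrapper returns the exit status of the wrapped command, so it can be used in scripts the same way as the command itself.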
Laptop:Sager 5670
CPU:3.06ghz w/HT enabled
RAM:512 megs PC2100
Video:ATI Radeon Mobility 9000
HD1:40 gig 5400 RPM 2.5" drive
HD2:20 gig 5400 RPM external 2.5" USB2 drive
OS of choice:Windows 3.1 (just kidding, it's gentoo)

Post by rauar » Sat Aug 16, 2003 7:52 pm

R0B_IX wrote:I just had two thoughts on this, not sure if any will interest you, but, couldn't you...

1. Compress the entire tree, and then make some scripts to uncompress the tree when running an emerge operation, and then recompress it afterwards? Assuming you had those 250 megs free, it should work just fine.
This would give back the space for normal work. However, emerge operations would still require the same amount of free space, because the tree has to be uncompressed first. This is actually the real problem on my box: I had to update my gcc, which apparently needs about 250 MB for extracting the distfile to /var/tmp and patching it. I'll see whether the build process itself requires a lot more space on top of that.
R0B_IX wrote:
2. You could put the tree on a CD-RW and then, similar to the previous script, have a script that "tricks" emerge into thinking that it is writing to the CD when doing an emerge sync, when you are actually redirecting the files somewhere else. Then you can update the CD-RW at your convenience with the latest additions/subtractions. A little more difficult than the first option, but with this option you would never need to have the 250 megs free after the initial burn (assuming the Portage tree isn't 100% redone, which I am sure would at least slightly annoy a bunch of people ;). Anyway, post back if you think of any more ideas.
This workaround is again a solution that just adds more space (HDD or CD, doesn't matter).

Again, as far as I can judge the implementation of the Portage tree, I'd say the tree itself uses too much space for the functionality it provides. I guess it could seriously be done without the waste of space.

I think there's real overhead due to the filesystem. I dunno how much space is really needed for a single file entry in the filesystem. But consider this:

Assume a file entry on disk needs 4 KB (I really have no idea whether this is the case) and there are 57000 files. That would lead to a space consumption of 228 MB. This is really scary, as it is in the range of the real size of the current Portage tree on my boxes. The missing 20-30 MB are probably the contents of the ebuild scripts themselves.
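
The estimate is easy to redo for other numbers. A minimal sketch, assuming every file occupies at least one whole filesystem block; the 4 KB figure is only the guess made above, and `min_footprint_mb` is a made-up helper name:

```sh
# min_footprint_mb FILES BLOCK_KB
# Lower bound on disk usage in MB (1 MB = 1000 KB here) if every file
# takes at least one whole block, no matter how small its contents are.
min_footprint_mb() {
    echo $(( $1 * $2 / 1000 ))
}

min_footprint_mb 57000 4   # prints 228
min_footprint_mb 57000 1   # prints 57
```

The second call shows how strongly the result depends on the assumed block size: with 1 KB blocks the same 57000 files would need at least 57 MB instead of 228 MB.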

Now consider this:

Code:

aphrodite root # find /usr/portage/ > structure.txt
aphrodite root # ls -alF
total 3216
drwx------   21 root     root         4096 Aug 16 21:13 ./
drwxr-xr-x   19 root     root         4096 Jul 20 18:53 ../
-rw-r--r--    1 root     root      2902064 Aug 16 21:13 structure.txt
aphrodite root #
The structural information of the Portage tree in a plain text file takes nearly 3 MB. Nice and small.

Still missing: the ebuild scripts, Manifest files, ChangeLog files, digest files, patch files... all summed up with find:

Code:

-rw-r--r--    1 root     root       218197 Aug 16 21:19 Changelogs.txt
-rw-r--r--    1 root     root       212925 Aug 16 21:19 Manifest.txt
-rw-r--r--    1 root     root       634942 Aug 16 21:19 digest.txt
-rw-r--r--    1 root     root       572587 Aug 16 21:16 ebuilds.txt
-rw-r--r--    1 root     root        92642 Aug 16 21:18 patches.txt
-rw-r--r--    1 root     root      2902064 Aug 16 21:13 structure.txt
aphrodite root #

Perhaps I forgot some file types, but the real sum of the above stuff is not more than 5 MB, including the structural layout.

Now compare 250 MB against 5 MB. If this could really be implemented in a more efficient manner, the gain would be a factor of 50!


Cheerz Al

Post by Genone » Sat Aug 16, 2003 8:13 pm

You only measured the size of the filenames, not the file contents :wink:
You also forgot some important stuff like eclasses and profiles. The complete tree really uses about 200 MB. One way to save space is to export the tree on one box over NFS and mount it remotely on the other boxes.
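
A rough sketch of such an NFS setup. The host name `bigbox` and the 192.168.0.0/24 network are made up for illustration, and the tree is exported read-only on the assumption that only the server ever runs emerge sync. Anything the clients need to write (downloaded distfiles, for example) would then have to live outside the shared tree.

```sh
# On the server (hypothetical host "bigbox"): add this line to /etc/exports,
# then run `exportfs -ra` to apply it.
#   /usr/portage 192.168.0.0/24(ro,sync)

# On each small client: mount the shared tree over the local /usr/portage.
mount -t nfs -o ro bigbox:/usr/portage /usr/portage
```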

Post by rauar » Sat Aug 16, 2003 10:30 pm

Genone wrote:You only measured the size of the filenames, not the file contents :wink:
You also forgot some important stuff like eclasses and profiles.
No, I did include them.

du -h shows for

/usr/portage/metadata : 41MB
/usr/portage/licenses: 2.5MB
/usr/portage/eclass: 412kB
/usr/portage/profiles: 848kB
/usr/portage/sec-policy: 208kB
/usr/portage/*: 105kB

OK, metadata is comparatively big. But the sum of all directories is still <50 MB. I've considered ALL files. And I still argue that the real data in all these files is well below 50 MB.

I've truncated every file in the Portage tree to length 1 for testing purposes. The directory size of the tree is still 227 MB. So please tell me, where has the space gone? I think it has gone into the directory structure and file entries...

I really can't imagine that Portage holds THAT much (250 MB) data in its tree. Seriously.
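
One way to put numbers on this without truncating anything is to compare the bytes stored in the files with the space the filesystem has allocated for them. A minimal sketch using only `find`, `cat`, `wc`, `du` and `awk`; the two function names are made up for this example:

```sh
# content_bytes DIR
# Total size of all regular-file contents below DIR, in bytes. This is
# what the data itself would need if there were no per-file overhead.
content_bytes() {
    find "$1" -type f -exec cat {} + | wc -c | awk '{ print $1 }'
}

# allocated_kib DIR
# Space the filesystem has actually allocated for DIR, in KiB, including
# directories and the rounding of every file up to whole blocks.
allocated_kib() {
    du -sk "$1" | awk '{ print $1 }'
}

# Example:
#   content_bytes /usr/portage
#   allocated_kib /usr/portage
```

If the second number is several times the first, the difference is filesystem overhead rather than data. Downloaded sources stored below the same directory are counted as well, so they should be moved out of the way before measuring.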

an idea

Post by TheJabberwokk » Sun Aug 17, 2003 12:09 am

rsync just makes a copy of the ebuild files from the Gentoo mirror. Simply mount the mirror via FTP at /usr/portage instead; that takes care of your space problem.

Re: an idea

Post by rauar » Sun Aug 17, 2003 11:46 am

TheJabberwokk wrote:rsync just makes a copy of the ebuild files from the Gentoo mirror. Simply mount the mirror via FTP at /usr/portage instead; that takes care of your space problem.
OK, and what about the metadata inside the Portage tree? I doubt that the maintainers of the Gentoo mirrors want me to write my own stuff into the public Gentoo mirrors :D

Cheerz Al

Re: Minimalistic Portage Tree for low-diskspace machines

Post by slais-sysweb » Sun Aug 17, 2003 9:39 pm

rauar wrote:Hey ppl,

As far as I know, the Portage tree is always replicated completely on each Gentoo box; on my box it takes up at least 250 MB.
You can reduce the size of the Portage tree considerably by setting up an /etc/portage/RSYNC_EXCLUDES file and enabling it in /etc/make.conf.
The difficult part is listing all the things to exclude. For a server some things can obviously be omitted, such as games and xfree, but the structure keeps changing, so it is difficult to keep an exclude file up to date. What I would find useful is a commented-out default file for RSYNC_EXCLUDES that would enable me to build a system with the bare minimum Portage tree at stage 1. (An rsync_INcludes file might be easier.) Running emerge with a minimalist Portage tree will of course throw up errors of the "no package found" type, but it is easy enough to get into the habit of running emerge -pv first, editing the exclude list and then doing emerge sync if required.
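
For illustration, such a setup might look like the sketch below. It assumes a Portage version that reads `RSYNC_EXCLUDEFROM` from /etc/make.conf (later versions pass the same list through `PORTAGE_RSYNC_EXTRA_OPTS="--exclude-from=..."` instead), so check `man make.conf` first. The file path and the category patterns are only examples for a headless server and need adjusting to the category names in the actual tree.

```sh
# /etc/make.conf: point Portage at the exclude list.
# (Variable name assumed for older Portage versions; verify on your system.)
RSYNC_EXCLUDEFROM=/etc/portage/rsync_excludes

# /etc/portage/rsync_excludes: one rsync pattern per line, for example
#   x11-*/
#   games-*/
```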
--
djc
sysweb SLAIS UCL

Post by Genone » Tue Aug 19, 2003 3:26 am

rauar, my apologies, you were right. I just ran

Code:

find /usr/portage -type f -print0 | xargs -0 cat > /tmp/portage-tree.content
(after moving distfiles and packages out of /usr/portage)
to see how much space all the files really use; it was below 50 MB. Later I noticed that emerge sync also gives you that information. So it seems that there really are over 150 MB wasted on filesystem structure information in the Portage tree.
Not much of an issue for most people, but if you calculate that there are about 200,000 copies of the tree, you end up with about 30 terabytes of wasted disk space 8O And I guess that number will increase with every new ebuild.

Post by christsong84 » Tue Aug 19, 2003 3:43 am

make some kind of share (NFS or SAMBA...your choice) on a computer with enough space and mount it under /usr/portage and another temporary share under /usr/tmp/portage for the build files...it will slow the build process down a bit I think but it technically should work...

just an untested theory...it's really simple...just a thought :D
while(true) {self.input(sugar);} :twisted:

Post by rauar » Tue Aug 19, 2003 11:45 am

Genone wrote: Not much of an issue for most people, but if you calculate that there are about 200,000 copies of the tree, you end up with about 30 terabytes of wasted disk space 8O And I guess that number will increase with every new ebuild.
You're right. This won't affect most people. But Gentoo aims at flexible customization and is therefore useful for running Linux on slow boxes, and the current Portage implementation doesn't really fit into that concept.

In the last Gentoo newsletter another Portage thread was mentioned (porting Portage to C++ and so on...). Looks like there are more people who are unhappy with the current implementation :)

BTW: 900 MB of disk space is a lot, at least for a Linux installation without X and with pure router functionality. About 600 MB are occupied by installed (and needed) packages. Sad but true: Portage takes 80% of the free disk space. I deleted app-*, x11-* and net-* from the Portage tree just to get my gcc compiled (on a P120 with 96 MB :Q ).

Cheerz Al

Post by eikketk » Tue Aug 19, 2003 1:06 pm

I didn't check whether it's true that /usr/portage takes 250 MB, but I think I can trust the author?
If so, this really is a problem. I'm running Gentoo on a very old box with 1.2 GB of HD space, and sometimes, e.g. during a gcc compile, I run low on disk space. I haven't got X or anything installed...

Maybe a solution:
Couldn't it be possible to create a sort of DB (or maybe XML) which is downloaded on every rsync, instead of the ebuild files, eclasses and whatever? In this DB, all dependencies for every available package are stored, along with their architecture (the "~x86" stuff). This file could be created using a script on the CVS dir or something, I guess. If a user does emerge rsync, only this DB file is downloaded.
Now if a user does emerge prog, that prog is looked up in the DB, all dependencies are checked, a list of all the packages to emerge is built, and only then are all required ebuild, digest, ... files downloaded. Then a normal emerge starts.

After this there are two possibilities: either the downloaded files get deleted, or you keep them for later use and give the user an option (like emerge --cleanfiles) to remove them.
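
To make the lookup step a bit more concrete, here is a minimal sketch of resolving dependencies from such a downloaded index. The index format (one package per line, followed by its direct dependencies) and the function name `resolve_deps` are invented for this example; real ebuild dependencies also involve versions and USE flags, which this ignores.

```sh
# resolve_deps INDEX PACKAGE
# Prints PACKAGE and everything it depends on, one per line, with
# dependencies listed before the packages that need them.
# Hypothetical index format, one line per package:
#   category/package dep1 dep2 ...
resolve_deps() {
    awk -v root="$2" '
        # Remember the direct dependencies of every package in the index.
        { deps[$1] = ""; for (i = 2; i <= NF; i++) deps[$1] = deps[$1] " " $i }

        # Depth-first walk; "seen" keeps cycles and repeats from looping.
        function visit(pkg,    n, list, i) {
            if (pkg in seen) return
            seen[pkg] = 1
            n = split(deps[pkg], list, " ")
            for (i = 1; i <= n; i++) visit(list[i])
            print pkg
        }

        END { visit(root) }
    ' "$1"
}

# Example (hypothetical index file):
#   resolve_deps portage.index sys-devel/gcc
```

The output is the list of packages whose ebuild and digest files would then have to be fetched before a normal emerge could start.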

Any comments? Or is this just not usable?

Greetz, Ikke

Post by rauar » Tue Aug 19, 2003 2:19 pm

eikketk wrote:I didn't check whether it's true that /usr/portage takes 250 MB, but I think I can trust the author?
If so, this really is a problem. I'm running Gentoo on a very old box with 1.2 GB of HD space, and sometimes, e.g. during a gcc compile, I run low on disk space. I haven't got X or anything installed...
This was exactly the situation on my box and the reason for creating this thread.
eikketk wrote:
Maybe a solution:
Couldn't it be possible to create a sort of DB (or maybe XML) which is downloaded on every rsync, instead of the ebuild files, eclasses and whatever? In this DB, all dependencies for every available package are stored, along with their architecture (the "~x86" stuff). This file could be created using a script on the CVS dir or something, I guess. If a user does emerge rsync, only this DB file is downloaded.
Now if a user does emerge prog, that prog is looked up in the DB, all dependencies are checked, a list of all the packages to emerge is built, and only then are all required ebuild, digest, ... files downloaded. Then a normal emerge starts.
OK. So this is somewhat similar to what I suggested earlier.

I think two optimizations to Portage are possible:

1) Avoid local mirrors of the Portage tree made with rsync. Instead of a one-time complete rsync, use a public/global Portage tree for dependency resolution and ebuild download. Local information like installed packages and so on stays on the local box (of course). --> saves space and time (a lot)

2) Use a different data format for the complete tree. A filesystem as "database" is pretty inefficient (space overhead?).

Cheerz Al

Post by eikketk » Tue Aug 19, 2003 9:06 pm

Just text, so not space-consuming. Although I thought that in a DB one can store loads of info in very little space (in KB, I mean)?

Post by ebrostig » Thu Aug 21, 2003 6:17 pm

I have an easy solution for your problem...

It's going to cost you a whopping $20 or so...

Buy a used HD, like a 5 or 10GB. I'm sure you can get one almost for free.

That you don't have enough free disk space is not a Portage problem as I see it.

Erik
'Yes, Firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'

Re: an idea

Post by TheJabberwokk » Thu Aug 21, 2003 8:24 pm

rauar wrote:
TheJabberwokk wrote:rsync just makes a copy of the ebuild files from the Gentoo mirror. Simply mount the mirror via FTP at /usr/portage instead; that takes care of your space problem.
OK, and what about the metadata inside the Portage tree? I doubt that the maintainers of the Gentoo mirrors want me to write my own stuff into the public Gentoo mirrors :D

Cheerz Al
details!

Post by tecknojunky » Fri Aug 22, 2003 7:39 am

Like many said, NFS-mounting /usr/portage, /usr/local/portage (optional, if you have custom builds) and /var/tmp/portage from a box with high-capacity storage should resolve this. You'll need to sync only once and it will be synced for all.
(7 of 9) Installing star-trek/species-8.4.7.2::talax.

Post by eikketk » Mon Aug 25, 2003 8:16 pm

Mmm. Right. What if I tell you I can't have such a machine? I've got a PayPal account if you want to :lol:

Post by BradB » Mon Aug 25, 2003 9:13 pm

I had the brilliant thought that the easy solution would be to have /usr/portage mounted on a compressed filesystem volume. But when I went looking for such a filesystem, I could really only find read-only ones. Does anybody else know of read/write compressed filesystems for Linux?

Brad
Microsoft - bringing the pain right into your home since 1982