Gentoo Forums
New compressed filesystems based on FUSE (portage.space-- )
1bitmemory
n00b

Joined: 08 Jun 2004
Posts: 17

Posted: Fri Nov 11, 2005 6:09 pm
Post subject: New compressed filesystems based on FUSE (portage.space-- )

The FUSE project (http://fuse.sf.net) has spawned a few interesting filesystems for those of us who are short of disk space.

The new breed of Linux filesystems lacks support for file compression. The excuse is that disk space is cheap. But for those of us with aging systems it is always nice to squeeze a little more space out of our disks anyway.

Three FUSE-based compressing filesystems exist:

- lzofs from Neuron
- fusecompress from Milan Svoboda
- compFUSEd from Johan Parent


Why not use this to store your Portage tree, for example? It's not like I need all of the 120K+ files (roughly 400MB) every single day... So I did (and still do), and it saves me half the space needed to store the entire Portage tree 8)



1bm

krani1
Tux's lil' helper

Joined: 21 Jun 2004
Posts: 76

Posted: Fri Nov 11, 2005 9:22 pm

Thanks! Which one do you think is the best?

bonbons
Apprentice

Joined: 04 Sep 2004
Posts: 250

Posted: Fri Nov 11, 2005 9:30 pm

With squashfs, portage drops to about 10% of its original size (~30MB for the image). The only disadvantage is that it's read-only, so you need a system with more space for the sync and image generation. Metadata in /var/cache uses about 90MB (uncompressed), so in total you have about 120MB of data instead of 500MB, which is still a very good win.

1bitmemory
n00b

Joined: 08 Jun 2004
Posts: 17

Posted: Sat Nov 12, 2005 11:19 am

@krani1

I only tried compFUSEd myself. It offers several compression formats and even has an example of how to put the Gentoo Portage tree on it!

But Neuron's lzofs has been tried by others in this forum (search for "compressing filesystem").

I do not know the third one.

@bonbons

All three support read-write operations. But as you said, with squashfs you can probably squeeze out much more, since compression is not done on a per-file basis.
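
The per-file point can be checked with a rough experiment. The sketch below is only an illustration, not how squashfs or any of the FUSE filesystems work internally: gzip is a stand-in compressor, and the 50 ebuild-like files (names, contents, count) are made up for the example. It compares gzipping every file on its own with gzipping one tar stream of all of them.

```shell
#!/bin/sh
# Compare per-file compression with compressing everything as one stream.
# gzip is only a stand-in for whatever compressor a real filesystem uses.

# Create $2 small, similar text files in directory $1 (made-up content).
make_small_files() {
    dir=$1; count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'DESCRIPTION="Example package %s"\nHOMEPAGE="http://example.org/"\nSRC_URI="http://example.org/pkg-%s.tar.gz"\nLICENSE="GPL-2"\nSLOT="0"\nKEYWORDS="x86"\n' \
            "$i" "$i" > "$dir/pkg-$i.ebuild"
        i=$((i + 1))
    done
}

# Sum of the sizes obtained by gzipping each file in $1 on its own.
per_file_bytes() {
    total=0
    for f in "$1"/*; do
        size=$(gzip -c "$f" | wc -c)
        total=$((total + size))
    done
    echo "$total"
}

# Size obtained by gzipping a single tar stream of the whole directory $1.
solid_bytes() {
    size=$(tar -cf - -C "$1" . | gzip -c | wc -c)
    echo $((size))
}

workdir=$(mktemp -d)
make_small_files "$workdir" 50
echo "per-file: $(per_file_bytes "$workdir") bytes"
echo "solid:    $(solid_bytes "$workdir") bytes"
rm -rf "$workdir"
```

With many small, similar files the single stream should come out clearly smaller, because the compressor can reuse what it has already seen in earlier files and pays its fixed overhead only once.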


1bm

1bitmemory
n00b

Joined: 08 Jun 2004
Posts: 17

Posted: Sat Nov 12, 2005 8:40 pm

Tested compFUSEd by putting the portage tree on it. It all worked fine except that the size gain is limited (see below).

Code:

localhost ~ # tar -cf - /usr/portage --exclude=distfiles | wc -c
tar: Removing leading `/' from member names
501934080
localhost ~ # tar -cf - /usr/portage_cf --exclude=distfiles | wc -c                                                   
tar: Removing leading `/' from member names
465059840


That's a difference of some 35MB on a total of 500MB. The reason seems to be that most of the portage files are too small to compress well. It is more worthwhile to put the kernel sources on such a filesystem. So my initial posting is not entirely correct :(
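
For reference, this is how the two byte counts above work out. It is plain shell arithmetic on the numbers printed by wc -c; the function name is made up for the example.

```shell
#!/bin/sh
# Report the absolute and relative saving between two byte counts.
saving() {
    before=$1; after=$2
    saved=$((before - after))
    # Integer arithmetic only: MiB rounded down, percentage computed in tenths.
    mib=$((saved / 1048576))
    tenths=$((saved * 1000 / before))
    echo "$saved bytes (~$mib MiB, $((tenths / 10)).$((tenths % 10))%)"
}

saving 501934080 465059840
# prints: 36874240 bytes (~35 MiB, 7.3%)
```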

adsmith
Veteran

Joined: 26 Sep 2004
Posts: 1386
Location: NC, USA

Posted: Sat Nov 12, 2005 8:40 pm

Since the portage tree is essentially write-once (per week, anyway), I have started putting it in squashfs. It works quite well, and the whole tree is only 24M. One problem is that you need the full 500M during the sync update, but once the sync is over, you're back to just 24M again. It can also be shared over the network using network block devices.

If anyone is interested, I can provide info and some simple scripts.
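
A rough sketch of what such a setup could look like. This is not adsmith's script: the paths, the variable names and the helper functions are assumptions made for illustration. It only prints the commands unless DRY_RUN=0 is set, since the real steps need root and squashfs support in the kernel.

```shell
#!/bin/sh
# Sketch: pack an uncompressed Portage tree into a squashfs image and
# mount the image read-only. All paths below are assumed, not from the thread.
SRC=${SRC:-/var/tmp/portage-full}      # uncompressed tree (used for syncing)
IMAGE=${IMAGE:-/var/tmp/portage.sqfs}  # resulting squashfs image
MNT=${MNT:-/usr/portage}               # where the image gets mounted
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_image() {
    # -noappend overwrites an existing image; -e leaves distfiles out of it.
    run mksquashfs "$SRC" "$IMAGE" -noappend -e distfiles
}

mount_image() {
    # Loop-mount the image read-only at the usual Portage location.
    run mount -t squashfs -o loop,ro "$IMAGE" "$MNT"
}

build_image
mount_image
```

After the image is mounted, the uncompressed copy in SRC can be removed until the next sync, which is where the temporary 500M requirement comes from.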

makomk
n00b

Joined: 15 Jul 2005
Posts: 46
Location: Not all there

Posted: Sat Nov 12, 2005 9:04 pm

adsmith wrote:
Since the portage tree is essentially write-once (per week, anyway), I have started putting it in squashfs. It works quite well, and the whole tree is only 24M. One problem is that you need the full 500M during the sync update, but once the sync is over, you're back to just 24M again. It can also be shared over the network using network block devices.


In theory, I don't see any reason why a squashfs containing the portage tree couldn't be generated directly from a portage snapshot tarball, without extracting it to a filesystem first. (emerge-delta-webrsync would probably be needed to avoid downloading the whole thing each time, though.) Of course, it's not something I'm volunteering to implement...

adsmith
Veteran

Joined: 26 Sep 2004
Posts: 1386
Location: NC, USA

Posted: Sat Nov 12, 2005 9:06 pm

In theory, yes, you're right. However, as far as I can tell, mksquashfs only deals with live filesystems.

I wonder which is less cruel to the Gentoo servers: rsyncing once a week, or grabbing a tarball (or the squashfs image itself?) once a week? The compressed images are big files, but rsync takes a lot of CPU/mem.

hmm... an xdelta of the squashfs image itself would be tasty.

krani1
Tux's lil' helper

Joined: 21 Jun 2004
Posts: 76

Posted: Sat Nov 12, 2005 11:22 pm

adsmith wrote:
...
If anyone is interested, I can provide info and some simple scripts.


go on!! :D :D maybe not in this post, but make a simple tutorial for us :) 8)

makomk
n00b

Joined: 15 Jul 2005
Posts: 46
Location: Not all there

Posted: Sat Nov 12, 2005 11:25 pm

adsmith wrote:
In theory, yes, you're right. However, as far as I can tell, mksquashfs only deals with live filesystems.

I wonder which is less cruel to the Gentoo servers: rsyncing once a week, or grabbing a tarball (or the squashfs image itself?) once a week? The compressed images are big files, but rsync takes a lot of CPU/mem.

hmm... an xdelta of the squashfs image itself would be tasty.


I know mksquashfs would need modifying - hence why I'm not volunteering to implement it :). But yes, you're right - there's nothing to stop someone (Gentoo or an individual) distributing .squashfs images of portage. Now why didn't I think of that...

And as for the least cruel option on the servers, it's probably daily deltas between the tarballs/squashfs files (bdelta might be better than xdelta - I think there are some issues with xdelta on 64-bit systems). IIRC, the deltas emerge-delta-webrsync uses consume even less bandwidth than rsync, but they're created with a tarball-specific program, diffball, so YMMV.

Edit: link to information on patch size
Edit 2: I have just hacked up mksquashfs to use AVFS to access files. It *seems* to work, generating a ~30MB file directly from the uncompressed Portage tarball (just under 200MB!) in under 4 minutes. Now I just have to check that the output is correct and tidy up the build process. Downloading a .squashfs directly would still probably be better, though...


Last edited by makomk on Sun Nov 13, 2005 12:56 am; edited 2 times in total

adsmith
Veteran

Joined: 26 Sep 2004
Posts: 1386
Location: NC, USA

Posted: Sat Nov 12, 2005 11:29 pm

Okay, I'll put up a little tip/howto stub in the tips/tricks section tonight. Once it's up, I'll put the link here.
https://forums.gentoo.org/viewtopic-p-2873232.html#2873232

For a short while, if people want to play with it, I am willing to distribute daily squash images, since I have access to lots of bandwidth through my university. Then, if interest grows beyond the 3 or 4 of us, we can think about ways to do diffs efficiently.

makomk
n00b

Joined: 15 Jul 2005
Posts: 46
Location: Not all there

Posted: Sun Nov 13, 2005 2:01 am

Open-source software rocks. I now have a version of mksquashfs that accepts tarballs (and in theory certain other archive types) and it seems to work too. It was very easy really - all I had to do was modify mksquashfs to use AVFS for its I/O. Instructions for building are something like:

Code:
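# Unpack the squashfs 2.1-r2 sources
tar xzf squashfs2.1-r2.tar.gz
cd squashfs2.1-r2
# Apply the AVFS patch (the .diff.gz must be in this directory)
zcat squashfs2.1-r2-avfs.diff.gz | patch -p1
cd squashfs-tools
# Unpack AVFS inside squashfs-tools (the tarball must be here), then build
tar xzf avfs-0.9.6.tar.gz
make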


The makefile's a bit of a hack, but it does the job. You can get squashfs from the usual place, and AVFS here - now all I have to do is find some way of distributing the patch (3k compressed). Note that I advise you only use *uncompressed* tarballs, if you want to get the job done in a reasonable time.

bonbons
Apprentice

Joined: 04 Sep 2004
Posts: 250

Posted: Sun Nov 13, 2005 4:05 pm

1bitmemory wrote:
All three support read-write operations. But as you said, with squashfs you can probably squeeze out much more, since compression is not done on a per-file basis.

Does any one of those write compressed data into a block device (or a single file)?

Otherwise much space is lost because files in the underlying filesystem are block-aligned (this applies to any compressed fs that just offers transparent compression/decompression of files stored in a native fs).
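
To put a number on that slack: a file occupies whole blocks in the underlying filesystem, so compressing a file that already fits in one block gains nothing on disk. A small sketch of the arithmetic, assuming a 4096-byte block size (common, but check your own filesystem) and ignoring features such as tail packing that some filesystems offer. The file sizes are invented for the example.

```shell
#!/bin/sh
# Bytes actually allocated for a file of a given size, rounded up to whole blocks.
# BLOCK_SIZE is an assumption; 4096 is a common default.
BLOCK_SIZE=${BLOCK_SIZE:-4096}

allocated() {
    size=$1
    echo $(( (size + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE ))
}

# A 3000-byte ebuild compressed to 1200 bytes still occupies one full block.
echo "small file, uncompressed: $(allocated 3000) bytes on disk"
echo "small file, compressed:   $(allocated 1200) bytes on disk"
# A 20000-byte file compressed to 7000 bytes goes from 5 blocks to 2.
echo "large file, uncompressed: $(allocated 20000) bytes on disk"
echo "large file, compressed:   $(allocated 7000) bytes on disk"
```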

adsmith
Veteran

Joined: 26 Sep 2004
Posts: 1386
Location: NC, USA

Posted: Sun Nov 13, 2005 4:10 pm

squashfs does, but (almost by definition) any FUSE filesystem will simply go per-file.

neuron
Advocate

Joined: 28 May 2002
Posts: 2371

Posted: Sun Nov 13, 2005 4:33 pm

Compressing a full block device AND having it mounted read/write is damn near impossible, and it will fragment incredibly easily.

It's simple enough to do with encryption, because you can seek and write in blocks you know the size of. It gets ridiculously more complicated when the block sizes change.
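
The changing block sizes are easy to see by compressing equal-sized blocks: the output size depends on the content, so a rewritten block may no longer fit where the old one was stored. A quick illustration, using gzip as a stand-in compressor and a 4096-byte block (both are assumptions for the example):

```shell
#!/bin/sh
# Show that equal-sized input blocks compress to different sizes.
BLOCK=4096

# Compressed size of one block read from the given file or device.
compressed_size() {
    size=$(head -c "$BLOCK" "$1" | gzip -c | wc -c)
    echo $((size))
}

echo "zeros:  $(compressed_size /dev/zero) bytes"
echo "random: $(compressed_size /dev/urandom) bytes"
```

The block of zeros shrinks to a few dozen bytes, while the random block does not shrink at all, so a compressed block device would have to relocate data whenever a block's contents change.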


Last edited by neuron on Sun Nov 13, 2005 4:47 pm; edited 1 time in total

adsmith
Veteran

Joined: 26 Sep 2004
Posts: 1386
Location: NC, USA

Posted: Sun Nov 13, 2005 4:35 pm

I presume you meant read-write?

neuron
Advocate

Joined: 28 May 2002
Posts: 2371

Posted: Sun Nov 13, 2005 4:47 pm

adsmith wrote:
I presume you meant read-write?


mhm :p, edited.

neuron
Advocate

Joined: 28 May 2002
Posts: 2371

Posted: Sat Nov 19, 2005 2:49 am

Another one to add to the list: http://www.opensolaris.org/os/community/zfs/

Also, a reiser4 lzo plugin is in the works.

(And fusecompress is making good progress, but version 0.5.1 currently has some bugs that make it ignore compression altogether ;))
All times are GMT