Gentoo Forums
tar, bzip2 multicore goodness
kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 5675
Location: Vienna, Austria; Germany; hello world :)

Posted: Sat Aug 02, 2008 8:05 pm    Post subject: tar, bzip2 multicore goodness

Hi everyone,

the basics for using tar are provided here:

http://www.shell-fu.org/lister.php?tag=tar

Knowing these basics, one can combine p7zip and tar into the following command:

Code:
time (nice -n 19 tar -cpf - -X /root/stage4.excl / | 7z a -si -tbzip2 /bak/system/stage4-amd64_Final-11-030808.tbz2)


(This should create a stage4 tarball in bzip2 format using multiple CPU cores and 7z's default bzip2 settings; for your convenience it also shows how long that took.)
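
The contents of /root/stage4.excl are not shown in this post. As a rough sketch, assuming GNU tar and a hypothetical pattern list (a pseudo-filesystem plus the backup target itself), an exclude file could be tried out on a throwaway directory tree before pointing it at /:

```shell
#!/bin/sh
# Sketch only: the patterns below are assumptions, not the poster's actual list.
# Build a tiny fake root so the exclude file can be tested without touching /.
root=$(mktemp -d)
mkdir -p "$root/etc" "$root/proc" "$root/bak/system"
echo "hostname=demo" > "$root/etc/conf"
echo "volatile"      > "$root/proc/stat"
echo "old backup"    > "$root/bak/system/old.tbz2"

# One pattern per line, as read by tar's -X (--exclude-from) option.
excl="$root.excl"
cat > "$excl" <<'EOF'
./proc/*
./bak/*
EOF

# List what tar would archive; -X comes before the path it should apply to.
listing=$(tar -cf - -C "$root" -X "$excl" . | tar -tf -)
printf '%s\n' "$listing"

# Remove the temporary tree and exclude file again.
rm -rf "$root" "$excl"
```

The directories ./proc/ and ./bak/ themselves stay in the listing, so the mount points exist again after a restore, while their contents are skipped.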

the command for extraction would be:

Code:
7z e -so -tbzip2 /bak/system/stage4-amd64_Final-11-030808.tbz2 | tar -xpf - -C /test/
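
Before extracting a full system image it can be worth listing the archive first. A small sketch, assuming bash; the function name list_tarball and the option to pass a different decompressor are illustrative additions, not part of the commands above:

```shell
# list_tarball ARCHIVE [DECOMPRESSOR ARGS...]
# Streams ARCHIVE through a decompressor and lists the tar members without
# extracting anything. With no decompressor given, it falls back to the 7z
# invocation used in the extraction command above.
list_tarball() {
    local archive=$1
    shift
    if [ $# -eq 0 ]; then
        set -- 7z e -so -tbzip2
    fi
    # The decompressor writes the tar stream to stdout; tar -t only lists it.
    "$@" "$archive" | tar -tf -
}
```

For example, "list_tarball /bak/system/stage4-amd64_Final-11-030808.tbz2 | less" pages through the member names, and "list_tarball some.tbz2 bzip2 -dc" would do the same with plain bzip2 as the decompressor.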


If anything above is incorrect, please post.

I'm currently testing these commands and will update this thread with what I find.

update1:
The syntax of the sample commands should now be correct.
_________________
Unofficial minimal livecd x86/amd64 w/reiser4+truecrypt (by Neo2)
2.6.37.2_plus_v1: BFS, CFS,THP,compaction, zcache or TOI
Hardcore Linux user since 2004 :D


Last edited by kernelOfTruth on Sun Aug 03, 2008 8:23 pm; edited 1 time in total

prizident
n00b

Joined: 06 Dec 2006
Posts: 42

Posted: Sun Aug 03, 2008 12:54 am

There is also a tool called pbzip2, which can also use multiple cores.

kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 5675
Location: Vienna, Austria; Germany; hello world :)

Posted: Sun Aug 03, 2008 7:01 am

yes, the problem with that seems to be:

Quote:
Decompressing non-pbzip2 Created Archives

pbzip2 can only decompress archives in parallel that have been compressed with pbzip2. For example, extracting linux-2.6.23.8.tar.bz2 as found on kernel.org with pbzip2 takes roughly twice as long on a dual core system when compared against bzip2.


http://gentoo-wiki.com/HOWTO_Speed_up_decompression_with_pbzip2

kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 5675
Location: Vienna, Austria; Germany; hello world :)

Posted: Sun Aug 03, 2008 9:27 pm

Here is the output from creating my first stage4 tarball on multiple cores :P

Quote:
time (nice -20 tar -cp / -X /root/stage4.excl | 7z a -si -tbzip2 /bak/system/stage4-amd64_Final-11-030808.tbz2)
tar: Removing leading `/' from member names

7-Zip 4.58 beta Copyright (c) 1999-2008 Igor Pavlov 2008-05-05
p7zip Version 4.58 (locale=en_US.utf8,Utf16=on,HugeFiles=on,2 CPUs)
Creating archive /bak/system/stage4-amd64_Final-11-030808.tbz2

...

Everything is Ok

real 51m2.510s
user 76m58.764s
sys 2m22.778s


(That is 51 minutes of wall-clock time instead of 80 or more.)
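
The timing also shows how well both cores were used: CPU time (user + sys) divided by wall-clock time (real) gives the average number of busy cores. A quick check with awk, using only the three figures from the output above:

```shell
#!/bin/sh
# Figures copied from the `time` output in the quote, converted to seconds.
busy=$(awk 'BEGIN {
    real = 51*60 + 2.510    # 51m2.510s
    user = 76*60 + 58.764   # 76m58.764s
    sys  =  2*60 + 22.778   # 2m22.778s
    # Average number of busy cores = CPU time / wall-clock time.
    printf "%.2f", (user + sys) / real
}')
echo "average busy cores: $busy"
```

On these numbers the result is roughly 1.55 of the 2 CPUs, which fits the drop from 80 or more minutes to 51.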

fangorn
Veteran

Joined: 31 Jul 2004
Posts: 1886

Posted: Mon Aug 04, 2008 8:00 am

This is working great. Thank you.

For convenience I packed this into two scripts tbz2 and utbz2. If someone is interested, here they are.

Code:
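#!/bin/bash
# tbz2: pack the given sources into a bzip2-compressed tarball using 7z.

if [ $# -le 1 ] ; then
   echo "Usage: $0 <archive_file> source1 [source2 [...]]" >&2
   exit 1
fi

dest=$1
shift

# Quoting keeps paths with spaces intact; "-f -" sends the tar stream to stdout.
nice -n 19 tar -cpf - "$@" | nice -n 19 7z a -si -tbzip2 "$dest"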


Code:
#!/bin/bash
# utbz2: unpack a bzip2-compressed tarball with 7z, optionally into a target directory.

dest=()
if [ $# -lt 1 ] || [ ! -f "$1" ] ; then
   echo "Usage: $0 <archive_file> [destination_directory]" >&2
   exit 1
fi

if [ -n "$2" ] ; then
   if [ ! -d "$2" ] ; then
      echo "Directory $2 does not exist. Do you want to create it (y/n)"
      read -r a
      if [ "$a" = "y" ] || [ "$a" = "Y" ] ; then
         mkdir -p "$2" || exit 1
      else
         exit 1
      fi
   fi
   # An array keeps "-C" and the directory as two separate, safely quoted words.
   dest=(-C "$2")
fi

7z e -so -tbzip2 "$1" | tar -xpf - "${dest[@]}"

_________________
Video Encoding scripts collection | Project page

Zucca
Apprentice

Joined: 14 Jun 2007
Posts: 201
Location: Helsinki, Finland

Posted: Thu Aug 14, 2008 3:29 pm

This might make the compression even more effective:
Code:
time (nice -n 19 tar -cpf - -X /root/stage4.excl / | 7z a -si -tbzip2 -md=32m -mx=9 -mpass=10 -mmt=5 /bak/system/stage4.tbz2)

I haven't tested it much.
It is slower, though: on a test archive the time went from 7 min to 12 min.
_________________
Threading support for your bash scripts.

shentino
n00b

Joined: 21 Nov 2009
Posts: 52

Posted: Mon Aug 27, 2012 6:54 pm    Post subject: what if

What if each bzip2 block were forked into its own thread for decompression, and then all the thawed blocks were simply reassembled in the correct order?
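
The reassembly step can be sketched in the shell for the simpler case where every chunk was compressed as its own complete stream. This is an illustration under that assumption, not how any particular tool works: real bzip2 blocks are not byte-aligned inside a single stream, so locating them takes more work than shown here. The function name parallel_decompress is made up for the example, and bash is assumed.

```shell
# parallel_decompress TOOL OUTPUT CHUNK...
# Decompresses each CHUNK with "TOOL -dc" in its own background job, then
# concatenates the results in the order the chunks were given.
# Note: failures of individual jobs are not detected in this sketch.
parallel_decompress() {
    local tool=$1 out=$2
    shift 2
    [ $# -gt 0 ] || return 1
    local tmp i=0 chunk
    tmp=$(mktemp -d) || return 1
    for chunk in "$@"; do
        # The zero-padded index makes the glob below sort in input order.
        "$tool" -dc "$chunk" > "$tmp/$(printf '%06d' "$i")" &
        i=$((i + 1))
    done
    wait    # block until every background job has finished
    cat "$tmp"/* > "$out"
    rm -rf "$tmp"
}
```

A call such as "parallel_decompress bzip2 restored.tar part.aa.bz2 part.ab.bz2" would rebuild restored.tar from two separately compressed pieces.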

mv
Advocate

Joined: 20 Apr 2005
Posts: 4004

Posted: Mon Aug 27, 2012 7:19 pm

GNU tar has the option --use-compress-program. So you could just write a script which calls "exec 7z" with appropriate parameters and use that option. I can imagine (depending on the implementation in GNU tar which I did not check) that this could be slightly faster than using the shell for piping.
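
GNU tar treats the program given to --use-compress-program as a filter: it must read stdin and write stdout, and tar calls it with -d when extracting. A minimal sketch, assuming GNU tar and bash; the function names are illustrative, and any filter that honors -d (bzip2, gzip and similar) should fit, whereas 7z would need the wrapper script described above because its command line differs:

```shell
# pack_with PROGRAM ARCHIVE PATH...
# Creates ARCHIVE from the given paths, letting tar pipe its output through PROGRAM.
pack_with() {
    local prog=$1 archive=$2
    shift 2
    tar --use-compress-program="$prog" -cpf "$archive" "$@"
}

# unpack_with PROGRAM ARCHIVE DIRECTORY
# Extracts ARCHIVE into DIRECTORY; tar runs PROGRAM with -d to decompress.
unpack_with() {
    local prog=$1 archive=$2 dest=$3
    tar --use-compress-program="$prog" -xpf "$archive" -C "$dest"
}
```

For example, "pack_with bzip2 home.tbz2 /home" followed later by "unpack_with bzip2 home.tbz2 /test" keeps the whole pipeline inside tar.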

mattst88
Developer

Joined: 28 Oct 2004
Posts: 362

Posted: Sat Sep 15, 2012 6:06 am

prizident wrote:
there is also a tool pbzip2, which also can handle multiple cores


Please use lbzip2 instead.

kernelOfTruth wrote:
yes, the problem with that seems to be:

Quote:
Decompressing non-pbzip2 Created Archives

pbzip2 can only decompress archives in parallel that have been compressed with pbzip2. For example, extracting linux-2.6.23.8.tar.bz2 as found on kernel.org with pbzip2 takes roughly twice as long on a dual core system when compared against bzip2.


http://gentoo-wiki.com/HOWTO_Speed_up_decompression_with_pbzip2


lbzip2 does not have this limitation. Use it instead.
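
Since not every system (or rescue CD) ships lbzip2, a backup script could fall back to whatever is installed. A small sketch, assuming bash; the function name pick_bzip2 and the preference order are only a suggestion:

```shell
# pick_bzip2
# Prints the first available bzip2-compatible compressor, preferring the
# parallel implementations mentioned in this thread. Returns 1 if none exists.
pick_bzip2() {
    local tool
    for tool in lbzip2 pbzip2 bzip2; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}
```

It could then stand in for 7z in the earlier pipeline, for example: tar -cpf - -X /root/stage4.excl / | "$(pick_bzip2)" -c > /bak/system/stage4.tbz2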
_________________
My 1U Dual 833 MHz Alphaserver DS20L
The AlphaLinux.org Wiki

Ant P.
Advocate

Joined: 18 Apr 2009
Posts: 2272
Location: UK

Posted: Sat Sep 15, 2012 5:07 pm

Has anyone tried app-arch/lrzip on these stage4 files? It usually manages about 1 GB/min at maximum settings for me.

kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 5675
Location: Vienna, Austria; Germany; hello world :)

Posted: Tue Sep 18, 2012 5:29 pm

mattst88 wrote:
prizident wrote:
there is also a tool pbzip2, which also can handle multiple cores


Please use lbzip2 instead.


lbzip2 does not have this limitation. Use it instead.


awesome - thanks ! :)

now I only need the liveCD creators to include it ^^

For my PC that should be no problem, since I use an alternative emergency system, but on my laptop there is not enough space on the hard drive for one ...

some more info on the *zip compressors:

gziptest.sh part 2: multi-threaded compression benchmarks

John R. Graham
Administrator

Joined: 08 Mar 2005
Posts: 7666
Location: Somewhere over Atlanta, Georgia

Posted: Thu Oct 18, 2012 3:52 pm

I just delivered a .tar.bz, packaged with pbzip2, to a Far East factory partner, and they could not unpack it with WinRAR. It appears I had installed pbzip2 some time after my previous delivery to the factory. Another strike against pbzip2.
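
A likely explanation (an assumption, not something confirmed in this post) is that pbzip2 writes its output as several independent bzip2 streams concatenated together, and some decoders stop after the first stream. A rough heuristic for spotting such files before shipping them, assuming a grep that supports -a and -o, is to count stream headers that are directly followed by the block magic; the function name is made up for the example:

```shell
# count_bz2_streams FILE
# Counts occurrences of a stream header ("BZh" plus a level digit) directly
# followed by the block magic "1AY&SY". A count above 1 suggests
# concatenated streams. Heuristic only: compressed data could contain the
# same ten bytes by chance, although that is very unlikely.
count_bz2_streams() {
    LC_ALL=C grep -a -o 'BZh[1-9]1AY&SY' "$1" | wc -l | tr -d ' '
}
```

If the count is greater than 1, recompressing with plain bzip2 before delivery should give a single-stream file that older unpackers can read.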

- John
_________________
This space intentionally left blank.