tar, bzip2 multicore goodness

Unofficial documentation for various parts of Gentoo Linux. Note: This is not a support forum.

Post by kernelOfTruth » Sat Aug 02, 2008 8:05 pm

Hi everyone,

The basics of using tar are covered here:

http://www.shell-fu.org/lister.php?tag=tar

Knowing these basics, one can combine p7zip and tar into the following command:

Code: Select all

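time (nice -n 19 tar -cpf - -X /root/stage4.excl / | 7z a -si -tbzip2 /bak/system/stage4-amd64_Final-11-030808.tbz2)
(This should create a stage4 tarball in bzip2 format using multiple CPU cores; for your convenience, time also reports how long it took. The -f - makes tar write to stdout explicitly instead of relying on its built-in default, and nice -n 19 is the unambiguous spelling of nice -20, i.e. lowest priority. 7z uses its default compression level unless -mx is given.)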
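
For readers wondering what /root/stage4.excl might contain: the thread does not show it, so the following is only a sketch under my own assumptions. The helper name write_stage4_excludes and every pattern in the list are hypothetical. The one entry that follows directly from the command above is /bak/*, since the archive is written there and should not end up inside itself.

```shell
# Hypothetical helper: write a starting-point exclude list for tar -X.
# All patterns are assumptions for illustration; review them before use.
write_stage4_excludes() {
  local file=$1
  cat > "$file" <<'EOF'
/proc/*
/sys/*
/tmp/*
/mnt/*
/bak/*
/usr/portage/distfiles/*
EOF
}
```

It could be called as write_stage4_excludes /root/stage4.excl; check the resulting file against your own layout before starting the backup.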

The command for extraction would be:

Code: Select all

7z e -so -tbzip2 /bak/system/stage4-amd64_Final-11-030808.tbz2 | tar -xpf - -C /test/
If anything above is incorrect, please post.

I'm currently testing these commands and will update this thread with what I find.

Update 1:
The syntax of the sample commands should now be correct.
Last edited by kernelOfTruth on Sun Aug 03, 2008 8:23 pm, edited 1 time in total.
https://github.com/kernelOfTruth/ZFS-fo ... scCD-4.9.0
https://github.com/kernelOfTruth/pulsea ... zer-ladspa

Hardcore Gentoo Linux user since 2004 :D

Post by prizident » Sun Aug 03, 2008 12:54 am

There is also the tool pbzip2, which can likewise use multiple cores.

Post by kernelOfTruth » Sun Aug 03, 2008 7:01 am

Yes, but pbzip2 seems to have the following problem (quoting the wiki page linked below):
Decompressing non-pbzip2 Created Archives

pbzip2 can only decompress archives in parallel that have been compressed with pbzip2. For example, extracting linux-2.6.23.8.tar.bz2 as found on kernel.org with pbzip2 takes roughly twice as long on a dual core system when compared against bzip2.
http://gentoo-wiki.com/HOWTO_Speed_up_d ... ith_pbzip2

Post by kernelOfTruth » Sun Aug 03, 2008 9:27 pm

Here is the output from my first stage4 tarball created on multiple cores :P
time (nice -20 tar -cp / -X /root/stage4.excl | 7z a -si -tbzip2 /bak/system/stage4-amd64_Final-11-030808.tbz2)
tar: Removing leading `/' from member names

7-Zip 4.58 beta Copyright (c) 1999-2008 Igor Pavlov 2008-05-05
p7zip Version 4.58 (locale=en_US.utf8,Utf16=on,HugeFiles=on,2 CPUs)
Creating archive /bak/system/stage4-amd64_Final-11-030808.tbz2

...

Everything is Ok

real 51m2.510s
user 76m58.764s
sys 2m22.778s
(That is 51 minutes instead of 80 minutes or more.)
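
The three time values above can be turned into a rough measure of how well both cores were used. A small sketch, assuming awk is available; the function names to_seconds and cpu_ratio are made up for this example:

```shell
# Convert a bash `time` value such as 51m2.510s into seconds.
to_seconds() {
  awk -v t="$1" 'BEGIN {
    split(t, p, "m")          # p[1] = minutes, p[2] = seconds with trailing "s"
    sub(/s$/, "", p[2])
    printf "%.3f\n", p[1] * 60 + p[2]
  }'
}

# Average number of busy cores: (user + sys) / real.
cpu_ratio() {
  awk -v r="$(to_seconds "$1")" \
      -v u="$(to_seconds "$2")" \
      -v s="$(to_seconds "$3")" \
      'BEGIN { printf "%.2f\n", (u + s) / r }'
}

cpu_ratio 51m2.510s 76m58.764s 2m22.778s   # prints 1.55
```

So roughly one and a half of the two cores were busy on average. User plus sys adds up to about 79 minutes of CPU time, which fits the 80-minute figure if all the work had run on one core.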

Post by fangorn » Mon Aug 04, 2008 8:00 am

This is working great. Thank you.

For convenience I packed this into two scripts, tbz2 (create) and utbz2 (extract). If anyone is interested, here they are.

Code: Select all

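#!/bin/bash
# tbz2: pack the given sources into a bzip2 tarball using 7z.

if [ $# -le 1 ] ; then
   echo "Usage: $0 <archive_file> source1 [source2 [...]]" >&2
   exit 1
fi

dest=$1
shift

# Quote the arguments so paths containing spaces survive; -f - sends the tar stream to stdout.
nice -n 19 tar -cpf - "$@" | nice -n 19 7z a -si -tbzip2 "$dest"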

Code: Select all

#!/bin/bash
# utbz2: unpack a bzip2 tarball using 7z, optionally into a given directory.

# Extra tar arguments are kept in an array so a directory name with spaces stays one word.
dest=()
if [ $# -lt 1 ] || [ ! -f "$1" ] ; then
   echo "Usage: $0 <archive_file> [destination_directory]" >&2
   exit 1
fi

if [ -n "$2" ] ; then
   if [ ! -d "$2" ] ; then
      echo "Directory $2 does not exist. Do you want to create it (y/n)"
      read -r a
      if [ "$a" = "y" ] || [ "$a" = "Y" ] ; then
         mkdir -p "$2" || exit 1
      else
         exit 1
      fi
   fi
   dest=(-C "$2")
fi

7z e -so -tbzip2 "$1" | tar -xpf - "${dest[@]}"

Post by Zucca » Thu Aug 14, 2008 3:29 pm

This might make compressing even more effective:

Code: Select all

time (nice -n 19 tar -cpf - -X /root/stage4.excl / | 7z a -si -tbzip2 -md=32m -mx=9 -mpass=10 -mmt=5 /bak/system/stage4.tbz2)
I haven't tested it much.
It is slower, though: in my test, a sample archive went from 7 minutes to 12 minutes.

Post by shentino » Mon Aug 27, 2012 6:54 pm

What if each bzip2 block were forked into its own thread for decompression, and then all the thawed blocks were simply reassembled in the correct order?
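
The same idea can be tried out on the compression side with nothing but standard tools, because bzip2 accepts several compressed streams concatenated into one file. The sketch below is not how any of the tools in this thread are implemented; it swaps in a much cruder technique (split the input, compress the pieces in background jobs, concatenate them in order) just to show that independent pieces can be reassembled. The name parallel_bzip2 and the default chunk size are my own assumptions.

```shell
# usage: parallel_bzip2 <input_file> <output.bz2> [chunk_size]
# Sketch only: starts one background job per chunk with no limit on the
# number of jobs, and does not handle an empty input file.
parallel_bzip2() {
  local in=$1 out=$2 size=${3:-10M} tmp c
  tmp=$(mktemp -d) || return 1
  split -b "$size" "$in" "$tmp/chunk." || return 1
  for c in "$tmp"/chunk.*; do
    bzip2 -9 "$c" &            # each chunk becomes its own bzip2 stream
  done
  wait                          # block until every chunk is compressed
  cat "$tmp"/chunk.*.bz2 > "$out"   # the glob sorts chunks back into order
  rm -rf "$tmp"
}
```

The result can be checked with bzip2 -dc output.bz2 | cmp - input_file. Decompressing an arbitrary .bz2 in parallel is harder, since the block boundaries first have to be found inside a single stream.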

Post by mv » Mon Aug 27, 2012 7:19 pm

GNU tar has the option --use-compress-program. So you could just write a script which calls "exec 7z" with appropriate parameters and use that option. I can imagine (depending on the implementation in GNU tar which I did not check) that this could be slightly faster than using the shell for piping.
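
A minimal sketch of that option, assuming GNU tar. Instead of a 7z wrapper script it passes the compressor name directly, since I have not verified which 7z switches would be needed for a pure stdin-to-stdout filter. The function names tar_create_with and tar_extract_with are hypothetical. The program handed to tar has to read stdin, write stdout, and accept -d for decompression.

```shell
# Create an archive through an external compressor started by tar itself.
# usage: tar_create_with <program> <archive> <tar args/paths...>
tar_create_with() {
  local prog=$1 archive=$2
  shift 2
  tar --use-compress-program="$prog" -cpf "$archive" "$@"
}

# Extract an archive; tar calls the same program with -d added.
# usage: tar_extract_with <program> <archive> [destination_directory]
tar_extract_with() {
  local prog=$1 archive=$2 dest=${3:-.}
  tar --use-compress-program="$prog" -xpf "$archive" -C "$dest"
}
```

With lbzip2 installed, this could be used as tar_create_with lbzip2 /bak/system/stage4.tbz2 -X /root/stage4.excl / and tar_extract_with lbzip2 /bak/system/stage4.tbz2 /test.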

Post by mattst88 » Sat Sep 15, 2012 6:06 am

prizident wrote:there is also a tool pbzip2, which also can handle multiple cores
Please use lbzip2 instead.
kernelOfTruth wrote:yes, the problem with that seems to be:
Decompressing non-pbzip2 Created Archives

pbzip2 can only decompress archives in parallel that have been compressed with pbzip2. For example, extracting linux-2.6.23.8.tar.bz2 as found on kernel.org with pbzip2 takes roughly twice as long on a dual core system when compared against bzip2.
http://gentoo-wiki.com/HOWTO_Speed_up_d ... ith_pbzip2
lbzip2 does not have this limitation. Use it instead.
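
For scripts that have to run on machines where lbzip2 may or may not be installed, a small fallback chain can help. This is only a sketch: the function names pick_bzip2 and pack_tbz2 are made up, and the preference order (lbzip2, then pbzip2, then plain bzip2) simply follows the advice in this thread. It relies on all three tools compressing stdin to stdout when called with -c.

```shell
# Print the first bzip2-compatible compressor found in PATH.
pick_bzip2() {
  local tool
  for tool in lbzip2 pbzip2 bzip2; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      return 0
    fi
  done
  echo "no bzip2-compatible compressor found" >&2
  return 1
}

# Pack the given paths into <archive> with whichever tool was found.
# usage: pack_tbz2 <archive> <path> [path...]
pack_tbz2() {
  local archive=$1 tool
  shift
  tool=$(pick_bzip2) || return 1
  tar -cpf - "$@" | "$tool" -c > "$archive"
}
```

Extraction works the same way in reverse, e.g. lbzip2 -dc archive.tbz2 | tar -xpf - -C /test/.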

Post by Ant P. » Sat Sep 15, 2012 5:07 pm

Has anyone tried app-arch/lrzip on these stage4 files? It usually manages about 1 GB/min at maximum settings for me.

Post by kernelOfTruth » Tue Sep 18, 2012 5:29 pm

mattst88 wrote:
Please use lbzip2 instead.
[...]
lbzip2 does not have this limitation. Use it instead.
Awesome, thanks! :)

Now I only need the LiveCD creators to include it ^^

For my PC it should be no problem, since I use an alternative emergency system, but on my laptop there is not enough space on the hard drive for that ...

Some more info on the *zip compressors:

gziptest.sh part 2: multi-threaded compression benchmarks

Post by John R. Graham » Thu Oct 18, 2012 3:52 pm

Just delivered a .tar.bz packaged with pbzip2 to a far east factory partner that they could not unpack with WinRAR. It appears I had installed pbzip2 since I had last delivered anything to the factory. Another strike against pbzip2.

- John
© 2001–2026 Gentoo Authors
Gentoo is a trademark of the Gentoo Foundation, Inc. and of Förderverein Gentoo e.V.
The contents of this document, unless otherwise expressly stated, are licensed under the CC-BY-SA-4.0 license.