audiodef Watchman
Joined: 06 Jul 2005 Posts: 6640 Location: The soundosphere
|
Posted: Mon Jul 01, 2013 5:35 pm Post subject: Stage 4 backup increases in size every month |
|
|
I make a stage 4 backup of my server every month, and every month the tarball increases in size significantly despite the fact that I'm not adding anything to the system - just doing regular Portage updates. I want to find out where this size increase is coming from and if necessary put that info in my .excl file.
Last month, the tarball was 2.2G. This month it's 2.5G.
Here's my script:
Code: |
#!/bin/bash
tar cvjf /home/audiodef/Portage/stage4.tar.bz2 / -X /home/audiodef/Portage/stage4.excl
|
Here's stage4.excl:
Code: |
.bash_history
/mnt/*
/tmp/*
/proc/*
/sys/*
/dev/*
/etc/mtab
/etc/ssh/ssh_host_*
/usr/src/*
/usr/portage/*
/data/radio/main/*
/var/www/gentoostudio/htdocs/src/stage4.tar.bz2
/home/audiodef/stage4.tar.bz2
|
Note that the web sites I run (/var/www) should not account for any noticeable increase, as I'm just not adding anywhere near enough to any one of the sites I run in a given month to account for that. The same goes for website data stored in MySQL databases - nothing gets added from month to month that should account for this kind of increase in the size of the tarball.
Where else in my system could there be files and directories that grow in size over time? _________________ decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN |
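One detail in the script above may be worth checking: the archive is written to /home/audiodef/Portage/stage4.tar.bz2, while the exclude list names /home/audiodef/stage4.tar.bz2, so any older tarballs kept under /home/audiodef/Portage would not be covered by that entry. The sketch below is one possible arrangement, not the poster's script. It assumes GNU tar, places the exclusion options before the path (to my knowledge, newer GNU tar releases treat them as positional), and excludes the output file explicitly. The function name make_stage4 and its three arguments are illustrative.

```shell
#!/bin/bash
# Sketch only: make_stage4 and its arguments are hypothetical names.
# Usage: make_stage4 ROOT ARCHIVE EXCLUDE_FILE
make_stage4() {
    local root=$1 archive=$2 exclude_file=$3

    # -c  create a fresh archive (never append to an old one)
    # -a  let GNU tar choose the compressor from the archive suffix
    # -X  read exclusion patterns from a file, one per line
    # The exclusion options come before the path they apply to, and the
    # archive is excluded by name so it is never backed up into itself.
    tar -c -a -f "$archive" \
        -X "$exclude_file" \
        --exclude="$archive" \
        "$root"
}
```

For the setup in the first post this would be called as make_stage4 / /home/audiodef/Portage/stage4.tar.bz2 /home/audiodef/Portage/stage4.excl.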
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10590 Location: Somewhere over Atlanta, Georgia
|
Posted: Mon Jul 01, 2013 6:06 pm Post subject: |
|
|
Probably want to also exclude /var/tmp. Maybe you've got a lot of failed ebuilds in /var/tmp/portage.
- John _________________ I can confirm that I have received between 0 and 499 National Security Letters. |
|
|
|
The Doctor Moderator
Joined: 27 Jul 2010 Posts: 2678
|
Posted: Tue Jul 02, 2013 2:22 am Post subject: |
|
|
Another idea. Do you have anything like Spotify that caches large amounts of data? _________________ First things first, but not necessarily in that order.
Apologies if I take a while to respond. I'm currently working on the dematerialization circuit for my blue box. |
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10590 Location: Somewhere over Atlanta, Georgia
|
Posted: Tue Jul 02, 2013 2:39 pm Post subject: |
|
|
Well then, let me recommend something radical, like looking at the contents of the stage4, maybe even comparing it with the contents of a previous stage4.
- John |
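Looking inside a tarball does not require unpacking it. As a minimal sketch, assuming GNU tar (which detects the compression when reading) and its verbose listing layout, where the third column is the size in bytes, the largest members can be listed like this. The name largest_members is illustrative.

```shell
# Sketch: print the N largest members of a tar archive (default 20).
# Assumes GNU tar's -tv layout: perms owner/group size date time name.
largest_members() {
    local archive=$1 count=${2:-20}
    tar -tvf "$archive" | sort -k3,3nr | head -n "$count"
}
```

Calling largest_members /home/audiodef/Portage/stage4.tar.bz2 10 would show the ten biggest files; running it on two monthly tarballs gives a quick first comparison.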
|
|
|
defer- Tux's lil' helper
Joined: 11 Jun 2007 Posts: 140 Location: Finland
|
Posted: Tue Jul 02, 2013 3:41 pm Post subject: |
|
|
Maybe you should browse your filesystem with sys-fs/ncdu to find out which files or directories take up the space. _________________ https://github.com/defer- |
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10590 Location: Somewhere over Atlanta, Georgia
|
Posted: Tue Jul 02, 2013 5:43 pm Post subject: |
|
|
audiodef wrote: | Is there a way to diff two bzip files? |
Since what you probably want to diff is the sizes, if you have room to extract them, then you can summarize the contents by top level directory like so:
Code: |
cd the-directory-where-you-untarred-the stage4
find . -maxdepth 1 -mindepth 1 -type d | grep -Ev '\./(dev|mnt|tmp|sys|proc)' | xargs du -hs
|
Do that on each of the stage4 tarballs and you'll have a summary of what's grown and by how much, which will point you to the place to investigate further.
If you really want to diff the contents, rather than the aggregate sizes, you could always use diff:
Code: |
diff -ubB <(tar -tjvf path-to-stage4-number-1) <(tar -tjvf path-to-stage4-number-2) | less
|
Of course, you'll have to adjust the compression-related command line options to tar.
- John |
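If there is not enough room to extract both tarballs, a rough per-directory summary can also be computed from the listing alone. The following is a sketch, assuming GNU tar, whose verbose listing shows the size in bytes in the third column. tar_top_level_sizes is a made-up helper name, and a top-level directory whose name contains spaces would be cut short.

```shell
# Sketch: sum uncompressed member sizes per top-level directory.
tar_top_level_sizes() {
    local archive=$1
    tar -tvf "$archive" | awk '
        {
            name = $6                 # first word of the member name
            sub(/^\.?\//, "", name)   # drop a leading "./" or "/"
            if (name == "") next      # skip the archive root entry
            split(name, parts, "/")
            bytes[parts[1]] += $3     # column 3 is the size in bytes
        }
        END {
            for (dir in bytes) printf "%.0f\t%s\n", bytes[dir], dir
        }
    ' | sort -n
}
```

In bash, diff <(tar_top_level_sizes june.tar.bz2) <(tar_top_level_sizes july.tar.bz2) then shows which top-level directories changed; the file names here are placeholders. The sizes are uncompressed, so they will not add up to the size of the .bz2 file.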
|
|
|
defer- Tux's lil' helper
Joined: 11 Jun 2007 Posts: 140 Location: Finland
|
Posted: Tue Jul 02, 2013 9:42 pm Post subject: |
|
|
John R. Graham wrote: | If you really want to diff the contents, rather than the aggregate sizes, you could always use diff:
Code: |
diff -ubB <(tar -tjvf path-to-stage4-number-1) <(tar -tjvf path-to-stage4-number-2) | less
|
Of course, you'll have to adjust the compression-related command line options to tar. |
I have a feeling that this kind of piping will eat some memory. |
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10590 Location: Somewhere over Atlanta, Georgia
|
Posted: Wed Jul 03, 2013 11:41 am Post subject: |
|
|
Those commands only emit the directory listing of each tarball, not the actual files. I probably should have said that, or else said "table of contents", to be clearer.
- John |
|
|
|
ppurka Advocate
Joined: 26 Dec 2004 Posts: 3256
|
Posted: Wed Jul 03, 2013 2:13 pm Post subject: |
|
|
Are you adding new updates to the same tarball? If the libraries change their version numbers then I don't think tar will delete the old versioned files. _________________ emerge --quiet redefined | E17 vids: I, II | Now using kde5 | e is unstable :-/ |
|
|
|
ppurka Advocate
Joined: 26 Dec 2004 Posts: 3256
|
Posted: Wed Jul 03, 2013 3:43 pm Post subject: |
|
|
audiodef wrote: | ppurka wrote: | Are you adding new updates to the same tarball? If the libraries change their version numbers then I don't think tar will delete the old versioned files. |
Doesn't Portage remove outdated versions? |
Of course, portage removes outdated versions. But, depending on your tar options, you might still be keeping the old versions of your files if you are writing to the same tarred file. One such option (is it the only one?) is the -r option. For instance, here you can see that a.txt is not deleted:
Code: |
/tmp» mkdir a; touch a/{a,b}.txt
/tmp» tar -rf a.tar a
/tmp» tar -tf a.tar
a/
a/b.txt
a/a.txt
/tmp» touch a/c.txt; rm a/a.txt
/tmp» tar -rf a.tar a
/tmp» tar -tf a.tar
a/
a/b.txt
a/a.txt
a/
a/c.txt
a/b.txt |
|
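For contrast, the script in the first post uses c (create), which writes a new archive on every run instead of appending. Here is a small sketch of the same experiment with -c, assuming GNU tar and a scratch directory from mktemp; demo_create_mode is an illustrative name.

```shell
# Sketch: repeat the a.txt experiment with -c (create) instead of -r.
demo_create_mode() {
    local work
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        mkdir a
        touch a/a.txt a/b.txt
        tar -cf a.tar a           # first backup
        rm a/a.txt
        touch a/c.txt
        tar -cf a.tar a           # -c overwrites the old archive
        tar -tf a.tar | LC_ALL=C sort
    )
    rm -rf "$work"
}
```

The listing printed at the end contains only a/, a/b.txt and a/c.txt; the deleted a.txt does not linger in the archive.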
|
|
|
audiodef Watchman
Joined: 06 Jul 2005 Posts: 6640 Location: The soundosphere
|
Posted: Wed Aug 07, 2013 4:35 pm Post subject: |
|
|
John R. Graham wrote: | Since what you probably want to diff is the sizes, if you have room to extract them, then you can summarize the contents by top level directory like so:
Code: |
cd the-directory-where-you-untarred-the stage4
find . -maxdepth 1 -mindepth 1 -type d | grep -Ev '\./(dev|mnt|tmp|sys|proc)' | xargs du -hs
|
Do that on each of the stage4 tarballs and you'll have a summary of what's grown and by how much, which will point you to the place to investigate further. |
This helped. I see a difference of 0.6G in /usr between two tarballs made one update (a month) apart. The rest are either the same size or show minor increases. So I need to figure out what in /usr grows every time I update this tarball. |
|
|
|
666threesixes666 Veteran
Joined: 31 May 2011 Posts: 1248 Location: 42.68n 85.41w
|
Posted: Wed Aug 07, 2013 4:43 pm Post subject: |
|
|
1 guess, /usr/portage/distfiles.... link it to a tmpfs so upon reboot the tarballs are wiped away. |
|
|
|
GFCCAE6xF Apprentice
Joined: 06 Aug 2012 Posts: 295
|
Posted: Wed Aug 07, 2013 4:53 pm Post subject: |
|
|
666threesixes666 wrote: | 1 guess, /usr/portage/distfiles.... link it to a tmpfs so upon reboot the tarballs are wiped away. |
It's much better just to use 'eclean -d distfiles' so you keep only what you need there rather than constantly wiping everything and potentially having to keep re-downloading things for simple rebuilds. |
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10590 Location: Somewhere over Atlanta, Georgia
|
Posted: Wed Aug 07, 2013 5:20 pm Post subject: |
|
|
Or, perhaps even better, exclude /usr/portage/distfiles from your backup as Portage will automatically retrieve what it needs.
- John |
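The exclude list in the first post already contains /usr/portage/*, which would normally cover distfiles as well. Whether a pattern actually took effect can be checked against the finished tarball. A minimal sketch, assuming GNU tar (which stores member names without the leading slash); count_members_under is a hypothetical helper.

```shell
# Sketch: count archive members whose name starts with PREFIX.
# GNU tar strips the leading "/" when archiving absolute paths, so pass
# e.g. "usr/portage/distfiles/" rather than "/usr/portage/distfiles/".
count_members_under() {
    local archive=$1 prefix=$2
    tar -tf "$archive" |
        awk -v p="$prefix" 'index($0, p) == 1 { n++ } END { print n + 0 }'
}
```

A result of 0 for usr/portage/distfiles/ means nothing from that directory ended up in the backup.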
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10590 Location: Somewhere over Atlanta, Georgia
|
Posted: Wed Aug 07, 2013 5:24 pm Post subject: |
|
|
666threesixes666 wrote: | 1 guess, /usr/portage/distfiles.... link it to a tmpfs so upon reboot the tarballs are wiped away. | Bad netiquette. I agree with rorgoroth. The bandwidth that supports our favorite distro is largely donated. We should all work to minimize the cost of that donation.
(I realize that my last post and this one are somewhat mutually contradictory, but in audiodef's case, he'd only waste bandwidth in the case of a catastrophic event and not as a part of daily operations.)
- John |
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10590 Location: Somewhere over Atlanta, Georgia
|
Posted: Wed Aug 07, 2013 7:40 pm Post subject: |
|
|
Why ask us to keep guessing when you can just look?
The technique I outlined earlier can be used to dig deeper. Now that you know it's usr, then
Code: |
cd the-directory-where-you-untarred-the stage4/usr
find . -maxdepth 1 -mindepth 1 -type d | xargs du -hs
|
will show you the sizes of the directories within usr.
- John |
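The same drill-down can be wrapped in a small helper that copes with directory names containing spaces and sorts the result. This is a sketch, assuming GNU findutils and coreutils; dir_sizes is an illustrative name and sizes are reported in KiB.

```shell
# Sketch: size of each immediate subdirectory of DIR, smallest first.
dir_sizes() {
    local dir=${1:-.}
    find "$dir" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -r du -sk |
        sort -n
}
```

Running dir_sizes on usr in each extracted tarball and comparing the two outputs should point at the subdirectory that changed.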
|
|
|
audiodef Watchman
Joined: 06 Jul 2005 Posts: 6640 Location: The soundosphere
|
Posted: Fri Aug 09, 2013 3:16 pm Post subject: |
|
|
Thanks, John. I plan to do that next.
Yoshi, there's only one regular user and nothing in /home/(user), since I keep it that way so a nice default install is ready when the stage 4 is installed.
Now that I've paid more attention, I've noticed that from June to July to August, the tarball has decreased in size, which puzzles me as much as the increase did. I'll have to keep monitoring this and delve into /usr to see what's fluctuating so much. This isn't a system I actually use - just one I prepare to be used, and it lives on a hard drive that gets swapped in only for the purpose of updating the system and tarball. |
|
|
|
ryao Retired Dev
Joined: 27 Feb 2012 Posts: 132
|
Posted: Fri Aug 09, 2013 6:55 pm Post subject: Re: Stage 4 backup increases in size every month |
|
|
audiodef wrote: | Where else in my system could there be files and directories that grow in size over time? |
/var/log perhaps?
By the way, if you were using ZFS, you could just do snapshots and send/recv. |
|
|
|
|