Gentoo Forums
Stage 4 backup increases in size every month
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Mon Jul 01, 2013 5:35 pm    Post subject: Stage 4 backup increases in size every month Reply with quote

I make a stage 4 backup of my server every month, and every month the tarball increases in size significantly despite the fact that I'm not adding anything to the system - just doing regular Portage updates. I want to find out where this size increase is coming from and if necessary put that info in my .excl file.

Last month, the tarball was 2.2G. This month it's 2.5G.

Here's my script:

Code:

#!/bin/bash
tar cvjf /home/audiodef/Portage/stage4.tar.bz2 / -X /home/audiodef/Portage/stage4.excl


Here's stage4.excl:
Code:

.bash_history
/mnt/*
/tmp/*
/proc/*
/sys/*
/dev/*
/etc/mtab
/etc/ssh/ssh_host_*
/usr/src/*
/usr/portage/*
/data/radio/main/*
/var/www/gentoostudio/htdocs/src/stage4.tar.bz2
/home/audiodef/stage4.tar.bz2


Note that the web sites I run (/var/www) should not account for any noticeable increase, as I'm just not adding anywhere near enough to any one of the sites I run in a given month to account for that. The same goes for website data stored in MySQL databases - nothing gets added from month to month that should account for this kind of increase in the size of the tarball.

Where else in my system could there be files and directories that grow in size over time?
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
John R. Graham
Administrator
Administrator


Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

PostPosted: Mon Jul 01, 2013 6:06 pm    Post subject: Reply with quote

Probably want to also exclude /var/tmp. Maybe you've got a lot of failed ebuilds in /var/tmp/portage.

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
The Doctor
Moderator
Moderator


Joined: 27 Jul 2010
Posts: 2678

PostPosted: Tue Jul 02, 2013 2:22 am    Post subject: Reply with quote

Another idea. Do you have anything like spotify that caches large amounts of data?
_________________
First things first, but not necessarily in that order.

Apologies if I take a while to respond. I'm currently working on the dematerialization circuit for my blue box.
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Tue Jul 02, 2013 2:36 pm    Post subject: Reply with quote

No spotify or anything like that. The server has one user account and even that pretty much only exists to allow ssh access. I'll see if excluding /var/tmp helps.

EDIT: du -h /var/tmp only shows 92M, not enough to account for this kind of increase. I'm excluding it anyway, but there's got to be something else somewhere.
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
John R. Graham
Administrator
Administrator


Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

PostPosted: Tue Jul 02, 2013 2:39 pm    Post subject: Reply with quote

Well then, let me recommend something radical, like looking at the contents of the stage4, maybe even comparing it with the contents of a previous stage4. :P

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Tue Jul 02, 2013 3:21 pm    Post subject: Reply with quote

D'oh! :P

Is there a way to diff two bzip files?
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
defer-
Tux's lil' helper
Tux's lil' helper


Joined: 11 Jun 2007
Posts: 140
Location: Finland

PostPosted: Tue Jul 02, 2013 3:41 pm    Post subject: Reply with quote

Maybe you should browse your filesystem with sys-fs/ncdu to find out which files or directories take up the space.
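
If ncdu isn't installed, plain du gives a rougher version of the same view. A minimal sketch, assuming a helper named largest_dirs (a made-up name, not part of any package):

```shell
# largest_dirs DIR [COUNT]: list the COUNT largest directories under DIR,
# sizes in KiB, biggest first (default 20 lines).
# -x keeps du on DIR's own filesystem, so /proc, /sys and other mounts
# are skipped when DIR is /.
largest_dirs() {
    du -xk "$1" 2>/dev/null | sort -rn | head -n "${2:-20}"
}
```

For example, `largest_dirs /usr 15` would list the 15 biggest directories under /usr. The first line is always the total for the starting directory itself.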
_________________
https://github.com/defer-
John R. Graham
Administrator
Administrator


Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

PostPosted: Tue Jul 02, 2013 5:43 pm    Post subject: Reply with quote

audiodef wrote:
Is there a way to diff two bzip files?
Since what you probably want to diff is the sizes, if you have room to extract them, then you can summarize the contents by top level directory like so:
Code:
cd the-directory-where-you-untarred-the-stage4
find . -maxdepth 1 -mindepth 1 -type d | grep -Ev '\./(dev|mnt|tmp|sys|proc)' | xargs du -hs
Do that on each of the stage4 tarballs and you'll have a summary of what's grown and by how much, which will point you to the place to investigate further.

If you really want to diff the contents, rather than the aggregate sizes, you could always use diff :wink: :
Code:
diff -ubB <(tar -tjvf path-to-stage4-number-1) <(tar -tjvf path-to-stage4-number-2) | less
Of course, you'll have to adjust the compression-related command line options to tar.
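
If there isn't room to extract the tarballs, roughly the same per-directory summary can be pulled from the verbose listing alone. This is only a sketch: tar_top_sizes is a made-up helper name, and it assumes GNU tar, whose listing puts the size in the third column and the member name in the sixth, and which detects the compression by itself when reading an archive file.

```shell
# tar_top_sizes ARCHIVE: sum the uncompressed member sizes (bytes) per
# top-level directory, reading only the archive's listing.
# Assumes GNU tar's "perms owner/group size date time name" layout.
tar_top_sizes() {
    tar -tvf "$1" | awk '
        {
            name = $6
            sub(/^\.\//, "", name)      # drop a leading "./"
            sub(/^\//, "", name)        # drop a leading "/"
            if (name == "") next        # skip the "./" entry itself
            split(name, parts, "/")
            total[parts[1]] += $3
        }
        END {
            for (dir in total) printf "%.0f\t%s\n", total[dir], dir
        }' | sort -k2
}
```

Run it on each stage4 and compare the two outputs. The numbers are uncompressed sizes, so they show where the growth is rather than exactly how much it adds to the .bz2.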

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
defer-
Tux's lil' helper
Tux's lil' helper


Joined: 11 Jun 2007
Posts: 140
Location: Finland

PostPosted: Tue Jul 02, 2013 9:42 pm    Post subject: Reply with quote

John R. Graham wrote:
If you really want to diff the contents, rather than the aggregate sizes, you could always use diff :wink: :
Code:
diff -ubB <(tar -tjvf path-to-stage4-number-1) <(tar -tjvf path-to-stage4-number-2) | less
Of course, you'll have to adjust the compression-related command line options to tar.

I have this feeling that this kind of piping will eat some memory :D
_________________
https://github.com/defer-
John R. Graham
Administrator
Administrator


Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

PostPosted: Wed Jul 03, 2013 11:41 am    Post subject: Reply with quote

Those commands just emit the directory listing of each tarball, not the actual files. I probably should have said that, or called it a table of contents, to be clearer. :wink:
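
For a listing-only comparison that ignores sizes and timestamps, the member names alone can be compared. A rough sketch, with tar_name_diff as a made-up helper name and temporary files used instead of process substitution so it also runs under plain sh:

```shell
# tar_name_diff OLD NEW: print member names found in only one archive.
# Names only in OLD start at column 1; names only in NEW are indented
# by one tab (this is how comm -3 lays out its output).
tar_name_diff() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    tar -tf "$1" | LC_ALL=C sort > "$old_list"
    tar -tf "$2" | LC_ALL=C sort > "$new_list"
    LC_ALL=C comm -3 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

Only the name lists are held on disk, so the memory and space cost stays small even for multi-gigabyte tarballs.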

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Wed Jul 03, 2013 2:04 pm    Post subject: Reply with quote

Cool. I'll check out these options. :)
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
ppurka
Advocate
Advocate


Joined: 26 Dec 2004
Posts: 3256

PostPosted: Wed Jul 03, 2013 2:13 pm    Post subject: Reply with quote

Are you adding new updates to the same tarball? If the libraries change their version numbers then I don't think tar will delete the old versioned files.
_________________
emerge --quiet redefined | E17 vids: I, II | Now using kde5 | e is unstable :-/
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Wed Jul 03, 2013 2:27 pm    Post subject: Reply with quote

ppurka wrote:
Are you adding new updates to the same tarball? If the libraries change their version numbers then I don't think tar will delete the old versioned files.


Doesn't Portage remove outdated versions?
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
ppurka
Advocate
Advocate


Joined: 26 Dec 2004
Posts: 3256

PostPosted: Wed Jul 03, 2013 3:43 pm    Post subject: Reply with quote

audiodef wrote:
ppurka wrote:
Are you adding new updates to the same tarball? If the libraries change their version numbers then I don't think tar will delete the old versioned files.


Doesn't Portage remove outdated versions?
Of course, portage removes outdated versions. But, depending on your tar options, you might still be keeping the old versions of your files if you are writing to the same tarred file. One such option (is it the only one?) is the -r option. For instance, here you can see that a.txt is not deleted:
Code:
/tmp» mkdir a; touch a/{a,b}.txt
/tmp» tar -rf a.tar a
/tmp» tar -tf a.tar
a/
a/b.txt
a/a.txt
/tmp» touch a/c.txt; rm a/a.txt
/tmp» tar -rf a.tar a
/tmp» tar -tf a.tar
a/
a/b.txt
a/a.txt
a/
a/c.txt
a/b.txt
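
A quick way to check whether an existing tarball has picked up repeated members like this is to look for names that appear more than once in its listing. A minimal sketch (tar_duplicate_members is a made-up helper name):

```shell
# tar_duplicate_members ARCHIVE: print every member name that occurs
# more than once in the archive, one name per line.
tar_duplicate_members() {
    tar -tf "$1" | sort | uniq -d
}
```

On the a.tar built above it should print a/ and a/b.txt. An archive written in one pass with `tar c` should print nothing.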

_________________
emerge --quiet redefined | E17 vids: I, II | Now using kde5 | e is unstable :-/
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Wed Aug 07, 2013 4:28 pm    Post subject: Reply with quote

Aha. No, completely new tar files each time.
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Wed Aug 07, 2013 4:35 pm    Post subject: Reply with quote

John R. Graham wrote:
Since what you probably want to diff is the sizes, if you have room to extract them, then you can summarize the contents by top level directory like so:
Code:
cd the-directory-where-you-untarred-the-stage4
find . -maxdepth 1 -mindepth 1 -type d | grep -Ev '\./(dev|mnt|tmp|sys|proc)' | xargs du -hs
Do that on each of the stage4 tarballs and you'll have a summary of what's grown and by how much, which will point you to the place to investigate further.



This helped. I see a difference of 0.6G in /usr between two tarballs made one update (a month) apart. The other directories are either the same size or show only minor increases. So I need to figure out what in /usr grows every time I update this tarball.
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
666threesixes666
Veteran
Veteran


Joined: 31 May 2011
Posts: 1248
Location: 42.68n 85.41w

PostPosted: Wed Aug 07, 2013 4:43 pm    Post subject: Reply with quote

1 guess, /usr/portage/distfiles.... link it to a tmpfs so upon reboot the tarballs are wiped away.
GFCCAE6xF
Apprentice
Apprentice


Joined: 06 Aug 2012
Posts: 295

PostPosted: Wed Aug 07, 2013 4:53 pm    Post subject: Reply with quote

666threesixes666 wrote:
1 guess, /usr/portage/distfiles.... link it to a tmpfs so upon reboot the tarballs are wiped away.


It's much better just to use 'eclean -d distfiles' so you keep only what you need there rather than constantly wipe everything and have to potentially keep re-downloading things for simple rebuilds.
John R. Graham
Administrator
Administrator


Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

PostPosted: Wed Aug 07, 2013 5:20 pm    Post subject: Reply with quote

Or, perhaps even better, exclude /usr/portage/distfiles from your backup as Portage will automatically retrieve what it needs.

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
John R. Graham
Administrator
Administrator


Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

PostPosted: Wed Aug 07, 2013 5:24 pm    Post subject: Reply with quote

666threesixes666 wrote:
1 guess, /usr/portage/distfiles.... link it to a tmpfs so upon reboot the tarballs are wiped away.
Bad netiquette. I agree with rorgoroth. The bandwidth that supports our favorite distro is largely donated. We should all work to minimize the cost of that donation. :wink:

(I realize that my last post and this one are somewhat mutually contradictory, but in audiodef's case, he'd only waste bandwidth in the case of a catastrophic event and not as a part of daily operations.)

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Wed Aug 07, 2013 5:43 pm    Post subject: Reply with quote

It's got to be something else, because /usr/portage/* has been excluded from my tarballs all along. /usr/src grows considerably when there's a kernel source update, but I've ruled that out, also (next tarball is still larger than the last when there is no kernel source update). Is there anything else that normally can be expected to grow somewhere in /usr after an update?
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
John R. Graham
Administrator
Administrator


Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

PostPosted: Wed Aug 07, 2013 7:40 pm    Post subject: Reply with quote

Why ask us to keep guessing when you can just look? :wink:

The technique I outlined earlier can be used to dig deeper. Now that you know it's usr, then
Code:
cd the-directory-where-you-untarred-the-stage4/usr
find . -maxdepth 1 -mindepth 1 -type d | xargs du -hs
will show you the sizes of the directories within usr.
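
Rather than comparing two du listings by eye, the two extracted trees can also be compared in one pass. The following is a sketch only: dir_size_delta is a made-up helper name, and it assumes both stage4 tarballs (or just their usr directories) have been extracted side by side.

```shell
# dir_size_delta OLD NEW: compare two extracted trees and print
# "<growth in KiB><TAB><directory>", largest growth first.
dir_size_delta() {
    tab=$(printf '\t')
    old_sizes=$(mktemp)
    new_sizes=$(mktemp)
    # du prints "size<TAB>path"; swap the columns so join can key on the path.
    (cd "$1" && du -k .) | awk -F'\t' '{ print $2 "\t" $1 }' |
        LC_ALL=C sort -t "$tab" -k1,1 > "$old_sizes"
    (cd "$2" && du -k .) | awk -F'\t' '{ print $2 "\t" $1 }' |
        LC_ALL=C sort -t "$tab" -k1,1 > "$new_sizes"
    # -a 1 -a 2 keeps directories that exist on only one side;
    # -e 0 fills in a size of 0 for the side where they are missing.
    LC_ALL=C join -t "$tab" -a 1 -a 2 -e 0 -o 0,1.2,2.2 "$old_sizes" "$new_sizes" |
        awk -F'\t' '{ printf "%d\t%s\n", $3 - $2, $1 }' | sort -rn
    rm -f "$old_sizes" "$new_sizes"
}
```

Something like `dir_size_delta june/usr july/usr | head` (directory names are placeholders) would then put the fastest-growing directories at the top. Parent directories show up too, since their totals include their children.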

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
yoshi314
l33t
l33t


Joined: 30 Dec 2004
Posts: 850
Location: PL

PostPosted: Thu Aug 08, 2013 7:04 pm    Post subject: Reply with quote

one more idea - what about user's home files?
_________________
~amd64
shrink your /usr/portage with squashfs+aufs
audiodef
Watchman
Watchman


Joined: 06 Jul 2005
Posts: 6639
Location: The soundosphere

PostPosted: Fri Aug 09, 2013 3:16 pm    Post subject: Reply with quote

Thanks, John. I plan to do that next.

Yoshi, there's only one regular user and nothing in /home/(user). I keep it empty so that a nice default install is ready when the stage 4 is installed.

Now that I've paid more attention, I've noticed that from June to July to August, the tarball has decreased in size, which puzzles me as much as the increase did. I'll have to keep monitoring this and delve into /usr to see what's fluctuating so much. This isn't a system I actually use - just one I prepare to be used, and it lives on a hard drive that gets swapped in only for the purpose of updating the system and tarball.
_________________
decibel Linux: https://decibellinux.org
Github: https://github.com/Gentoo-Music-and-Audio-Technology
Facebook: https://www.facebook.com/decibellinux
Discord: https://discord.gg/73XV24dNPN
ryao
Retired Dev
Retired Dev


Joined: 27 Feb 2012
Posts: 132

PostPosted: Fri Aug 09, 2013 6:55 pm    Post subject: Re: Stage 4 backup increases in size every month Reply with quote

audiodef wrote:
I make a stage 4 backup of my server every month, and every month the tarball increases in size significantly despite the fact that I'm not adding anything to the system - just doing regular Portage updates. I want to find out where this size increase is coming from and if necessary put that info in my .excl file.

Last month, the tarball was 2.2G. This month it's 2.5G.

Here's my script:

Code:

#!/bin/bash
tar cvjf /home/audiodef/Portage/stage4.tar.bz2 / -X /home/audiodef/Portage/stage4.excl


Here's stage4.excl:
Code:

.bash_history
/mnt/*
/tmp/*
/proc/*
/sys/*
/dev/*
/etc/mtab
/etc/ssh/ssh_host_*
/usr/src/*
/usr/portage/*
/data/radio/main/*
/var/www/gentoostudio/htdocs/src/stage4.tar.bz2
/home/audiodef/stage4.tar.bz2


Note that the web sites I run (/var/www) should not account for any noticeable increase, as I'm just not adding anywhere near enough to any one of the sites I run in a given month to account for that. The same goes for website data stored in MySQL databases - nothing gets added from month to month that should account for this kind of increase in the size of the tarball.

Where else in my system could there be files and dir that grow in size over time?


/var/log perhaps?

By the way, if you were using ZFS, you could just do snapshots and send/recv.
All times are GMT
Page 1 of 2