Gentoo Forums
Question about backup+gzip

 
disi
Veteran
Joined: 28 Nov 2003
Posts: 1354
Location: Out There ...

Posted: Thu Nov 29, 2012 6:31 pm    Post subject: Question about backup+gzip

I wonder if anyone has experience with this. Sorry if the question sounds stupid; I haven't made an attempt yet.
This is probably the most knowledgeable forum out there :P

The idea is to set up 'cheap' storage for virtual machines on ESXi as an offsite backup.
How much time and load would it take to pipe ~4TB through gzip and write it over USB 2.0 to an external drive once a week?

//edit: I am not worried about the snapshot backups; the question is whether the full backup is reasonable...
_________________
Gentoo on Uptime Project - Larry is a cow
notageek
Tux's lil' helper
Joined: 05 Jun 2008
Posts: 120
Location: Bangalore, India

Posted: Thu Nov 29, 2012 6:51 pm

19.41 hours.
disi
Veteran
Joined: 28 Nov 2003
Posts: 1354
Location: Out There ...

Posted: Thu Nov 29, 2012 6:59 pm

notageek wrote:
19.41 hours.


No! seriously? It is like the 72 virgins, the number is so weird, it must be true :cry:

//edit: that is an HP ProLiant DL360
notageek
Tux's lil' helper
Joined: 05 Jun 2008
Posts: 120
Location: Bangalore, India

Posted: Thu Nov 29, 2012 7:11 pm

I'm glad you asked. Since you didn't tell us what kind of data you have, I assumed you mostly have jpeg, mp3 or avi. In short, porn. Since these are already compressed formats, the compression you're likely to see even from gzip is minimal. So, I assumed it to be 4TB.

Now, USB 2.0 can transfer at 480 Mbit/s max. Ignoring the effects of the filesystem and other hardware, and taking a leap of faith in assuming you'll actually get 480 Mbit/s, you'd be transferring about 3.355e+7 Mbit of data, which would take about 69895.83 s on USB 2.0.

69895.83 s / 60 / 60 = 19.41 hrs.

You're welcome.
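
The arithmetic above can be reproduced with a few lines of Python. This is only a back-of-envelope sketch: the function name `transfer_hours` is made up for illustration, 1 TB is taken as 1024 * 1024 MB as in the post, and the link is assumed to sustain its nominal rate with no protocol or filesystem overhead. Without the intermediate rounding the result comes out a hair above the 19.41 figure.

```python
# Back-of-envelope transfer time, using binary units as in the post
# (1 TB treated as 1024 * 1024 MB, 8 bits per byte).
MBIT_PER_TB = 1024 * 1024 * 8


def transfer_hours(size_tb: float, link_mbit_s: float) -> float:
    """Hours needed to move size_tb at a sustained link_mbit_s.

    Assumes the link runs at its nominal rate the whole time, which
    real USB 2.0 hardware does not do.
    """
    seconds = size_tb * MBIT_PER_TB / link_mbit_s
    return seconds / 3600


if __name__ == "__main__":
    print(f"4 TB over USB 2.0 (480 Mbit/s): {transfer_hours(4, 480):.2f} h")
```

Swapping in a different rate (for example 1000 for gigabit Ethernet) gives the equally optimistic best case for other links.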
disi
Veteran
Joined: 28 Nov 2003
Posts: 1354
Location: Out There ...

Posted: Thu Nov 29, 2012 7:27 pm

this is really crap then... :?

//edit: not that they all have porn in Spain :P
notageek
Tux's lil' helper
Joined: 05 Jun 2008
Posts: 120
Location: Bangalore, India

Posted: Thu Nov 29, 2012 7:28 pm

Use Gig Ethernet cards instead. (And make a GigE network.)
energyman76b
Advocate
Joined: 26 Mar 2003
Posts: 2025
Location: Germany

Posted: Thu Nov 29, 2012 8:09 pm

notageek wrote:
Use Gig Ethernet cards instead. (And make a GigE network.)


not that 'standard' gige is any faster than usb....

If your data is already compressed (video, MP3, JPEG, etc.), you won't gain anything from compression; you will lose. Files on average become slightly bigger. So if you pipe everything through gzip and half of the stuff is compressible and the rest is not... well, you come out with maybe a tiny bit of gain.
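
A quick way to see this effect is to gzip two synthetic buffers and compare sizes. The sketch below is only an illustration: pseudo-random bytes stand in for already-compressed media and a repeated string stands in for compressible data, and the helper names are invented for the example. Real files will land somewhere between the two extremes.

```python
import gzip
import random


def gzip_ratio(data: bytes) -> float:
    """Compressed size divided by original size (above 1.0 means it grew)."""
    return len(gzip.compress(data)) / len(data)


def sample_incompressible(n: int, seed: int = 0) -> bytes:
    """n pseudo-random bytes, a stand-in for already-compressed media."""
    return random.Random(seed).getrandbits(8 * n).to_bytes(n, "little")


def sample_compressible(n: int) -> bytes:
    """n highly repetitive bytes, a stand-in for sparse or text-like data."""
    return (b"gentoo " * (n // 7 + 1))[:n]


if __name__ == "__main__":
    n = 1 << 20  # 1 MiB per sample
    print(f"random-looking data: ratio {gzip_ratio(sample_incompressible(n)):.4f}")
    print(f"repetitive data:     ratio {gzip_ratio(sample_compressible(n)):.4f}")
```

Running the same `gzip_ratio` over a sample of the actual backup set is probably the most honest way to decide whether the compression step is worth the CPU time.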

If it needs to be done in less than the 20h you already got as an answer, go with SCSI/SAS/eSATA or USB 3, whichever is cheaper for you. With U320 and ~250 MB/s (muhahaha) you will still need 4.5h...

Why the laughing? Because seeking will destroy even that number... (there is a reason why modern desktop HDDs have a problem feeding old 10 MB/s DLT drives: they are quick until they have to seek, and then everything hits rock bottom).
_________________
AidanJT wrote:

Libertardian denial of reality is wholly unimpressive and unconvincing, and simply serves to demonstrate what a bunch of delusional fools they all are.

Satan's got perfectly toned abs and rocks a c-cup.
Akkara
Administrator
Joined: 28 Mar 2006
Posts: 5126
Location: &akkara

Posted: Fri Nov 30, 2012 6:46 am

Before you start your backup, do a recursive checksum of everything (md5sum or whatever else you prefer). Put this into a file which you also back up.

After your backup is done, checksum the backup and diff the two checksum files. (You may have to sort them - the traversal order often ends up different).

Find any differences? No? Consider yourself lucky. Yes? Examine *carefully* before doing anything rash: the error might have occurred during your 1st checksumming and the file is in fact OK. In that case, update both checksum files.

Yes, this will nearly triple the time required to do the backup and check. No, I don't know of a shortcut (short of using zfs), at least not if you value your data. (If you don't care as much, ignore this advice.)

On consumer-grade equipment, expect to see a random error every 10TB or so.
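
The checksum-and-diff procedure above could be sketched in Python roughly as follows. The function names are hypothetical, SHA-256 is used as the default in place of md5sum (the algorithm is a parameter), and keying the manifest by relative path sidesteps the sorting issue because traversal order no longer matters. Treat it as a starting point under those assumptions, not a finished backup verifier.

```python
import hashlib
import os


def build_manifest(root: str, algo: str = "sha256") -> dict:
    """Map each file's path (relative to root) to its hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.new(algo)
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large VM images fit in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(path, root)] = digest.hexdigest()
    return manifest


def diff_manifests(source: dict, backup: dict) -> list:
    """Sorted relative paths that are missing on one side or differ."""
    paths = source.keys() | backup.keys()
    return sorted(p for p in paths if source.get(p) != backup.get(p))


if __name__ == "__main__":
    import sys

    src_root, dst_root = sys.argv[1], sys.argv[2]
    for path in diff_manifests(build_manifest(src_root), build_manifest(dst_root)):
        print("MISMATCH", path)
```

An empty result means both trees hashed identically. Any path that is reported deserves a second checksum of the source before concluding that the backup copy is the bad one.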
_________________
echo 'long long long x;' | gcc -x c -c -
disi
Veteran
Joined: 28 Nov 2003
Posts: 1354
Location: Out There ...

Posted: Fri Nov 30, 2012 8:19 am

notageek wrote:
Use Gig Ethernet cards instead. (And make a GigE network.)

This is probably the best solution anyway: some TeraStation or similar NAS with NFS support. I tested an 'off the shelf' external USB drive yesterday and ESXi wouldn't recognize it; it creates the device node and then destroys it straight away. Only a few USB drives are supported and known to work.

Akkara wrote:
Before you start your backup, do a recursive checksum of everything (md5sum or whatever else you prefer). Put this into a file which you also back up.

After your backup is done, checksum the backup and diff the two checksum files. (You may have to sort them - the traversal order often ends up different).

Find any differences? No? Consider yourself lucky. Yes? Examine *carefully* before doing anything rash: the error might have occurred during your 1st checksumming and the file is in fact OK. In that case, update both checksum files.

Yes, this will nearly triple the time required to do the backup and check. No, I don't know of a shortcut (short of using zfs), at least not if you value your data. (If you don't care as much, ignore this advice.)

On consumer-grade equipment, expect to see a random error every 10TB or so.


Thanks for the tip! I haven't gone into scripting yet, but this is something to consider, as long as it stays below 24h :) ESXi doesn't ship with e.g. rsync by default...
Boris27
Guru
Joined: 05 Nov 2003
Posts: 562
Location: Almelo, The Netherlands

Posted: Fri Nov 30, 2012 9:50 am

You are storing virtual machine images? Those compress decently. You really need to take a look at your dataset and determine whether compression is worth it because, as said before, already-compressed data usually increases in size a bit.
_________________
we are microsoft, lower your firewalls and surrender your pc's. we will add your biological and technological distinctiveness to our own. your culture will adapt and service us. resistance is futile.
disi
Veteran
Joined: 28 Nov 2003
Posts: 1354
Location: Out There ...

Posted: Fri Nov 30, 2012 1:16 pm

Just ran a test and compressed two images in parallel:
40GB Windows 2008 R2 OS installed, compressed size 3GB with gzip
real 35m 52.15s
user 2m 55.89s
sys 0m 0.00s

40GB Windows 2008 R2 OS installed, compressed size 3.8GB with lzop
real 7m 30.38s
user 2m 1.49s
sys 0m 0.00s
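
For what it's worth, the two timings above can be turned into throughput figures and a rough projection for the full ~4TB. The sketch below makes several assumptions that are not in the post: GB is read as 1024 MB, the rate is assumed to scale linearly to the whole data set, and the helper names are invented. Because both runs shared the machine, and a freshly installed image is mostly empty space, the real numbers could easily differ in either direction.

```python
def seconds(minutes: float, secs: float) -> float:
    """Convert a 'real XmY.YYs' timing into seconds."""
    return minutes * 60 + secs


def throughput_mb_s(size_gb: float, elapsed_s: float) -> float:
    """Input throughput in MB/s, with 1 GB taken as 1024 MB."""
    return size_gb * 1024 / elapsed_s


def projected_hours(total_tb: float, rate_mb_s: float) -> float:
    """Hours to push total_tb through a compressor running at rate_mb_s."""
    return total_tb * 1024 * 1024 / rate_mb_s / 3600


# (input size in GB, compressed size in GB, wall-clock seconds) from the test
RUNS = {
    "gzip": (40, 3.0, seconds(35, 52.15)),
    "lzop": (40, 3.8, seconds(7, 30.38)),
}

if __name__ == "__main__":
    for name, (size_gb, out_gb, elapsed) in RUNS.items():
        rate = throughput_mb_s(size_gb, elapsed)
        print(
            f"{name}: {rate:.1f} MB/s, "
            f"output {out_gb / size_gb:.1%} of input, "
            f"~{projected_hours(4, rate):.1f} h for 4 TB"
        )
```

On these figures the compressor, not the USB 2.0 link, looks like the bottleneck for gzip, which is one argument for the faster but slightly less dense lzop.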