Author |
Message |
disi Veteran


Joined: 28 Nov 2003 Posts: 1354 Location: Out There ...
|
Posted: Thu Nov 29, 2012 6:31 pm Post subject: Question about backup+gzip |
|
|
I wonder if anyone has experience with that. Sorry if the question sounds stupid, I haven't made an attempt yet.
This is probably the most knowledgeable forum out there
The idea is to set up 'cheap' storage of virtual machines on ESXi for offsite backup.
How much time and how much load does it take to pipe ~4TB through gzip and write it via USB 2.0 to external storage once a week?
//edit: I am not worried about the snapshot backups; my question is whether the full backup is reasonable... _________________ Gentoo on Uptime Project - Larry is a cow |
|
|
 |
notageek Tux's lil' helper


Joined: 05 Jun 2008 Posts: 131 Location: MA, USA
|
Posted: Thu Nov 29, 2012 6:51 pm Post subject: |
|
|
19.41 hours. _________________ "Defeat is a state of mind. No one is ever defeated, until defeat has been accepted as a reality." -- Bruce Lee |
|
|
 |
disi Veteran


Joined: 28 Nov 2003 Posts: 1354 Location: Out There ...
|
Posted: Thu Nov 29, 2012 6:59 pm Post subject: |
|
|
notageek wrote: | 19.41 hours. |
No! seriously? It is like the 72 virgins, the number is so weird, it must be true
//edit: that is an HP ProLiant DL360 _________________ Gentoo on Uptime Project - Larry is a cow |
|
|
 |
notageek Tux's lil' helper


Joined: 05 Jun 2008 Posts: 131 Location: MA, USA
|
Posted: Thu Nov 29, 2012 7:11 pm Post subject: |
|
|
I'm glad you asked. Since you didn't tell us what kind of data you have, I assumed you mostly have jpeg, mp3 or avi. In short, porn. Since these are already compressed formats, the compression you're likely to see even from gzip is minimal. So, I assumed it to be 4TB.
Now USB 2.0 can transfer at 480Mbits/s max. Ignoring the effects of the filesystem and other hardware, and taking a leap of faith in assuming you'll actually get 480Mbits/s, you'd be transferring 3.355e+7Mbits of data, which would take 69895.8333333s on USB 2.0.
69895.8333333s/60/60 = 19.41 hrs.
You're welcome. _________________ "Defeat is a state of mind. No one is ever defeated, until defeat has been accepted as a reality." -- Bruce Lee |
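The arithmetic in the post above can be reproduced with a small sketch. The function name `transfer_hours` and the 35 MB/s "realistic" figure are assumptions added for illustration (not from the thread); the 480 Mbit/s value is the USB 2.0 signalling rate quoted in the post, which real drives do not reach.

```python
def transfer_hours(size_bytes: float, rate_bits_per_s: float) -> float:
    """Hours needed to move size_bytes at a sustained rate in bits per second."""
    return size_bytes * 8 / rate_bits_per_s / 3600


USB2_SIGNALLING = 480e6        # bits/s, theoretical maximum quoted in the post
USB2_ASSUMED_REAL = 35e6 * 8   # bits/s, ASSUMPTION: ~35 MB/s sustained in practice

TB = 10**12   # decimal terabyte
TIB = 2**40   # binary tebibyte

for label, size in (("4 TB ", 4 * TB), ("4 TiB", 4 * TIB)):
    best = transfer_hours(size, USB2_SIGNALLING)
    likely = transfer_hours(size, USB2_ASSUMED_REAL)
    print(f"{label}: {best:.2f} h at 480 Mbit/s, {likely:.2f} h at ~35 MB/s")
```

The 19.41 h in the post sits between the decimal and binary results because it mixes binary megabytes with a decimal line rate. Either way, it is a lower bound rather than an estimate.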
|
|
 |
disi Veteran


Joined: 28 Nov 2003 Posts: 1354 Location: Out There ...
|
|
|
 |
notageek Tux's lil' helper


Joined: 05 Jun 2008 Posts: 131 Location: MA, USA
|
Posted: Thu Nov 29, 2012 7:28 pm Post subject: |
|
|
Use Gig Ethernet cards instead. (And make a GigE network.) _________________ "Defeat is a state of mind. No one is ever defeated, until defeat has been accepted as a reality." -- Bruce Lee |
|
|
 |
energyman76b Advocate


Joined: 26 Mar 2003 Posts: 2045 Location: Germany
|
Posted: Thu Nov 29, 2012 8:09 pm Post subject: |
|
|
notageek wrote: | Use Gig Ethernet cards instead. (And make a GigE network.) |
not that 'standard' GigE is any faster than USB....
if your data is already compressed (video, mp3, jpeg etc), you won't gain anything from compression - you will lose. Already-compressed files on average become bigger. So if you pipe everything through gzip and half of the stuff is compressible and the rest is not... well, you come out with maybe a tiny bit of gain.
If it needs to be done in less than the 20h you already got as an answer, go with either SCSI/SAS/eSATA or USB3, whatever is cheaper for you. With U320 and ~250MB/sec (muhahaha) you will still need 4.5h...
Why am I laughing? Because seeking will destroy even that number... (there is a reason why modern desktop HDDs have a problem feeding old 10MB/sec DLT drives... they are quick until they have to seek... and everything hits rock bottom...). _________________ Study finds stunning lack of racial, gender, and economic diversity among middle-class white males
I identify as a dirty penismensch. |
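The point about already-compressed data growing under gzip can be checked with a quick sketch. Pseudo-random bytes stand in for compressed media here (an assumption: real jpeg/mp3/avi files will behave similarly but not identically), and a zero-filled buffer stands in for highly compressible data such as unused space in a disk image.

```python
import gzip
import random


def gzip_ratio(data: bytes) -> float:
    """Compressed size divided by original size (above 1.0 means the data grew)."""
    return len(gzip.compress(data)) / len(data)


size = 256 * 1024  # 256 KiB sample, small enough to run instantly

# Seeded pseudo-random bytes: statistically incompressible, like compressed media.
rng = random.Random(0)
random_like = rng.getrandbits(8 * size).to_bytes(size, "little")

# All zeros: the best case for any compressor.
zero_filled = bytes(size)

print(f"random-like data: ratio {gzip_ratio(random_like):.4f}")
print(f"zero-filled data: ratio {gzip_ratio(zero_filled):.4f}")
```

The growth on incompressible input is tiny (header plus per-block overhead), so the real cost of gzipping such data is the CPU time, not the extra bytes.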
|
|
 |
Akkara Administrator


Joined: 28 Mar 2006 Posts: 6396 Location: &akkara
|
Posted: Fri Nov 30, 2012 6:46 am Post subject: |
|
|
Before you start your backup, do a recursive checksum of everything (md5sum or whatever else you prefer). Put this into a file which you also back up.
After your backup is done, checksum the backup and diff the two checksum files. (You may have to sort them - the traversal order often ends up different).
Find any differences? No? Consider yourself lucky. Yes? Examine *carefully* before doing anything rash: the error might have occurred during your 1st checksumming and the file is in fact OK. In that case, update both checksum files.
Yes, this will nearly triple the time required to do the backup and check. No, I don't know of a shortcut (short of using zfs), at least not if you value your data. (If you don't care as much, ignore this advice.)
On consumer-grade equipment, expect to see a random error every 10TB or so. _________________ The reason there appears to be no god in the world, is because he's overwhelmed constantly pulling all those miracles that are needed to keep all the software we're using, mostly working. |
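The checksum-before, checksum-after procedure described above could be sketched roughly as follows. The function names are hypothetical, and SHA-256 is used in place of md5sum only because the post says any checksum will do. Keying the manifest by relative path sidesteps the traversal-order problem, so no sorting is needed before comparing.

```python
import hashlib
import os


def file_digest(path, algo="sha256", chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large VM images never sit fully in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root, algo="sha256"):
    """Map each file's path (relative to root) to its checksum."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full, algo)
    return manifest


def diff_manifests(source, backup):
    """Return (missing_in_backup, extra_in_backup, changed) as sorted path lists."""
    missing = sorted(set(source) - set(backup))
    extra = sorted(set(backup) - set(source))
    changed = sorted(p for p in set(source) & set(backup) if source[p] != backup[p])
    return missing, extra, changed
```

A possible use would be `diff_manifests(build_manifest(src_dir), build_manifest(backup_dir))`, where both directory names are placeholders. Any path reported as changed deserves the careful second look the post recommends, since the mismatch may come from the first checksum pass rather than from the copy.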
|
|
 |
disi Veteran


Joined: 28 Nov 2003 Posts: 1354 Location: Out There ...
|
Posted: Fri Nov 30, 2012 8:19 am Post subject: |
|
|
notageek wrote: | Use Gig Ethernet cards instead. (And make a GigE network.) |
This is probably the best solution anyway: some TeraStation or similar with NFS support. I tested an 'off the shelf' external USB drive yesterday and ESXi wouldn't recognize it; it creates the node and then destroys it straight away. Only a few USB drives are supported and known to work.
Akkara wrote: | Before you start your backup, do a recursive checksum of everything (md5sum or whatever else you prefer). Put this into a file which you also back up.
After your backup is done, checksum the backup and diff the two checksum files. (You may have to sort them - the traversal order often ends up different).
Find any differences? No? Consider yourself lucky. Yes? Examine *carefully* before doing anything rash: the error might have occurred during your 1st checksumming and the file is in fact OK. In that case, update both checksum files.
Yes, this will nearly triple the time required to do the backup and check. No, I don't know of a shortcut (short of using zfs), at least not if you value your data. (If you don't care as much, ignore this advice.)
On consumer-grade equipment, expect to see a random error every 10TB or so. |
Thanks for the tip! I haven't gone into scripting yet, but this is something to consider, as long as it stays below 24h. ESXi doesn't have e.g. rsync by default... _________________ Gentoo on Uptime Project - Larry is a cow |
|
|
 |
Boris27 Guru


Joined: 05 Nov 2003 Posts: 562 Location: Almelo, The Netherlands
|
Posted: Fri Nov 30, 2012 9:50 am Post subject: |
|
|
You are storing Virtual Machine images? Those compress decently. You really need to take a look at your dataset and determine if compression is worth it, because as said before, already compressed data usually increases in size a bit. _________________ we are microsoft, lower your firewalls and surrender your pc's. we will add your biological and technological distinctiveness to our own. your culture will adapt and service us. resistance is futile. |
|
|
 |
disi Veteran


Joined: 28 Nov 2003 Posts: 1354 Location: Out There ...
|
Posted: Fri Nov 30, 2012 1:16 pm Post subject: |
|
|
Just ran a test and compressed two images in parallel:
40GB Windows 2008 R2 OS installed, compressed size 3GB with gzip
real 35m 52.15s
user 2m 55.89s
sys 0m 0.00s
40GB Windows 2008 R2 OS installed, compressed size 3.8GB with lzop
real 7m 30.38s
user 2m 1.49s
sys 0m 0.00s _________________ Gentoo on Uptime Project - Larry is a cow |
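As a rough sketch, these two measurements can be scaled up to the ~4TB from the original question. Everything beyond the measured numbers is an assumption: linear scaling, 4TB treated as 100 images of 40GB, every image compressing like this fresh Windows install, and the 480 Mbit/s USB 2.0 ceiling from earlier in the thread. The function name `extrapolate` is made up for illustration.

```python
def extrapolate(sample_gb, sample_out_gb, sample_seconds, total_gb):
    """Linearly scale one measurement to total_gb; returns (output GB, hours)."""
    scale = total_gb / sample_gb
    return sample_out_gb * scale, sample_seconds * scale / 3600


TOTAL_GB = 4000     # ASSUMPTION: ~4TB taken as 100 x 40GB images
USB2_BITS = 480e6   # bits/s, theoretical USB 2.0 maximum

# (output size in GB, wall-clock seconds) taken from the test above
runs = {
    "gzip": (3.0, 35 * 60 + 52.15),
    "lzop": (3.8, 7 * 60 + 30.38),
}

for tool, (out_gb, seconds) in runs.items():
    total_out_gb, compress_h = extrapolate(40, out_gb, seconds, TOTAL_GB)
    usb_h = total_out_gb * 1e9 * 8 / USB2_BITS / 3600
    print(f"{tool}: ~{total_out_gb:.0f} GB out, ~{compress_h:.1f} h compressing, "
          f"~{usb_h:.1f} h on USB 2.0")
```

Since both runs shared the machine and the wall-clock time is far larger than the user time, the numbers are probably dominated by disk I/O rather than CPU, so a run of one tool at a time could look quite different.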
|
|
 |
|