My current backup layout and folder structure is basically giving me devastating eye soreness: I have to cross-reference the date and the backup routine just to decipher what the heck each snapshot is (alpha.0, alpha.1, beta.0, beta.1, beta.2, etc.).
So that is the problem I am running into. My own attempt at a solution is going to be to safely remove all the "duplicate entries" of hard links within the directory structure, and then get a much clearer view of the incremental saving that was done when backing up the system.
My goal is ultimately to safely remove this directory structure folder by folder, all the way back to the "beginning of time" for each folder, keeping intact the necessary files as they change, and even some of the very old ones -- but not necessarily every file.

The result should be twofold. First, I will clear the clutter of far too many hard links in the folder structure (I'm not sure whether there is another way to get that overview, such as querying it from within the shell, or from some file manager, etc.). Second, by reducing the actual files (original backups), I hope to shrink the overall backup to about 1 GB; currently the footprint or space utilization from rsnapshot appears to be about 10-15 GB, which is way too much for me!!

I am certain there are entire folders that I will remove. In some rare cases files or folders may need to be left alone, but probably for no reason other than being the location of the newest version of a file, which I could then copy or save elsewhere, and then perhaps choose whether recovering an older version of the file is an option I'd like to go with... my gut says no to that, and I feel I should save that effort for another run-in with some similar disaster.

To keep this more manageable, I believe my questions are the following -- and this will help greatly, so thanks in advance:
If I remove any group of old files, what is the best all-around approach, so that I can be sure to eliminate both the hard-linked copies and the file that originated them? And how might I do this safely, so as not to risk losing the actual file -- removing only the links first, leaving the original as the last one standing? That would be the holy grail if I can get there. I don't think it is too difficult, but I am still putting it to the forum to check whether my logic is somehow not working well at this time!
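One reassuring property worth demonstrating before touching the real backups: removing one hard link only decrements the inode's link count, and the data itself is freed only when the last link is gone. A throwaway sandbox (no real backup paths involved, GNU `stat` assumed) makes this visible:

```shell
# Sandbox demo -- touches nothing but a temp directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/daily.0" "$tmp/daily.1"
echo "important data" > "$tmp/daily.0/file"
ln "$tmp/daily.0/file" "$tmp/daily.1/file"   # rsnapshot-style hard link

stat -c '%h' "$tmp/daily.0/file"   # prints 2: two names, one inode

rm -rf "$tmp/daily.1"              # "prune" one snapshot directory
stat -c '%h' "$tmp/daily.0/file"   # prints 1: data intact via the other name
cat "$tmp/daily.0/file"            # still prints: important data

rm -rf "$tmp"                      # clean up the sandbox
```

So deleting an entire old snapshot directory is safe for any file that is still linked from at least one snapshot you keep; only files whose last link lived in the deleted directory actually disappear.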
On top of that, the documentation for rsnapshot has got me hella confused. Does that make sense? For example, if I have a folder in the backup whose source directory was /usr/sbin, I would like to be able to remove all the links to its files but not the originating backup copy located under the backup directory specified in rsnapshot.conf. I will probably no longer keep incremental backups for anything other than folders like /etc and /home, but for now I have gone a bit further (and fallen short at the same time, forgetting to back up the /home directory, so I basically only have the backups for /etc and /var that I want). Given the situation, I would like to create or save a copy somewhere, but then immediately free up the majority of the space associated with keeping every past version of every document. Right? Isn't that what that means?
At this point I am open to learning about the backups and salvaging what I can from the previous files. As for missing my chance with the /home folder, I run a monthly backup that is done bit for bit (a bare-metal-recovery type of tool), so I can always go in there later, back to a long time ago, and fetch those files if needed. Woop woop.
That is the way I am sort of leaning for the majority of backups, since at this stage it seems more reasonable to recover wholesale (a la Windows system recovery) than to perform all the extensive work involved in digging the actual file location out of the specific rsnapshot run that holds the file. In terms of the shell, I wonder if there is some simple utility I am missing (almost "ls"-like) that could show me which files are hard links and which are the original backups? I assume there is. Any help would be appreciated. Then the thing I really want help with is how, after finding the original files/folders, to identify all the links to each one, and whether those links can be removed without removing a similarly named file that is not a link but another version that should not be removed.
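For the "ls"-like part: there is no flag that labels a file as the "original", but `ls -li` prints each file's inode number and link count, and `find` can filter on both. A sketch, assuming it is run from the snapshot root (the `weekly.0/...` path is just an example):

```shell
# Inode number (1st column) and link count (3rd column): names that
# show the same inode are hard links to the same data.
ls -li weekly.0/localhost/bin/chgrp

# Every regular file in the tree that has more than one link:
find . -type f -links +1

# Every path that shares this particular file's inode:
find . -samefile weekly.0/localhost/bin/chgrp
```

A similarly named file that is a different version will show a different inode number, which answers the "can I remove the links without hitting another version?" worry: compare inodes, not names.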
Any suggestions? Thank you.
Thanks, and sorry if none of this makes perfect sense. It's late, and I'm looking for a way to get started on this later, once I have a better idea of the preferred method for reviewing the files. The backups were done using rsnapshot, with cron jobs driving a sequence of routine backups, as I mentioned already (I think).
Put another way... I need help with removing hard links from backups (I want originals only).
Update:
After running a simple command to list the hard links to a file, I am now left with the basic point of confusion: which of these files is the actual, oldest file? There must be some way of telling that file apart (other than the time), I would imagine... unless there isn't really!
Code:
playby backups # find . -samefile weekly.0/localhost/bin/chgrp
./daily.3/localhost/bin/chgrp
./daily.6/localhost/bin/chgrp
./weekly.0/localhost/bin/chgrp
./monthly.0/localhost/bin/chgrp
./daily.4/localhost/bin/chgrp
./weekly.1/localhost/bin/chgrp
./daily.5/localhost/bin/chgrp
./weekly.3/localhost/bin/chgrp
./daily.2/localhost/bin/chgrp
./weekly.2/localhost/bin/chgrp
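For what it's worth, that confusion has a clean answer: hard links are symmetric. Every name in a listing like the one above is an equal reference to the same inode, sharing one set of timestamps and one copy of the data, so there is no distinguishable "original" -- rsnapshot wrote the file once and each later snapshot simply linked it. GNU `stat` makes the symmetry visible (paths taken from the listing above):

```shell
# Same inode, same link count, same mtime for every name -- none of
# them is any more "original" than the others.
stat -c 'inode %i  links %h  mtime %y  %n' \
    ./weekly.0/localhost/bin/chgrp \
    ./monthly.0/localhost/bin/chgrp
```

So separating "the original" from "the links" isn't possible, and fortunately it isn't necessary: any one surviving name keeps the data alive.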
