SarahS93 l33t
Joined: 21 Nov 2013 Posts: 693
|
Posted: Wed Dec 13, 2017 8:53 am Post subject: 3 folders, same files in it, same fs, but different size - why |
|
|
I have 3 HDDs. On each HDD I have a folder with the same files in it, but the folders have different sizes. Why?
How could that be?
63060432 /mnt/1/one
63058768 /mnt/2/two
63058764 /mnt/3/three
All 3 drives have an ext4 filesystem.
/mnt/1/one was my "original"; I copied the files (for backup) to two and three.
To find out which files differ, I created md5sum files and compared them with comm.
The md5sums are all the same.
The files are rar files. All of them are OK; I can extract them.
What's going wrong here?
Last edited by SarahS93 on Wed Dec 13, 2017 9:22 am; edited 1 time in total |
|
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9679 Location: almost Mile High in the USA
|
Posted: Wed Dec 13, 2017 9:22 am Post subject: |
|
|
Weird.
How are you getting these sizes, from du?
If it's from du, then possibly sparse files were handled differently on each copy: some sparse blocks got expanded and others did not?
If these numbers are from ls -l, then it's even weirder, and perhaps the only reason the md5sums match is that the files are cached?
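To make the du versus ls -l distinction concrete, here is a minimal Python sketch. It assumes a Unix-like system where os.stat reports st_blocks in 512-byte units; the helper names are made up for illustration. The apparent size is what ls -l prints, while the allocated size is what du adds up, and the two can differ a lot for a sparse file.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent_bytes, allocated_bytes) for one file."""
    st = os.stat(path)
    # st_size is the length shown by `ls -l`; st_blocks counts the
    # 512-byte units actually allocated, which is what `du` sums.
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, size):
    """Create a file of `size` bytes by seeking past a hole and writing one byte."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sparse = os.path.join(tmp, "sparse.bin")
        make_sparse_file(sparse, 10 * 1024 * 1024)
        apparent, allocated = apparent_and_allocated(sparse)
        # On filesystems that support holes, allocated is usually far
        # smaller than apparent; on others the two may be about equal.
        print("apparent:", apparent, "allocated:", allocated)
```

If a copy tool fills in the holes on one disk but not on another, the contents (and so the checksums) stay identical while the allocated size changes.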
Two files with differing "ls -l" sizes will generate different md5sums, though hash collisions could happen (unlikely). You could try some other hash routine like sha512sum or something.
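As a rough sketch of that second check, the following Python hashes every file under two directory trees with SHA-512 and lists the relative paths whose digests differ or that exist on only one side. The function names are hypothetical; the paths in the demo are the ones from the first post.

```python
import hashlib
import os


def hash_tree(root, algorithm="sha512"):
    """Map each file's path (relative to root) to its hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            h = hashlib.new(algorithm)
            with open(full, "rb") as f:
                # Read in 1 MiB chunks so large rar files don't need to fit in RAM.
                for chunk in iter(lambda: f.read(1024 * 1024), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests


def differing_files(root_a, root_b):
    """Return sorted relative paths that are missing on one side or hash differently."""
    a, b = hash_tree(root_a), hash_tree(root_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


if __name__ == "__main__":
    print(differing_files("/mnt/1/one", "/mnt/2/two"))
```

An empty list means both trees hold the same file names with the same contents, whatever du says about their sizes.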
If you can unmount and remount the disks or flush caches, and then regenerate md5sum and/or diff, it could expose something. _________________ Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching? |
|
|
|
Fitzcarraldo Advocate
Joined: 30 Aug 2008 Posts: 2034 Location: United Kingdom
|
|
|
|
Ant P. Watchman
Joined: 18 Apr 2009 Posts: 6920
|
Posted: Wed Dec 13, 2017 8:54 pm Post subject: |
|
|
When you ls -l a directory, the size is the directory's own metadata size, not the files within.
A fragmented file will also take more space on disk than a non-fragmented one, since the pointers to its extents have to be stored somewhere. Directories also build up gaps in their index over time; this isn't usually a problem, but it's something fsck can clean up if the directory grows too big. |
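A small Python sketch of the difference between a directory's own size and the size of what it contains, assuming a Unix-like system (st_blocks in 512-byte units). The function name and the dictionary keys are invented for illustration, and unlike du this simple walk counts hard-linked files once per link.

```python
import os


def directory_report(path):
    """Compare a directory's own size with the sizes of the files below it."""
    own = os.lstat(path)
    file_bytes = 0                      # sum of file lengths (apparent size)
    allocated = own.st_blocks * 512     # blocks in use, starting with the directory itself
    for dirpath, dirnames, filenames in os.walk(path):
        for name in dirnames:
            # Subdirectories occupy blocks of their own as well.
            allocated += os.lstat(os.path.join(dirpath, name)).st_blocks * 512
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            file_bytes += st.st_size
            allocated += st.st_blocks * 512
    return {
        "directory_own_size": own.st_size,  # the size `ls -ld` prints for the directory
        "file_bytes": file_bytes,
        "allocated_bytes": allocated,
    }


if __name__ == "__main__":
    print(directory_report("."))
```

Two copies with identical file contents will agree on file_bytes, but directory_own_size and allocated_bytes can still differ between disks.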
|
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9679 Location: almost Mile High in the USA
|
Posted: Thu Dec 14, 2017 1:20 am Post subject: |
|
|
Oh... I did not realize that 1/one, 2/two, and 3/three were possibly directories rather than actual files, and that the numbers were the number of bytes the directory entries used (I was thinking they were files freshly extracted from rar).
If you "ls -ld" a directory (file type d), it shows how much disk space the directory structure itself is using, ignoring the contents of the files it contains. The problem is that directories don't automatically shrink when files in them are deleted, so if you fill a directory with lots of files and then delete them, the directory will still hold all the blocks needed to store its original contents until the whole directory is rewritten. Running "e2fsck -D" on the filesystem should optimize directories (this requires unmounting it), or you can rewrite a directory manually by creating a new directory and moving everything into it (this requires killing all processes with handles to the old directory, else you risk corruption).
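One way to observe this is to record a directory's own size while it is empty, after filling it, and after deleting everything again. This is only a sketch: the function name, the file names, and the count of 2000 are arbitrary choices for illustration, and the outcome depends on the filesystem. On ext4 the "emptied" size would be expected to stay at the "full" size, while other filesystems such as tmpfs may report something quite different.

```python
import os
import tempfile


def directory_size_history(parent, count=2000):
    """Record a directory's own st_size when empty, when full, and after emptying it."""
    d = os.path.join(parent, "demo")
    os.mkdir(d)
    sizes = {"empty": os.lstat(d).st_size}
    names = [os.path.join(d, f"file_{i:06d}_with_a_fairly_long_name") for i in range(count)]
    for n in names:
        open(n, "w").close()    # create an empty file; only the directory entry matters here
    sizes["full"] = os.lstat(d).st_size
    for n in names:
        os.remove(n)
    sizes["emptied"] = os.lstat(d).st_size
    os.rmdir(d)
    return sizes


if __name__ == "__main__":
    # Run this with the temporary directory on the filesystem you care about.
    with tempfile.TemporaryDirectory() as tmp:
        print(directory_size_history(tmp))
```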
However, when all the files in a directory are first created, the directories should be the same size; as you work on one and rsync to the others, their apparent sizes will drift apart.
You shouldn't need to worry about a directory's own size unless you're so strapped for disk space that you need to reclaim it... and it isn't really that much relative to the number of files you have in that directory.
However, one thing that sort of bugs me is that I thought a directory should use a multiple of the block size on ext4... 63058768 is divisible by 16 but is not a multiple of 512 or higher (4096 being a typical ext4 block size). If you're using 16K blocks it may make sense, but having a directory entry use 63GB (or even 63MB) is monstrous... For reiserfs with tail packing, perhaps it would be OK... |
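The divisibility argument can be checked with a few lines of Python, using the three numbers from the first post. How to read them is an assumption: the thread never says which tool or unit produced them. If they are du output in 1 KiB units, a multiple of 4 corresponds to whole 4 KiB blocks.

```python
# The three sizes reported in the first post.
SIZES = {
    "/mnt/1/one": 63060432,
    "/mnt/2/two": 63058768,
    "/mnt/3/three": 63058764,
}


def largest_power_of_two_divisor(n):
    """Return the largest power of two that divides n (n must be positive)."""
    return n & -n


for path, size in SIZES.items():
    print(path, size, "largest power-of-two divisor:", largest_power_of_two_divisor(size))

# Pairwise differences between the copies, in the same (unknown) unit.
values = list(SIZES.values())
print("one - two:", values[0] - values[1])
print("two - three:", values[1] - values[2])
```

None of the three numbers is a multiple of 512, and the third is only a multiple of 4, which fits counts of 1 KiB units on a filesystem with 4 KiB blocks better than it fits byte counts.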
|
|
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20067
|
Posted: Wed Feb 07, 2018 1:47 am Post subject: |
|
|
Moved from Multimedia to Other Things Gentoo. Not related to Multimedia. _________________ Quis separabit? Quo animo? |
|
|
|
|
|
|
|