Gentoo Forums
3 folders, same files in it, same fs, but different size - why
SarahS93
l33t


Joined: 21 Nov 2013
Posts: 693

Posted: Wed Dec 13, 2017 8:53 am    Post subject: 3 folders, same files in it, same fs, but different size - why

I have 3 HDDs. On each HDD I have a folder with the same files in it, but the folders have different sizes - why?
How could that be?

63060432 /mnt/1/one
63058768 /mnt/2/two
63058764 /mnt/3/three

All 3 drives have an ext4 fs.
/mnt/1/one was my "original"; I copied the files (for backup) to two and three.

To show which files are different, I created md5sum files and compared them with comm.
The md5sums are all the same.
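
For readers who want to repeat the md5sum-and-comm comparison in one step, here is a minimal Python sketch. It is an illustration only, not the commands used in this post: the helper names `hash_tree` and `diff_trees` are made up for the example, and the `algorithm` parameter is an added convenience.

```python
import hashlib
import os


def hash_tree(root, algorithm="md5"):
    """Map each file's path (relative to root) to its hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            h = hashlib.new(algorithm)
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large rar files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests


def diff_trees(root_a, root_b, algorithm="md5"):
    """Return sorted relative paths that are missing on one side or differ."""
    a = hash_tree(root_a, algorithm)
    b = hash_tree(root_b, algorithm)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Calling `diff_trees("/mnt/1/one", "/mnt/2/two")` would return an empty list if both folders hold the same files with the same content, which matches what the md5sum comparison showed here.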

The files are rar files - all rar files are OK, I can extract them.

What's going wrong here?


Last edited by SarahS93 on Wed Dec 13, 2017 9:22 am; edited 1 time in total
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9677
Location: almost Mile High in the USA

Posted: Wed Dec 13, 2017 9:22 am

Weird.

How are you getting these sizes, from du?

If it's from du, then possibly the sparse files were handled differently on each copy: some sparse blocks got expanded and others did not?
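
To check the sparse-file idea, the apparent size (what ls -l shows) can be compared with the allocated size (what du counts). The following is a rough sketch with made-up helper names, assuming Linux, where `st_blocks` is counted in 512-byte units:

```python
import os


def apparent_and_allocated(path):
    """Return (apparent bytes as in ls -l, allocated bytes as du counts them)."""
    st = os.lstat(path)
    # Assumption: Linux semantics, st_blocks is in 512-byte units
    # regardless of the filesystem's own block size.
    return st.st_size, st.st_blocks * 512


def tree_usage(root):
    """Sum both numbers over every non-directory entry below root."""
    apparent = allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            size, used = apparent_and_allocated(os.path.join(dirpath, name))
            apparent += size
            allocated += used
    return apparent, allocated
```

If the apparent totals of the three folders agree while the allocated totals differ, the files themselves are the same and only their on-disk layout is not.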

If these numbers are from ls -l, then it's even weirder, and perhaps the only reason the md5sums match is that the files are cached?

Two files with differing "ls -l" sizes should generate different md5sums, barring an (unlikely) hash collision. You could try some other hash routine like sha512sum.

If you can unmount and remount the disks or flush caches, and then regenerate md5sum and/or diff, it could expose something.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
Fitzcarraldo
Advocate


Joined: 30 Aug 2008
Posts: 2034
Location: United Kingdom

Posted: Wed Dec 13, 2017 9:43 am

Would the following Super User thread hold the answer?:

Why are the sizes of 2 directories different if the data within the directories is identical? They are identical ext4 partitions and disks
_________________
Clevo W230SS: amd64, VIDEO_CARDS="intel modesetting nvidia".
Compal NBLB2: ~amd64, xf86-video-ati. Dual boot Win 7 Pro 64-bit.
OpenRC udev elogind & KDE on both.

Fitzcarraldo's blog
Ant P.
Watchman


Joined: 18 Apr 2009
Posts: 6920

Posted: Wed Dec 13, 2017 8:54 pm

When you ls -l a directory, the size is the directory's own metadata size, not the files within.
A fragmented file will take more space in the directory than a non-fragmented one, because the pointers to file extents have to be stored somewhere. Directories also build up gaps in the index over time. This isn't usually a problem, but it's something fsck can clean up if it grows too big.
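
The difference between a directory's own size and the size of the files in it can be seen with a few lines of Python. This is just a sketch, and the function names are invented for the example:

```python
import os


def directory_own_size(path):
    """Size of the directory's own entry table, as 'ls -ld' would report it."""
    return os.lstat(path).st_size


def directory_content_size(path):
    """Sum of the apparent sizes of the regular files directly inside path."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total
```

The two numbers are unrelated: the first depends on how many names the directory has had to store, the second on how much data the files hold.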
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9677
Location: almost Mile High in the USA

Posted: Thu Dec 14, 2017 1:20 am

Oh... I did not realize that 1/one, 2/two and 3/three were possibly directories and not actual files, and that the numbers were the number of bytes the directory entries used (I was thinking they were files freshly uncompressed from rar).

If you "ls -ld" a directory (file type d), it shows how much disk space that directory structure is using, ignoring the contents of the files it contains. The problem is that directories don't automatically shrink when files get deleted within them, so if you fill a directory with lots of files and then delete them, the directory will still hold all the blocks needed to store its original contents until the whole directory is rewritten. Running e2fsck -D on the filesystem should optimize directories (this requires an unmount), or you can manually rewrite the directory by creating a new one and moving everything into it (this requires killing all processes with handles to the directory, else you risk corruption).
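
The manual rewrite (new directory, move everything over) could look roughly like this in Python. Treat it as a sketch under assumptions: `rebuild_directory` is a hypothetical helper, nothing else may be using the directory while it runs, and ownership and timestamps of the directory itself are not carried over (only the permission bits are).

```python
import os
import stat
import tempfile


def rebuild_directory(path):
    """Move everything from path into a fresh directory, then swap it in.

    Returns (size_before, size_after) of the directory's own entry.
    """
    path = os.path.abspath(path)
    info = os.lstat(path)
    parent = os.path.dirname(path)
    # Create the replacement next to the original so that every rename
    # stays on the same filesystem.
    fresh = tempfile.mkdtemp(prefix=".rebuild-", dir=parent)
    os.chmod(fresh, stat.S_IMODE(info.st_mode))
    for name in os.listdir(path):
        os.rename(os.path.join(path, name), os.path.join(fresh, name))
    os.rmdir(path)          # the original is empty now
    os.rename(fresh, path)  # the fresh directory takes over the old name
    return info.st_size, os.lstat(path).st_size
```

Whether the second number comes out smaller depends on the filesystem and on how much the directory had grown before. On a freshly copied folder there is usually nothing to reclaim.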

However upon initial creation of all the files in the directory, the directory's entry should be the same size; but as you work on one and rsync to the others, the directory entries' apparent size will deviate from each other.

You shouldn't need to worry about the directory entry's size unless you're so strapped for disk space and need to reclaim that... which isn't really that much relative to the number of files you have in that directory.

However one thing that sort of bugs me is that I thought a directory should use multiples of your block size on ext4... 63058768 is divisible by 16 but not a multiple of 512 or higher (4096 being a typical ext4fs block size). If you're using 16K blocks, it may make sense but having a directory entry use 63GB (or even 63MB) is monstrous... For reiserfs with tail packing, then perhaps it may be OK...
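
The divisibility check is quick to redo. The sketch below also looks at the gaps between the three figures under one labeled assumption that the thread never confirms: that the numbers are du output in its default 1 KiB units rather than byte counts.

```python
# Figures exactly as posted at the top of the thread.
SIZES = {
    "/mnt/1/one": 63060432,
    "/mnt/2/two": 63058768,
    "/mnt/3/three": 63058764,
}


def remainders(n, units=(16, 512, 4096)):
    """Remainder of n for each candidate block size."""
    return {unit: n % unit for unit in units}


def gaps(sizes):
    """Difference between each figure and the next one, in posting order."""
    values = list(sizes.values())
    return [a - b for a, b in zip(values, values[1:])]


def kib_to_gib(n):
    """Convert a count of 1 KiB units to GiB (assumption: n is in KiB)."""
    return n / (1024 * 1024)


if __name__ == "__main__":
    for path, n in SIZES.items():
        print(path, remainders(n), round(kib_to_gib(n), 2))
    print(gaps(SIZES))
```

For 63058768 the remainder is zero for 16 and non-zero for 512 and 4096, as noted above. If the unit really is KiB, the folders are about 60 GiB each, and two and three differ by 4 KiB, which is a single block at the typical ext4 block size.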
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
pjp
Administrator


Joined: 16 Apr 2002
Posts: 20067

Posted: Wed Feb 07, 2018 1:47 am

Moved from Multimedia to Other Things Gentoo. Not related to Multimedia.
_________________
Quis separabit? Quo animo?
All times are GMT