Gentoo Forums
ext4 - what the f@*k is going on ...
Atom2
Apprentice


Joined: 01 Aug 2011
Posts: 185

Posted: Tue Apr 01, 2014 12:38 pm    Post subject: ext4 - what the f@*k is going on ...

Hello Forum,
I am experiencing a very strange situation with ext4: I have two LVM logical volumes of equal size, both formatted with an ext4 filesystem with identical parameters. The first is used as the root filesystem for a XEN guest and the other is used as a backup device (i.e. I sync the former to the latter from time to time). Backups are created using rsync with the following options (/mnt/target and /mnt/source are the mount points for the two filesystems):
Code:
rsync -qaAxXH --delete /mnt/source /mnt/target
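
As an aside, a checksum-based dry run of rsync (just a sketch, using the same mount points as above) should confirm whether the file contents really end up identical after the backup:
Code:
# -n = dry run (nothing is changed), -c = compare full checksums, -i = itemise any remaining differences
# if the two trees already match, the output should be (almost) empty
rsync -naci --delete /mnt/source /mnt/target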

Now I would expect that both filesystems are identical after the rsync command; a quick check for me was to look at the output of df, and to my surprise, there were differences between the two filesystems (I have left out other mounted filesystems in the output below):
Code:
 # df
Filesystem                     1K-blocks     Used Available Use% Mounted on
/dev/mapper/VGpool-master.ROOT   8125880  2657944   5032124  35% /mnt/source
/dev/mapper/VGpool-backup.ROOT   8125880  2680540   5009528  35% /mnt/target
So the total number of 1K-blocks is identical (that's expected), but the used size differs. Now clearly, some of that could be attributed to directory "files" that were holding entries that have since been deleted; in that case the directory size does not shrink again after the files are deleted, and copying such a directory over to the backup creates a directory that is only large enough to hold the current files rather than one of equal size. But the difference was somewhat larger than I expected, and it also went the other way round from what I would have expected (in my view the directories on the backup disk should usually not be the larger ones). So I started to investigate, and sure enough, there were (expected and explainable) size mismatches for certain directories.
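
Just to illustrate that directory-size effect (a throw-away sketch in /tmp, nothing to do with my actual volumes):
Code:
# a directory grows to hold many entries ...
mkdir /tmp/dirtest
touch /tmp/dirtest/file{1..5000}
ls -ld /tmp/dirtest          # now well beyond a single 4K block
# ... and on ext4 it keeps that size after the entries are removed again
rm /tmp/dirtest/file*
ls -ld /tmp/dirtest          # still the grown size
# a fresh copy only needs room for the current (now empty) contents
cp -a /tmp/dirtest /tmp/dirtest.copy
ls -ld /tmp/dirtest.copy     # back to a single 4K block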

But I soon encountered a difference I absolutely can't make any sense of; in fact I am rather puzzled: The first listing below is for the backup drive (mounted on /mnt/target), the second listing is for the XEN guest root filesystem (mounted on /mnt/source):
Code:
vm-host locale # pwd
/mnt/target/usr/lib64/locale
vm-host locale # ls -l
total 1572
-rw-r--r-- 1 root root 1607632 Feb  2 02:27 locale-archive
vm-host locale # du -s .
1576    .
Code:
vm-host locale # pwd
/mnt/source/usr/lib64/locale
vm-host locale # ls -l
total 1500
-rw-r--r-- 1 root root 1607632 Feb  2 02:27 locale-archive
vm-host locale # du -s .
1504    .
While the size of both files is identical (and they also contain the same data, checked with md5sum), they obviously use a different number of 1K-blocks (the total line from ls -l). It also does not seem to be a bug in the "ls" command, as the "du" command comes up with matching values: its output for both directories is consistently 4K larger because "du ." also accounts for the directory itself which, on ext4 with a 4K blocksize, is a minimum of 4K in size. So that part is actually consistent.

If one does the math it gets even stranger, because the file in /mnt/source seems to store 1,071.75 bytes (i.e. 1,607,632 / 1,500) per 1K (1,024-byte) block, while the file in /mnt/target only stores 1,022.67 bytes (i.e. 1,607,632 / 1,572) per 1K block. As there is no compression on the filesystem, the first number in particular just seems to be impossible, while the second number, on the face of it, seems reasonable (given that there is probably some overhead somewhere and the size of the file is not a multiple of the block size).
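
For anyone who wants to check the raw numbers, stat reports the apparent size and the allocated blocks side by side (a minimal sketch against the file in question on both mount points):
Code:
# %s = apparent size in bytes, %b = allocated blocks, %B = size of each such block (usually 512 bytes)
stat -c '%n: %s bytes, %b blocks of %B bytes' /mnt/source/usr/lib64/locale/locale-archive
stat -c '%n: %s bytes, %b blocks of %B bytes' /mnt/target/usr/lib64/locale/locale-archive
# du can show both views as well: apparent size vs. space actually allocated
du -k --apparent-size /mnt/source/usr/lib64/locale/locale-archive
du -k /mnt/source/usr/lib64/locale/locale-archive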

So as a last resort I ran a forced fsck on both (unmounted) filesystems to rule out any logical errors - and that didn't reveal any problems, but it also confirmed the difference in used blocks (the first output is for the XEN guest root drive, the second output is for the backup drive):
Code:
vm-host / # fsck -f /dev/mapper/VGpool-master.ROOT
fsck from util-linux 2.22.2
e2fsck 1.42.7 (21-Jan-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
ROOT: 160981/524288 files (0.1% non-contiguous), 730168/2097152 blocks
Code:
vm-host / # fsck -f /dev/mapper/VGpool-backup.ROOT
fsck from util-linux 2.22.2
e2fsck 1.42.7 (21-Jan-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
ROOT: 160981/524288 files (0.1% non-contiguous), 735817/2097152 blocks
BTW fsck also shows that the number of files stored on both filesystems is identical (160,981), which seems to confirm that the rsync operation actually does work as expected.

I can't rule out that I have an error in my thought process and that there is one obvious logical explanation, but I currently have my doubts. Therefore I call on the combined intelligence of this forum to get to the bottom of this issue.

Many thanks in advance Atom2
Telemin
l33t


Joined: 25 Aug 2005
Posts: 753
Location: Glasgow, UK

Posted: Tue Apr 01, 2014 1:31 pm

Hi there,

What makes you expect that the filesystems should be identical? The method you are using makes no guarantee of this. Assuming your backups are correct and all files are checksum-identical, I would suggest it is simply a difference in how files are allocated on your two filesystems; remember that your backup drive does not see the same level of fragmentation as your in-use drive, as changes are made to it less frequently.

For example:
Code:

# dd if=/dev/zero of=testimg_a bs=1M count=1024
# dd if=/dev/zero of=testimg_b bs=1M count=1024

# mount testimg_a /mnt/a/
# mount testimg_b /mnt/b/

# bzcat /usr/share/doc/dhcpcd-6.3.2/README.bz2 >> /mnt/a/a
# bzcat /usr/share/doc/gccmakedep-1.0.2-r1/ChangeLog.bz2 >> /mnt/a/a

# bzcat /usr/share/doc/python-2.7.6/HISTORY.bz2 >> /mnt/a/b
# bzcat /usr/share/doc/python-3.2.5-r3/HISTORY.bz2 >> /mnt/a/b

# bzcat /usr/share/doc/glibc-2.18-r1/ChangeLog.bz2 >> /mnt/a/c
# bzcat /usr/share/doc/glibc-2.18-r1/ChangeLog.1.bz2 >> /mnt/a/c
# bzcat /usr/share/doc/glibc-2.18-r1/ChangeLog.2.bz2 >> /mnt/a/c

# rsync -av --delete /mnt/a /mnt/b

# df
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/loop0        999320      2712    927796   1% /mnt/a
/dev/loop1        999320      2720    927788   1% /mnt/b


-Telemin-
Atom2
Apprentice


Joined: 01 Aug 2011
Posts: 185

Posted: Tue Apr 01, 2014 2:36 pm

Hi Telemin,
thanks for jumping in.
Telemin wrote:
What makes you expect that the filesystems should be identical? The method you are using makes no guarantee of this.
I am not expecting them to be identical; clearly there will be differences, like for instance differently sized directories. But I would expect two identically sized files containing the same data to require an identical number of disk blocks to store that data.
Telemin wrote:
Assuming your backups are correct and all files are checksum-identical, I would suggest it is simply a difference in how files are allocated on your two filesystems; remember that your backup drive does not see the same level of fragmentation as your in-use drive, as changes are made to it less frequently.
But if fragmentation were to play a role, then I'd rather expect the more fragmented file/disk to require more space/blocks to store its data than the less fragmented one (due to some overhead resulting from the fragmentation). Apart from the fact that in your example fragmentation should not really play any part at all, as both filesystems seem to be freshly created and therefore unfragmented (although I couldn't spot an mkfs step in your listing), your backup created with rsync also takes more space than the original filesystem. That does indeed still sound odd to me ... or I am still missing an important piece of the jigsaw.
If I remember correctly, in the old days two files of identical size required an identical number of disk blocks, regardless of their fragmentation (fragmentation was only a matter of mapping contiguous logical file blocks to non-contiguous physical on-disk blocks; a block, however, was a block and stored a block's worth of data). Has this changed with more modern filesystems like ext4?
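
If it helps, the on-disk layout of the two copies could also be compared directly; a minimal sketch using filefrag from e2fsprogs (paths as in my first post):
Code:
# -v lists every extent with its logical/physical offsets and flags
filefrag -v /mnt/source/usr/lib64/locale/locale-archive
filefrag -v /mnt/target/usr/lib64/locale/locale-archive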

Telemin wrote:
For example:
Code:

# dd if=/dev/zero of=testimg_a bs=1M count=1024
# dd if=/dev/zero of=testimg_b bs=1M count=1024

# mount testimg_a /mnt/a/
# mount testimg_b /mnt/b/

# bzcat /usr/share/doc/dhcpcd-6.3.2/README.bz2 >> /mnt/a/a
# bzcat /usr/share/doc/gccmakedep-1.0.2-r1/ChangeLog.bz2 >> /mnt/a/a

# bzcat /usr/share/doc/python-2.7.6/HISTORY.bz2 >> /mnt/a/b
# bzcat /usr/share/doc/python-3.2.5-r3/HISTORY.bz2 >> /mnt/a/b

# bzcat /usr/share/doc/glibc-2.18-r1/ChangeLog.bz2 >> /mnt/a/c
# bzcat /usr/share/doc/glibc-2.18-r1/ChangeLog.1.bz2 >> /mnt/a/c
# bzcat /usr/share/doc/glibc-2.18-r1/ChangeLog.2.bz2 >> /mnt/a/c

# rsync -av --delete /mnt/a /mnt/b

# df
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/loop0        999320      2712    927796   1% /mnt/a
/dev/loop1        999320      2720    927788   1% /mnt/b
Would you mind sharing the ls -ls output for the two directories /mnt/a and /mnt/b? Is your difference also due to the files a, b and c using more blocks to store the same data in /mnt/b than the originals in /mnt/a?

Furthermore I still don't get how any filesystem is able to store more than 1,024 bytes in a single 1K block (provided there's no compression going on somewhere hidden in the back).

Thanks Atom2
broken_chaos
Guru


Joined: 18 Jan 2006
Posts: 370
Location: Ontario, Canada

Posted: Wed Apr 02, 2014 3:26 pm

Atom2 wrote:
Furthermore I still don't get how any filesystem is able to store more than 1,024 bytes in a single 1K block (provided there's no compression going on somewhere hidden in the back).

Unless you explicitly set the block size to 1K, ext4 blocks default to 4K on most archs. du using a '1K block size' really just means du is showing sizes in 1K units; it has nothing to do with the actual on-disk block size.

You may wish to check the output of tune2fs -l for both filesystems. Also using ls -la rather than ls -l would give more information, including hidden files and information on the directories (.) themselves.
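
Something along these lines should show the block size that is actually in use (a rough sketch; the device names are taken from your first post):
Code:
# block size as recorded in the ext4 superblock
tune2fs -l /dev/mapper/VGpool-master.ROOT | grep 'Block size'
tune2fs -l /dev/mapper/VGpool-backup.ROOT | grep 'Block size'
# or, for a mounted filesystem, via statfs
stat -f -c 'block size: %S' /mnt/source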
Atom2
Apprentice


Joined: 01 Aug 2011
Posts: 185

Posted: Thu Apr 03, 2014 5:57 pm

Hi broken_chaos, thanks for replying.
broken_chaos wrote:
Unless you explicitly set the block size to 1K, ext4 blocks default to 4K on most archs. du using a '1K block size' really just means du is showing sizes in 1K units; it has nothing to do with the actual on-disk block size.
You are right that the blocksize defaults to 4K and is also actually 4K in my case. But that's something I have already said in my initial post:
Atom2 wrote:
While the size of both files is identical (and they also contain the same data, checked with md5sum), they obviously use a different number of 1K-blocks (the total line from ls -l). It also does not seem to be a bug in the "ls" command, as the "du" command comes up with matching values: its output for both directories is consistently 4K larger because "du ." also accounts for the directory itself which, on ext4 with a 4K blocksize, is a minimum of 4K in size. So that part is actually consistent.

broken_chaos wrote:
You may wish to check the output of tune2fs -l for both filesystems.
Even if the on-disk blocksize is 4K (and tune2fs has confirmed that), this does not explain the difference in sizes displayed in 1K units (let's not call those blocks then) by the "ls" command, nor does it explain how one could store more than 1,024 bytes in a single 1K unit. Let me do the math again:
Code:
FOR THE XEN GUEST FILESYSTEM:
size-of-file:        1607632 bytes    (as shown by ls and also confirmed by wc -c)
number-of-1K-units:     1500 1K-units (as shown by ls' total output [first line]; confirmed by ls -las for locale-archive)
size-of-file / number-of-1K-units = bytes-per-1K-unit: 1607632 / 1500 = 1071.75 => that does not work out

FOR THE BACKUP FILESYSTEM:
size-of-file:        1607632 bytes    (as shown by ls and also confirmed by wc -c)
number-of-1K-units:     1572 1K-units (as shown by ls' total output [first line]; confirmed by ls -las for locale-archive)
size-of-file / number-of-1K-units = bytes-per-1K-unit: 1607632 / 1572 = 1022.67 => that looks reasonable

broken_chaos wrote:
Also using ls -la rather than ls -l would give more information, including hidden files and information on the directories (.) themselves.
I had checked that before, but for the sake of completeness, here's the output of ls -la; I have even added the -s option to show the 1K-unit usage next to every entry (knowing that the on-disk block size is actually 4K), first for the XEN guest and then for the backup drive:
Code:
vm-host locale # ls -las
total 1528
   4 drwxr-xr-x  2 root root    4096 Feb  2 02:27 .
  24 drwxr-xr-x 42 root root   24576 Apr  1 18:02 ..
   0 -rw-r--r--  1 root root       0 Feb  2 02:27 .keep_sys-libs_glibc-2.2
1500 -rw-r--r--  1 root root 1607632 Feb  2 02:27 locale-archive
Code:
vm-host locale # ls -las
total 1600
   4 drwxr-xr-x  2 root root    4096 Feb  2 02:27 .
  24 drwxr-xr-x 42 root root   24576 Mar 31 01:01 ..
   0 -rw-r--r--  1 root root       0 Feb  2 02:27 .keep_sys-libs_glibc-2.2
1572 -rw-r--r--  1 root root 1607632 Feb  2 02:27 locale-archive

Either I am still missing something very fundamental or there's a bug lurking somewhere ...

Atom2