Shining Arcanine Veteran
Joined: 24 Sep 2009 Posts: 1110
|
Posted: Thu May 13, 2010 7:53 pm Post subject: |
|
|
devsk wrote: | My first BTRFS corruption:
Code: | 22:33:13 devsk@localhost /var/db/pkg
$ l
total 0
drwxr-xr-x 1 root root 1678 2010-05-08 11:28 ./
drwxr-xr-x 1 root root 6 2010-05-11 19:04 ../
-rw-rw-rw- 1 root root 0 2010-03-30 19:19 ½ð_
-rw-rw-rw- 1 root root 0 2010-03-30 19:19 <80>^O¹×Ñ^L÷ÇKbYö^Cª
drwxr-xr-x 1 root root 824 2010-04-01 12:33 app-admin/
drwxr-xr-x 1 root root 494 2010-04-23 22:27 app-arch/ | How the heck did this happen? Can I do online check of the filesystem while mounted? |
What kernel are you using? |
|
devsk Advocate
Joined: 24 Oct 2003 Posts: 2995 Location: Bay Area, CA
|
Posted: Thu May 13, 2010 8:14 pm Post subject: |
|
|
Shining Arcanine wrote: | devsk wrote: | My first BTRFS corruption:
Code: | 22:33:13 devsk@localhost /var/db/pkg
$ l
total 0
drwxr-xr-x 1 root root 1678 2010-05-08 11:28 ./
drwxr-xr-x 1 root root 6 2010-05-11 19:04 ../
-rw-rw-rw- 1 root root 0 2010-03-30 19:19 ½ð_
-rw-rw-rw- 1 root root 0 2010-03-30 19:19 <80>^O¹×Ñ^L÷ÇKbYö^Cª
drwxr-xr-x 1 root root 824 2010-04-01 12:33 app-admin/
drwxr-xr-x 1 root root 494 2010-04-23 22:27 app-arch/ | How the heck did this happen? Can I do online check of the filesystem while mounted? |
What kernel are you using? | 2.6.33.2 |
|
Shining Arcanine Veteran
Joined: 24 Sep 2009 Posts: 1110
|
Posted: Fri May 14, 2010 2:09 am Post subject: |
|
|
Linux Kernel 2.6.33.4 is out. Although there were no btrfs patches in it, I believe that Linux Kernel 2.6.33.3 did have a btrfs patch. The bug that was fixed seems like a significant issue:
Quote: | commit 92ee813c7f2241000f9d35e71b01273cd871482b
Author: Wu Fengguang <fengguang.wu@intel.com>
Date: Tue Apr 6 14:34:53 2010 -0700
readahead: fix NULL filp dereference
commit 70655c06bd3f25111312d63985888112aed15ac5 upstream.
btrfs relocate_file_extent_cluster() calls us with NULL filp:
[ 4005.426805] BUG: unable to handle kernel NULL pointer dereference at 00000021
[ 4005.426818] IP: [<c109a130>] page_cache_sync_readahead+0x18/0x3e
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Yan Zheng <yanzheng@21cn.com>
Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
Tested-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
You probably should upgrade. |
|
devsk Advocate
Joined: 24 Oct 2003 Posts: 2995 Location: Bay Area, CA
|
Posted: Fri May 14, 2010 2:20 am Post subject: |
|
|
Quote: | You probably should upgrade. | Yeah, I have been waiting a bit for next upgrade. I think 2.6.34 is about to be released. So, I might as well jump to that. |
|
Shining Arcanine Veteran
Joined: 24 Sep 2009 Posts: 1110
|
Posted: Fri May 14, 2010 3:35 am Post subject: |
|
|
devsk wrote: | Quote: | You probably should upgrade. | Yeah, I have been waiting a bit for next upgrade. I think 2.6.34 is about to be released. So, I might as well jump to that. |
It is a drop-in replacement for the old kernel. My usual ritual for upgrading involves emerging the latest sources and running a command along the lines of:
nano -w /boot/grub/grub.conf && eselect kernel set linux-2.6.33.4 && cd /usr/src/linux && cp ../linux-2.6.33.3/.config . && make -j3 oldconfig && make -j3 && make -j3 modules_install && make -j3 firmware_install && cp arch/x86/boot/bzImage /boot/kernel-2.6.33.4 && module-rebuild
I upgraded kernels on two systems today. make oldconfig is a huge time saver. Kernel upgrades only take about 10 minutes with very little work on my part because of it. |
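A one-liner like the one in this post can be hard to audit at a glance. Below is a minimal sketch of the same sequence as a script with a dry-run switch. It assumes the `eselect kernel set` form and the paths from the post; the `run` helper, the `DRY_RUN` variable and the `upgrade_kernel` function are made-up names for illustration, and the manual grub.conf edit is left out.

```shell
#!/bin/sh
# Sketch only: replays the upgrade steps from the post for a given old/new version.
# DRY_RUN, run() and upgrade_kernel() are hypothetical names, not part of any tool.

run() {
    # Print the command; execute it only when DRY_RUN is set to something other than 1.
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

upgrade_kernel() {
    old=$1
    new=$2
    run eselect kernel set "linux-$new" &&
    run cd /usr/src/linux &&
    run cp "../linux-$old/.config" . &&
    run make oldconfig &&
    run make -j3 &&
    run make modules_install &&
    run make firmware_install &&
    run cp arch/x86/boot/bzImage "/boot/kernel-$new" &&
    run module-rebuild   # invoked bare, as in the post
}

# Dry run by default: this only prints what would be executed.
upgrade_kernel 2.6.33.3 2.6.33.4
```

Running it with `DRY_RUN=0` would execute the steps for real, stopping at the first failure just as the `&&` chain does.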
|
devsk Advocate
Joined: 24 Oct 2003 Posts: 2995 Location: Bay Area, CA
|
Posted: Fri May 14, 2010 4:15 am Post subject: |
|
|
Shining Arcanine wrote: | devsk wrote: | Quote: | You probably should upgrade. | Yeah, I have been waiting a bit for next upgrade. I think 2.6.34 is about to be released. So, I might as well jump to that. |
It is a drop-in replacement for the old kernel. My usual ritual for upgrading involves emerging the latest sources and running a command along the lines of:
nano -w /boot/grub/grub.conf && eselect kernel set linux-2.6.33.4 && cd /usr/src/linux && cp ../linux-2.6.33.3/.config . && make -j3 oldconfig && make -j3 && make -j3 modules_install && make -j3 firmware_install && cp arch/x86/boot/bzImage /boot/kernel-2.6.33.4 && module-rebuild
I upgraded kernels on two systems today. make oldconfig is a huge time saver. Kernel upgrades only take about 10 minutes with very little work on my part because of it. | Yeah, I have a similar build script (only it doesn't take fixed versions but finds them using uname, emerge, etc.), but a new kernel means a reboot, which this machine can't afford. It needs to be over the weekend. |
|
echoblack n00b
Joined: 15 May 2010 Posts: 1
|
Posted: Sat May 15, 2010 8:51 am Post subject: |
|
|
Aloha guys,
Nice thread. I'm not a Gentoo user but I have read all your install guides and love the documentation you guys have for "Hardened Gentoo" Super nice setup.... But I'm an Archlinux user because I like Archlinux.
---------------------
Anyway, I just got done reading this thread and I think I may know what caused your file-system corruption.
devsk wrote: | My first BTRFS corruption:
How the heck did this happen? Can I do online check of the filesystem while mounted? |
I read that Ubuntu Forums HowTo thread, and one of the Ubuntu admins said this:
jdong (Ubuntu Forums admin) wrote: | Note one very very important correction: btrfsck is NOT an online checker. IT IS OFFLINE ONLY, contrary to what the documentation said at one point. Running btrfsck online can lead to filesystem corruption! |
So, yeah, that sounds like what your problem was.
devsk asked, "Can I do online check of the filesystem while mounted?"
Nope, you cannot. At least not with btrfsck. However, jdong did go on to say: "Later git releases of btrfs-tools does include a mounted filesystem check, but in Karmic it does not". So maybe that feature is already in the version you are running. However, I guess it may still be broken.
Kind of ironic that checking for file-system corruption is what caused the corruption :p
Anyway, Btrfs looks stable enough for my single SSD system. I am switching as soon as 2.6.34 is out. If interested, you should check out this BTRFS mkinitcpio hook that estofme wrote.
http://bbs.archlinux.org/viewtopic.php?id=88195 |
|
regomodo Guru
Joined: 25 Mar 2008 Posts: 445
|
Posted: Fri May 21, 2010 8:18 pm Post subject: |
|
|
skellr wrote: | The only thing I can think of is that it's referring to "mount" being an unknown command. I don't see a mount/umount symlink to busybox, so try this out...
init: | #!/bin/busybox sh
# Mount the /proc and /sys filesystems.
busybox mount -t proc none /proc
busybox mount -t sysfs none /sys
# Btrfs stuff
echo "Trying to run btrfsctl"
/sbin/btrfsctl -a
# Mount the root filesystem.
busybox mount -o ro /dev/sde7 /mnt/root
# Clean up.
busybox umount /proc
busybox umount /sys
# Boot the real thing.
exec switch_root /mnt/root /sbin/init |
|
I was under the impression that busybox brought in the "mount" command along with it in its binary. |
|
skellr l33t
Joined: 18 Jun 2005 Posts: 975 Location: The Village, Portmeirion
|
Posted: Fri May 21, 2010 8:41 pm Post subject: |
|
|
regomodo wrote: | I was under the impression that busybox brought in the "mount" command along with it in its binary. |
It depends on the busybox configuration. |
|
BenderBendingRodriguez Tux's lil' helper
Joined: 19 Feb 2010 Posts: 101
|
Posted: Fri May 21, 2010 9:33 pm Post subject: As of now btrfs looks promising |
|
|
As of now btrfs is my main filesystem for everything (though, as always, I keep backups on an external hard disk).
My netbook Gentoo box sometimes freezes (though not anymore since the 2.6.34 gentoo-sources), and after a freeze I had to hard reset it. The system then wasn't working fine when booting (lots of boot failures), but I could still log in and reboot as root. (If I instead used the Magic SysRq combination, without logging in and rebooting, I had those failures on the next boot.) The boot after that was completely fine, with no lost files (that's copy-on-write magic, I suppose), and it booted just fine and fast (though not much faster than ext4, but to be sure I'd have to use bootchart).
BTW here's my fstab
cat /etc/fstab
# /etc/fstab: static file system information.
#
# noatime turns off atimes for increased performance (atimes normally aren't
# needed; notail increases performance of ReiserFS (at the expense of storage
# efficiency). It's safe to drop the noatime options if you want and to
# switch between notail / tail freely.
#
# The root filesystem should have a pass number of either 0 or 1.
# All other filesystems should have a pass number of 0 or greater than 1.
#
# See the manpage fstab(5) for more information.
#
# <fs> <mountpoint> <type> <opts> <dump/pass>
# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
/dev/sda1 /boot ext4 noauto,relatime 1 2
/dev/sda5 / btrfs defaults,relatime 0 0
/dev/sda6 /usr btrfs defaults,relatime 1 1
/dev/sda7 /var btrfs defaults,relatime 1 1
/dev/sda8 /tmp btrfs defaults,relatime 1 1
/dev/sda10 /home btrfs defaults,relatime 1 1
/dev/sda9 none swap sw 0 0
#/dev/cdrom /mnt/cdrom auto noauto,ro 0 0
#/dev/fd0 /mnt/floppy auto noauto 0 0
# glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for
# POSIX shared memory (shm_open, shm_unlink).
# (tmpfs is a dynamically expandable/shrinkable ramdisk, and will
# use almost no memory if not populated with files)
shm /dev/shm tmpfs nodev,nosuid 0 0 |
|
platojones Veteran
Joined: 23 Oct 2002 Posts: 1602 Location: Just over the horizon
|
Posted: Fri May 21, 2010 11:13 pm Post subject: Re: As of now btrfs looks promising |
|
|
BenderBendingRodriguez wrote: | As of now btrfs is my main filesystem for everything (though as always i keep backups on external hard disk )
|
Very nice. Good thing you have everything backed up too. Is it noticeably faster or about the same? |
|
regomodo Guru
Joined: 25 Mar 2008 Posts: 445
|
Posted: Sat May 22, 2010 9:40 am Post subject: |
|
|
skellr wrote: | regomodo wrote: | I was under the impression that busybox brought in the "mount" command along with it in its binary. |
It depends on the busybox configuration. |
Well, the wiki page doesn't go into much detail other than making sure it's static.
In any case, my initramfs only complains on the "btrfsctl" command, not "mount" |
|
skellr l33t
Joined: 18 Jun 2005 Posts: 975 Location: The Village, Portmeirion
|
Posted: Sat May 22, 2010 12:18 pm Post subject: |
|
|
The "unknown command" error? Heh, it doesn't seem to make much sense as you have the full path written in your init. What happens when you chroot into the initramfs directory and try to run it?
Code: | tegan ~ # env -i TERM=$TERM chroot initramfs2 /bin/busybox sh
/ # echo $PATH
/sbin:/usr/sbin:/bin:/usr/bin
/ # btrfsctl
no valid commands given
usage: btrfsctl [ -d file|dir] [ -s snap_name subvol|tree ]
[-r size] [-A device] [-a] [-c]
-d filename: defragments one file
-d directory: defragments the entire Btree
-s snap_name dir: creates a new snapshot of dir
-S subvol_name dir: creates a new subvolume
-r [+-]size[gkm]: resize the FS by size amount
-A device: scans the device file for a Btrfs filesystem
-a: scans all devices for Btrfs filesystems
-c: forces a single FS sync
Btrfs Btrfs v0.19
/ # btrfsctl -a
Scanning for Btrfs filesystems
/ # |
|
|
regomodo Guru
Joined: 25 Mar 2008 Posts: 445
|
Posted: Sat May 22, 2010 1:05 pm Post subject: |
|
|
skellr wrote: | The "unknown command" error? Heh, it doesn't seem to make much sense as you have the full path written in your init. What happens when you chroot into the initramfs directory and try to run it?
Code: | tegan ~ # env -i TERM=$TERM chroot initramfs2 /bin/busybox sh
/ # echo $PATH
/sbin:/usr/sbin:/bin:/usr/bin
/ # btrfsctl
no valid commands given
usage: btrfsctl [ -d file|dir] [ -s snap_name subvol|tree ]
[-r size] [-A device] [-a] [-c]
-d filename: defragments one file
-d directory: defragments the entire Btree
-s snap_name dir: creates a new snapshot of dir
-S subvol_name dir: creates a new subvolume
-r [+-]size[gkm]: resize the FS by size amount
-A device: scans the device file for a Btrfs filesystem
-a: scans all devices for Btrfs filesystems
-c: forces a single FS sync
Btrfs Btrfs v0.19
/ # btrfsctl -a
Scanning for Btrfs filesystems
/ # |
|
Forgot about doing that
Code: | funtoo-pc src # env -i TERM=$TERM chroot initramfs/ /bin/busybox sh
/ # echo $PATH
/sbin:/usr/sbin:/bin:/usr/bin
/ # btrfsctl
sh: btrfsctl: not found
/ # /sbin/btrfsctl
sh: /sbin/btrfsctl: not found
/ #
funtoo-pc src # ls initramfs/sbin/
btrfsctl
funtoo-pc src #
|
Hmm, weird. |
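A "not found" for a file that is plainly there can mean that something the binary needs at exec time is missing inside the chroot, typically the dynamic loader or a shared library of a dynamically linked binary. That is only a guess for this case. Below is a rough sketch for checking it from the host, assuming `ldd` is available there; `missing_libs` is a made-up helper name, and the paths in the example call are the ones from the post.

```shell
#!/bin/sh
# missing_libs ROOT BINARY (hypothetical helper):
# print every absolute library/loader path that ldd reports for BINARY
# but that does not exist under ROOT (e.g. the initramfs directory).
missing_libs() {
    root=${1%/}   # drop one trailing slash so "$root$lib" joins cleanly
    bin=$2
    ldd "$bin" 2>/dev/null | grep -o '/[^ ]*' | while read -r lib; do
        [ -e "$root$lib" ] || echo "$lib"
    done
}

# Example call with the paths from the post. It prints nothing if the binary
# cannot be inspected or if everything ldd lists is present under ROOT.
missing_libs initramfs initramfs/sbin/btrfsctl
```

If this lists anything, copying those files into the initramfs (or building the tool statically) would be the next thing to try. A statically linked binary makes ldd list no paths, so an empty result there points away from this explanation.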
|
skellr l33t
Joined: 18 Jun 2005 Posts: 975 Location: The Village, Portmeirion
|
Posted: Sat May 22, 2010 1:54 pm Post subject: |
|
|
Yeah it is. It seems I have the same issue when it's in the sbin directory. Try moving it into bin/.
sbin/ has the same perms as bin/
edit: uid=0 gid=0 should be good enough.
Code: | / # id
uid=0 gid=0 groups=0,1,2,3,4,6,10,11,26,27 |
|
|
BenderBendingRodriguez Tux's lil' helper
Joined: 19 Feb 2010 Posts: 101
|
Posted: Sat May 22, 2010 2:06 pm Post subject: Re: As of now btrfs looks promising |
|
|
platojones wrote: | BenderBendingRodriguez wrote: | As of now btrfs is my main filesystem for everything (though as always i keep backups on external hard disk )
|
Very nice. Good thing you have everything backed up too. Is it noticeably faster or about the same? |
I can't notice any big speedups, but keep in mind that it is a netbook disk which only spins at 5400 rpm. I definitely can't notice any slowdowns either, so it is as fast as ext4, but to prove that we would need some benchmarks. |
|
skellr l33t
Joined: 18 Jun 2005 Posts: 975 Location: The Village, Portmeirion
|
Posted: Sat May 22, 2010 3:28 pm Post subject: |
|
|
skellr wrote: | Yeah it is. It seems I have the same issue when it's in the sbin directory. Try moving it into bin/.
sbin/ has the same perms as bin/
edit: uid=0 gid=0 should be good enough.
Code: | / # id
uid=0 gid=0 groups=0,1,2,3,4,6,10,11,26,27 |
|
I tried to make a limited test case for this error but it's not happening. All I did the first time was move btrfsctl from bin to sbin, and I received the same error as you. Now it won't do it again; it works fine from sbin.
Start over and make a new initramfs from scratch? |
|
devsk Advocate
Joined: 24 Oct 2003 Posts: 2995 Location: Bay Area, CA
|
Posted: Sat May 22, 2010 7:17 pm Post subject: |
|
|
Out of the blue, a whole directory (my portage overlay) is missing in one of my folders. I just noticed it today when I fired 'esearch star' and it said "WARNING: One or more repositories have missing repo_name entries". And I had specifically created that file. I go into the overlay folder and it's empty....WOW! WTF! I look at my backup (boy, am I happy to keep them) and the directory structure is the same and the overlay folder is there.
This is the second time I am getting eaten by a silent failure for some unknown reason. And I don't know what happened. The scary part is that no warnings or errors were issued by the FS.
Two related events:
1. I had an unclean shutdown yesterday morning but I fscked the FS using livecd and it did not crib about anything.
2. I upgraded from 2.6.24_rc7 to 2.6.34 yesterday evening.
I don't know which one triggered it.
I moved my rootfs to BTRFS a long time ago (much earlier than most folks here) but I am not so sure now.
PS: Before someone starts saying that I deleted the folder by mistake, I log every command I type in a log file with no limits or rotations on it. |
|
platojones Veteran
Joined: 23 Oct 2002 Posts: 1602 Location: Just over the horizon
|
Posted: Sat May 22, 2010 7:25 pm Post subject: |
|
|
devsk wrote: | Out of the blue, a whole directory (my portage overlay) is missing in one of my folders. I just noticed it today when I fired 'esearch star' and it said "WARNING: One or more repositories have missing repo_name entries". And I had specifically created that file. I go into the overlay folder and it's empty....WOW! WTF! I look at my backup (boy, am I happy to keep them) and the directory structure is the same and the overlay folder is there.
This is the second time I am getting eaten by a silent failure for some unknown reason. And I don't know what happened. The scary part is that no warnings or errors were issued by the FS.
Two related events:
1. I had an unclean shutdown yesterday morning but I fscked the FS using livecd and it did not crib about anything.
2. I upgraded from 2.6.24_rc7 to 2.6.34 yesterday evening.
I don't know which one triggered it.
I moved my rootfs to BTRFS a long time ago (much earlier than most folks here) but I am not so sure now.
PS: Before someone starts saying that I deleted the folder by mistake, I log every command I type in a log file with no limits or rotations on it. |
Beta testing new filesystems is only for people with very good backup solutions (obviously backed up on to a stable FS) or a dedicated, throwaway test box. I'm pretty sure BTRFS hasn't had nearly enough testing. But, then again, if nobody tests it, it doesn't get stable. |
|
devsk Advocate
Joined: 24 Oct 2003 Posts: 2995 Location: Bay Area, CA
|
Posted: Sat May 22, 2010 8:02 pm Post subject: |
|
|
platojones wrote: | devsk wrote: | Out of the blue, a whole directory (my portage overlay) is missing in one of my folders. I just noticed it today when I fired 'esearch star' and it said "WARNING: One or more repositories have missing repo_name entries". And I had specifically created that file. I go into the overlay folder and it's empty....WOW! WTF! I look at my backup (boy, am I happy to keep them) and the directory structure is the same and the overlay folder is there.
This is the second time I am getting eaten by a silent failure for some unknown reason. And I don't know what happened. The scary part is that no warnings or errors were issued by the FS.
Two related events:
1. I had an unclean shutdown yesterday morning but I fscked the FS using livecd and it did not crib about anything.
2. I upgraded from 2.6.24_rc7 to 2.6.34 yesterday evening.
I don't know which one triggered it.
I moved my rootfs to BTRFS a long time ago (much earlier than most folks here) but I am not so sure now.
PS: Before someone starts saying that I deleted the folder by mistake, I log every command I type in a log file with no limits or rotations on it. |
Beta testing new filesystems is only for people with very good backup solutions (obviously backed up on to a stable FS) or a dedicated, throwaway test box. I'm pretty sure BTRFS hasn't had nearly enough testing. But, then again, if nobody tests it, it doesn't get stable. | Why do you think I have backups?..
It's just a reminder that we should not start getting a warm and fuzzy feeling about this new FS and forget that upstream marks it beta quality and experimental for a reason. I, for one, had started believing that it was ready to replace ext3/4. But apparently not just yet. |
|
platojones Veteran
Joined: 23 Oct 2002 Posts: 1602 Location: Just over the horizon
|
Posted: Sat May 22, 2010 8:07 pm Post subject: |
|
|
Quote: | Why do you think I have backups?.. |
That was my point...you do! Just thought I would reiterate that, just in case somebody who doesn't bother with backups decides that it's a good idea to become a new FS beta tester. |
|
devsk Advocate
Joined: 24 Oct 2003 Posts: 2995 Location: Bay Area, CA
|
Posted: Sun May 23, 2010 4:15 pm Post subject: |
|
|
This latest problem does raise a few questions:
1. Do you listen to your 20,000-song collection and watch your 100-movie collection every day? How about every week? How do you know all of them are still there? Not just the count, but the contents. This is not really a btrfs question. It applies to all filesystems.
2. How deep is btrfsck? Why did it not find that the directory checksum for the overlay folder is no longer the same as before, now that all its sub-folders have gone missing? This means that it overwrote the checksum on a newer access (when I went into the folder to find the horror) without checking if it was valid, hence making the new checksum valid. Maybe there is an implicit assumption that if the file system is mounted and online, things can't go wrong (which is bad!!) and the current checksum is the "right" checksum. How the heck, and when, are the problems supposed to be detected using checksums then?
I thought block-level checksums were the answer to these questions, but what good is a checksum if it's not used and verified against? |
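For question 1, one filesystem-independent approach is to keep a checksum manifest outside the tree and re-verify it periodically. Below is a minimal sketch, assuming GNU coreutils and findutils (`sha256sum`, `sort -z`, `xargs -0 -r`); the function names and the paths in the usage comment are placeholders, not anything btrfs provides.

```shell
#!/bin/sh
# make_manifest DIR MANIFEST (hypothetical helper):
# record a SHA-256 line for every regular file under DIR.
# Keep MANIFEST outside DIR so it does not end up hashing itself.
make_manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# verify_manifest DIR MANIFEST (hypothetical helper):
# re-hash the files listed in MANIFEST; the exit status is non-zero if any
# listed file is missing or its contents changed. --quiet hides the OK lines.
verify_manifest() {
    ( cd "$1" && sha256sum --quiet -c - ) < "$2"
}

# Usage (paths are examples only):
#   make_manifest /home/music /mnt/backup/music.sha256
#   verify_manifest /home/music /mnt/backup/music.sha256 || echo "files missing or changed"
```

This catches files that disappeared or changed since the manifest was written, which is the case described above. It does not notice newly added files, and it has to read every file, so on a large collection it is something to run occasionally rather than daily.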
|
Dont Panic Guru
Joined: 20 Jun 2007 Posts: 322 Location: SouthEast U.S.A.
|
Posted: Sun May 23, 2010 10:45 pm Post subject: |
|
|
The development of the btrfs file system utilities doesn't seem to be getting as much attention by the developers as the btrfs file system itself.
If I'm reading the Mailing List correctly, the btrfsck utility doesn't currently fix anything. It just reports errors.
As I understand it, the btrfsck utility cannot be run on a mounted file system, even mounted read only.
I see some talk of adding btrfs support to Grub-2. But how will you ever run a start-up fsck of your file system if you can't at least mount it read-only first? |
|
devsk Advocate
Joined: 24 Oct 2003 Posts: 2995 Location: Bay Area, CA
|
Posted: Sun May 23, 2010 10:54 pm Post subject: |
|
|
Dont Panic wrote: | The development of the btrfs file system utilities doesn't seem to be getting as much attention by the developers as the btrfs file system itself.
If I'm reading the Mailing List correctly, the btrfsck utility doesn't currently fix anything. It just reports errors.
As I understand it, the btrfsck utility cannot be run on a mounted file system, even mounted read only.
I see some talk of adding btrfs support to Grub-2. But how will you ever run a start-up fsck of your file system if you can't at least mount it read-only first? | I ran it on the device without mounting the FS. It did not find anything.
I sure hope that there is some checking done inline during the normal run of the FS as well. Otherwise, I don't see how checksums help at all. If I don't fsck for a month, my corruptions will go undetected for a month.
And if there is checking done inline, why did that not find that overlay folder was gone and the checksum doesn't match anymore? Having checksums is one thing. What to do with them and when, is the hard part and that's where the most smarts are. And btrfs failed me in that department twice. |
|
Dont Panic Guru
Joined: 20 Jun 2007 Posts: 322 Location: SouthEast U.S.A.
|
Posted: Sun May 23, 2010 11:27 pm Post subject: |
|
|
It would have been nice to have caught btrfs "in the act" of doing whatever it did.
It sounds like by the time you found out that directory had issues, the time had already passed for gaining insight as to why btrfs was behaving this way.
As to your general question about making sure your data is still there, I've thought about it before also, and don't have a good answer (especially when it comes to big, space-hungry stuff like movies and music). It's nice to have an older back-up around in addition to your fresher backups, because it's easy to make an un-noticed mistake in deleting important data and then propagate that mistake through your backups. |
|