Gentoo Forums
Gentoo on ZFS
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Fri Aug 29, 2014 6:41 pm    Post subject: Reply with quote

Have you tried deleting /home before mounting the dataset rather than just unmounting it?
I don't have anything else to suggest; I don't experience this on my setup. I have /home and /home/myusername as separate datasets. If you don't, then I guess that could be the problem, in that your /home is finding files within the subdirectory that it doesn't expect to be there, whereas with the user subdirectory as a separate dataset, it wouldn't care (...I think, I'm no expert).
Aonoa
Guru


Joined: 23 May 2002
Posts: 589

PostPosted: Fri Aug 29, 2014 11:56 pm    Post subject: Reply with quote

mrbassie wrote:
Have you tried deleting /home before mounting the dataset rather than just unmounting it?
I don't have anything else to suggest; I don't experience this on my setup. I have /home and /home/myusername as separate datasets. If you don't, then I guess that could be the problem, in that your /home is finding files within the subdirectory that it doesn't expect to be there, whereas with the user subdirectory as a separate dataset, it wouldn't care (...I think, I'm no expert).


I will look into deleting /home entirely prior to mounting, but it affects /root too (also its own dataset) if root is the active user when I reboot. I can add that I didn't always have this issue; I think it may have begun once I put PulseAudio on my system. Do you have PulseAudio installed and in use?

I'm also wondering whether there is any point in an L2ARC device if I have fast SSDs and a lot of RAM.
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Sat Aug 30, 2014 10:30 am    Post subject: Reply with quote

I don't use PulseAudio, no. I don't see why that would cause this, though. Wouldn't that mean that PulseAudio tries to write data to an unmounted filesystem during shutdown, and so the system creates a new directory for it to write to?

Is /home/username also a separate dataset from /home?

Please post exactly how you manually mount the dataset after boot. I'm guessing you umount /home and then zfs mount the dataset.

As for L2ARC... if it's a desktop or a laptop, I don't see the point; I don't think there would be enough data access to necessitate such a large, fast cache. Likewise for a home server.
AFAIK it's a feature oriented more at big production servers with terabytes of data constantly being hammered by tons of users, which requires very good IOPS.
Aonoa
Guru


Joined: 23 May 2002
Posts: 589

PostPosted: Sat Aug 30, 2014 7:33 pm    Post subject: Reply with quote

mrbassie wrote:
I don't use PulseAudio, no. I don't see why that would cause this, though. Wouldn't that mean that PulseAudio tries to write data to an unmounted filesystem during shutdown, and so the system creates a new directory for it to write to?

Yes, kind of. The datasets are probably unmounted while some process(es) aren't done shutting down yet, and / is still available.

mrbassie wrote:
Is /home/username also a separate dataset from /home?

Yes, it is. My desktop hasn't been operational (broken hardware) for a while, so the laptop I have been using in the meantime has thus far only had issues mounting /root.

mrbassie wrote:
Please post exactly how you manually mount the dataset after boot. I'm guessing you umount /home and then zfs mount the dataset.

I don't need to unmount anything. For /root I just do "rm -rf /root/.* ; zfs mount rpool/HOME/root"; seeing as it's just a few dotfiles/folders, there's no harm in removing them. As long as /root is empty, the mounting works.

mrbassie wrote:
As for L2ARC... if it's a desktop or a laptop, I don't see the point; I don't think there would be enough data access to necessitate such a large, fast cache. Likewise for a home server.
AFAIK it's a feature oriented more at big production servers with terabytes of data constantly being hammered by tons of users, which requires very good IOPS.

Indeed, I don't think I will bother using L2ARC at all.
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Sun Aug 31, 2014 11:20 am    Post subject: Reply with quote

You could try rebuilding your initramfs once all the datasets are mounted. I don't know if that will work or not. I'm just wondering if it's something to do with the zpool cache. That's just a wild guess.
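Something like this is what I have in mind (a rough sketch; I'm assuming your pool is called rpool, as in your mount command, and that you build the initramfs with genkernel):

Code:
# re-generate the pool cache file while everything is imported and mounted
zpool set cachefile=/etc/zfs/zpool.cache rpool

# rebuild the initramfs so the fresh zpool.cache can be picked up
genkernel --zfs initramfs
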
Other than that I don't know what to suggest.
Aonoa
Guru


Joined: 23 May 2002
Posts: 589

PostPosted: Wed Sep 10, 2014 7:33 pm    Post subject: Reply with quote

mrbassie wrote:
You could try rebuilding your initramfs once all the datasets are mounted. I don't know if that will work or not. I'm just wondering if it's something to do with the zpool cache. That's just a wild guess.
Other than that I don't know what to suggest.

I have been experimenting on my newly built system, and only /root currently has a problem mounting during startup. The directory "/root/.pulse" is created during boot, along with two files inside it, somehow prior to /root being mounted. The /root directory is its own ZFS dataset, and /home/user is similarly its own dataset, but it mounts properly. Funnily enough, there is a similar "/home/user/.pulse" folder with more files in it.

The culprit seems to be related to alsa-utils, because /root mounts properly if I unmerge alsa-utils. With alsa-utils gone, no "/root/.pulse" folder is created at all. By the way, I don't have any ALSA services in my runlevels. I guess I don't really need alsa-utils at the moment, but I would like to figure out exactly what is going on and why /home/user mounts but /root does not.
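To pin down exactly what touches /root before the dataset is mounted, I'm thinking of trying an audit watch, something like this (just a sketch; it assumes sys-process/audit is installed and auditd starts early enough in boot):

Code:
# watch for anything being created or written under /root
auditctl -w /root -p wa -k root-premount

# after the next reboot, see which process created /root/.pulse
ausearch -k root-premount | grep -i pulse
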

Rebuilding the initramfs file or /etc/zfs/zpool.cache has no effect at all.
kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

PostPosted: Thu Oct 02, 2014 1:24 am    Post subject: Reply with quote

Hi guys,

Is it normal that (small, correctable) errors occur from time to time?

Quote:
  pool: WD30EFRX
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: resilvered 320K in 0h0m with 0 errors on Mon Aug 25 17:50:39 2014
config:

        NAME              STATE     READ WRITE CKSUM
        WD30EFRX          ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            wd30efrx_002  ONLINE       0     0     0
            wd30efrx      ONLINE       0     0     2
        cache
          intelSSD180     ONLINE       0     0     0

errors: No known data errors




Quote:

zpool status -v
  pool: WD30EFRX
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 140K in 6h6m with 0 errors on Thu Oct 2 03:20:51 2014
config:

        NAME              STATE     READ WRITE CKSUM
        WD30EFRX          ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            wd30efrx_002  ONLINE       0     0     4
            wd30efrx      ONLINE       0     0     0
        cache
          intelSSD180     ONLINE       0     0     0

errors: No known data errors



This system is running on a Xeon CPU with ECC RAM.

Seems like I might have to give the RAM a check over the weekend ...
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
Aonoa
Guru


Joined: 23 May 2002
Posts: 589

PostPosted: Mon Oct 13, 2014 12:57 pm    Post subject: Reply with quote

kernelOfTruth wrote:
Hi guys,

Is it normal that (small, correctable) errors occur from time to time?


I do not get "normal" errors on my ZFS pools, at least not yet. I had errors occurring sometimes on an old ZFS pool, but the SSDs were dying and hardware problems were the cause.
kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

PostPosted: Mon Oct 13, 2014 2:57 pm    Post subject: Reply with quote

@Aonoa:


Oh - indeed 8O ! I've already had two hard drives die after showing these kinds of symptoms by the dozens - however, this pattern is surprising.

I haven't tested the RAM yet (low priority; as it's ECC, it surely would have either reported an error or hung the system).

In retrospect, it seems to occur either during periods of heavy stress (scrubbing, transferring all the data [close to 2 TB]) or after longer uptime.

Although the self-tests don't indicate any errors, this is kind of strange - the longer-uptime angle could be related to regressions or issues introduced by kernel patches.

I'll take a look.


thank you !
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Sun Nov 30, 2014 6:25 pm    Post subject: inquiry

hello,

ZFS is interesting, but I am still struggling to get it working smoothly. There are a few rough edges which I hope can be overcome.

My first question is about the I/O scheduler option in the kernel. The interwebs say that the best one to use is:

NO-OP

Is this true? It's even more difficult to pick the correct Linux option because multiple guides from over the years and from different OSes say conflicting things.

What about CFQ? What's the best scheduler to use with ZFS under a modern Linux kernel?

One last question: why isn't ZFS working on kernel 3.17? I want the latest kernel DRM, to be honest.

As I said, ZFS is a curious solution. Having been burned several times by Btrfs with disastrous corruption, it was nice to see ZFS behaving like LVM with volumes as well.

Another question: can partition 9 be deleted so ZFS can grow to use that last bit? I heard that's a Solaris remnant.

thank you.
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Mon Dec 01, 2014 1:46 pm    Post subject: Reply with quote

noop probably is indeed the best way to go. ZFS is very independent, and I believe it handles scheduling itself.
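If you want to check or force it by hand, something like this should do it (a quick sketch assuming the pool sits on whole-disk sda; recent ZoL is supposed to switch whole-disk vdevs to noop by itself):

Code:
# see which scheduler the disk is using (the active one is shown in brackets)
cat /sys/block/sda/queue/scheduler

# force noop if it isn't already selected
echo noop > /sys/block/sda/queue/scheduler
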

I have a little issue myself I'd like to put out there:

I've got two Gentoo boxes on ZFS; one is a replica of the other (stage4). They both have a zvol for swap. On one of them swapon fails during boot, but I can run it manually after logon.
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Mon Dec 01, 2014 4:34 pm    Post subject: Reply with quote

mrbassie wrote:
noop probably is indeed the best way to go. ZFS is very independent, and I believe it handles scheduling itself.

I have a little issue myself I'd like to put out there:

I've got two Gentoo boxes on ZFS; one is a replica of the other (stage4). They both have a zvol for swap. On one of them swapon fails during boot, but I can run it manually after logon.


Sounds like init order, or the zvol isn't included in the initramfs.

Did you add swap in /etc/fstab?

Maybe fstab is processed before ZFS activation or something. Check this:

rc-update show
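
If it is the ordering, an OpenRC dependency override might be enough; a sketch I haven't tested with swap on a zvol:

Code:
# /etc/conf.d/swap
rc_need="zfs"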
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Mon Dec 01, 2014 4:57 pm    Post subject: Reply with quote

identical on the two machines.

Code:
# <fs>              <mountpoint>         <type>           <opts>                                                                    <dump/pass>

# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
/dev/sda1               /boot             ext2          defaults                                                                        0 2
/dev/zvol/tank/swap     none              swap          sw                                                                              0 0
/dev/cdrom              /mnt/cdrom        auto          users,exec,rw                                                                   0 0
/dev/sdb1               /media/usb        auto          rw,users,noauto,nodev,nosuid                                                    1 2
tmpfs                   /tmp              tmpfs         rw,nodev,nosuid,size=128M


Code:
 bootmisc | boot                         
           consolekit |      default                 
                 dbus |      default                 
                devfs |                       sysinit
                dmesg |                       sysinit
             hostname | boot                         
              keymaps | boot                         
            killprocs |              shutdown       
             loopback | boot                         
        microcode_ctl | boot                         
              modules | boot                         
             mount-ro |              shutdown       
                 mtab | boot                         
               net.lo | boot                         
            net.wlan0 |      default                 
              numlock |      default                 
              preload | boot                         
               procfs | boot                         
                 root | boot                         
            savecache |              shutdown       
                 swap | boot                         
            swapfiles | boot                         
               sysctl | boot                         
                sysfs |                       sysinit
         termencoding | boot                         
         tmpfiles.dev |                       sysinit
       tmpfiles.setup | boot                         
                 udev |      default                 
           udev-mount |                       sysinit
       udev-postmount |      default                 
              urandom | boot                         
                  xdm |      default                 
                  zfs | boot                       
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Mon Dec 01, 2014 7:02 pm    Post subject: Reply with quote

I can think of two more things:

hostid
zpool.cache

Re-create them for the new box.
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Mon Dec 01, 2014 7:31 pm    Post subject: Reply with quote

WWWW wrote:
I can think of two more things:

hostid
zpool.cache

Re-create them for the new box.


I don't know how to do that. They're both single-disk workstations with 120 GB SSDs.
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Mon Dec 01, 2014 8:51 pm    Post subject: Reply with quote

zpool set cachefile=/etc/zfs/zpool.cache <pool>

Also check your hostid with

zpool status

force export and force import (with liveCD), then reboot.

Here's a good link with a checklist:

https://github.com/zfsonlinux/zfs/issues/599

Another suspect could be udev. Try regenerating your initramfs after re-creating the zpool.cache file.
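
Roughly, the whole dance would look something like this (a sketch from memory; I'm assuming the pool is called tank, like in your fstab, and that genkernel builds your initramfs):

Code:
# from the liveCD
zpool export -f tank
zpool import -f tank

# back in the installed system (or its chroot)
zpool set cachefile=/etc/zfs/zpool.cache tank
genkernel --zfs initramfs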

Gone are the days when you could simply dd a system from one HDD to another and Linux would boot without a hitch. Now, with the plethora of UUIDs down to your fingerprints, cloning ain't that straightforward.

To date I have not tried cloning a GPT-formatted drive. I wonder how that would work, since UEFI is the king of UUIDs.
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Tue Dec 02, 2014 9:42 am    Post subject: Reply with quote

I'll have a look at that when I get home from work. I'm not sure what you mean by recreating the hostid. Are you talking about the name of the pool?

I didn't actually clone one to the other. Originally my laptop was on JFS. I built a machine at work (which is now at home) and wanted to play with ZFS, so I built Gentoo from scratch on a zpool, copying my config files across from the laptop with a USB stick (everything in /etc/portage, kernel config, world, etc.). So when I say they're identical, I don't mean bit for bit; I mean they're configured identically (other than a couple of in-kernel drivers).
When I decided it was useful and stable enough, I made a stage4 of my home laptop installation, created a pool and datasets on the disk, mounted them all, untarred the stage4, built the kernel, and that was that.
Didn't dd anything.
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Tue Dec 02, 2014 10:30 am    Post subject: Reply with quote

New version out!!

zfs-0.6.3-1.1

Are gentoo maintainers sleeping or something?

There are many goodies and 3.17 support!!

please add an ebuild within the next 8 hours. Exactly in 8 hours I will sync portage to confirm.

thanks
peje
Tux's lil' helper


Joined: 11 Jan 2003
Posts: 100

PostPosted: Tue Dec 02, 2014 10:56 am    Post subject: Reply with quote

@WWWW please mind your words:
Quote:
New version out!!

zfs-0.6.3-1.1

Are gentoo maintainers sleeping or something?

There are many goodies and 3.17 support!!

please add an ebuild within the next 8 hours. Exactly in 8 hours I will sync portage to confirm.

thanks

That's not the way it works; no one earns anything by doing work for Gentoo.
cu Peje
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Fri Dec 05, 2014 12:31 pm    Post subject: Reply with quote

I wanted to know: does zfs-0.6.3-r1 fix the hard-coded 4 GB maximum for RAM?

I have the problem where any value over 512 MB eventually fills ALL RAM. This leads to numerous performance issues.

Since I have a nice amount of RAM, I wanted to leverage that to have ZFS perform optimally.

RAM fills up quickly when traversing half a million indexed rows with MySQL.

I am not using ARC cache.

When I set the limit to 4 GB, ZFS seems to understand 8 GB. I don't mind ZFS using all the RAM it wants, as long as it releases some.

QEMU is not able to start because it can't find a contiguous RAM allocation.

A curious fact: after emerging something, RAM suddenly empties out.

I don't know how to control this behavior.
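
For reference, this is how I have been trying to cap the ARC (assuming /etc/modprobe.d/zfs.conf is the right place; the value is 4 GiB in bytes):

Code:
# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=4294967296

# or at runtime, without rebooting
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max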

thanks.
Ant P.
Watchman


Joined: 18 Apr 2009
Posts: 6920

PostPosted: Fri Dec 05, 2014 3:54 pm    Post subject: Reply with quote

WWWW wrote:
New version out!!

zfs-0.6.3-1.1

Are gentoo maintainers sleeping or something?

There are many goodies and 3.17 support!!

please add an ebuild within the next 8 hours. Exactly in 8 hours I will sync portage to confirm.

thanks

Patches welcome.
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Fri Dec 05, 2014 9:30 pm    Post subject: Reply with quote

Ant P. wrote:

Patches welcome.


It's in Portage already, and it's 3.17-compatible.

What about my last question?
kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

PostPosted: Sat Dec 06, 2014 10:44 pm    Post subject: Reply with quote

@WWWW:

post your question again at:

https://groups.google.com/a/zfsonlinux.org/forum/#!forum/zfs-discuss

those folks might have more experience with QEMU & ZFSOnLinux
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
traq9
n00b


Joined: 16 Jul 2013
Posts: 2
Location: Mesa, AZ

PostPosted: Tue Dec 09, 2014 5:27 pm    Post subject: Kernel Oops using spl-0.6.3-r1/zfs-kmod-0.6.3-r1

Issues with ZFS spl-0.6.3-r1/zfs-kmod-0.6.3-r1.

Quote:
[285511.517924] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[285511.517955] IP: [] feature_do_action+0x23/0x2b0 [zfs]
[285511.517989] PGD 74ad3067 PUD 74ad6067 PMD 0
[285511.518005] Oops: 0000 [#1] SMP


See https://github.com/zfsonlinux/zfs/issues/2946 for details.

Step lightly, those of you who depend on your ZFS pools.
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Thu Dec 11, 2014 8:09 pm    Post subject: Reply with quote

Oh man!! I got one even scarier!!

Code:

INFO: rcu_sched self-detected stall on CPU
    2: (84003 ticks this GP) idle=cbf/140000000000001/0 softirq=2798109/2798109
     (t=84004 jiffies g=1276523 c=1276522 q=18648)
Task dump for CPU 2:
zvol/26         R  running task    12768  2646      2 0x00000008
 06edf9e7cf55bbf4 ffff88009c204200 ffffffff945fa800 ffff88023ed03db8
 ffffffff9407b671 0000000000000002 ffffffff945fa800 ffff88023ed03dd0
 ffffffff9407dd04 0000000000000003 ffff88023ed03e00 ffffffff94098cc0
Call Trace:
 <IRQ>  [<ffffffff9407b671>] sched_show_task+0xc1/0x130
 [<ffffffff9407dd04>] dump_cpu_task+0x34/0x40
 [<ffffffff94098cc0>] rcu_dump_cpu_stacks+0x90/0xd0
 [<ffffffff9409c13c>] rcu_check_callbacks+0x44c/0x6d0
 [<ffffffff9407eaea>] ? account_system_time+0x8a/0x160
 [<ffffffff9409e883>] update_process_times+0x43/0x70
 [<ffffffff940ad331>] tick_sched_handle.isra.18+0x41/0x50
 [<ffffffff940ad379>] tick_sched_timer+0x39/0x60
 [<ffffffff9409eea1>] __run_hrtimer.isra.34+0x41/0xf0
 [<ffffffff9409f715>] hrtimer_interrupt+0xe5/0x220
 [<ffffffff940227e2>] local_apic_timer_interrupt+0x32/0x60
 [<ffffffff94022d9f>] smp_apic_timer_interrupt+0x3f/0x60
 [<ffffffff94428c7b>] apic_timer_interrupt+0x6b/0x70
 <EOI>  [<ffffffff94427a0e>] ? _raw_spin_lock+0x1e/0x30
 [<ffffffff94425f47>] __mutex_unlock_slowpath+0x17/0x40
 [<ffffffff94425f8d>] mutex_unlock+0x1d/0x20
 [<ffffffffc0405cb9>] dbuf_clear+0xd9/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc0405ca2>] dbuf_clear+0xc2/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc04a8f70>] ? dsl_dataset_get_holds+0x17b0/0x2fe1e [zfs]
 [<ffffffffc0405bd1>] dmu_buf_rele+0x21/0x30 [zfs]
 [<ffffffffc0419f58>] dmu_tx_assign+0x8e8/0xc60 [zfs]
 [<ffffffffc041a30c>] dmu_tx_hold_write+0x3c/0x50 [zfs]
 [<ffffffffc04a27e8>] zrl_is_locked+0xa78/0x1880 [zfs]
 [<ffffffffc0292b66>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff9407bb10>] ? wake_up_state+0x20/0x20
 [<ffffffffc02929d0>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff940733e4>] kthread+0xc4/0xe0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
 [<ffffffff94427ec4>] ret_from_fork+0x74/0xa0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160

INFO: rcu_sched self-detected stall on CPU
    2: (20999 ticks this GP) idle=cbf/140000000000001/0 softirq=2798109/2798109
     (t=21000 jiffies g=1276523 c=1276522 q=5491)
Task dump for CPU 2:
zvol/26         R  running task    12768  2646      2 0x00000008
 06edf9e7cf55bbf4 ffff88009c204200 ffffffff945fa800 ffff88023ed03db8
 ffffffff9407b671 0000000000000002 ffffffff945fa800 ffff88023ed03dd0
 ffffffff9407dd04 0000000000000003 ffff88023ed03e00 ffffffff94098cc0
Call Trace:
 <IRQ>  [<ffffffff9407b671>] sched_show_task+0xc1/0x130
 [<ffffffff9407dd04>] dump_cpu_task+0x34/0x40
 [<ffffffff94098cc0>] rcu_dump_cpu_stacks+0x90/0xd0
 [<ffffffff9409c13c>] rcu_check_callbacks+0x44c/0x6d0
 [<ffffffff9407eaea>] ? account_system_time+0x8a/0x160
 [<ffffffff9409e883>] update_process_times+0x43/0x70
 [<ffffffff940ad331>] tick_sched_handle.isra.18+0x41/0x50
 [<ffffffff940ad379>] tick_sched_timer+0x39/0x60
 [<ffffffff9409eea1>] __run_hrtimer.isra.34+0x41/0xf0
 [<ffffffff9409f715>] hrtimer_interrupt+0xe5/0x220
 [<ffffffff940227e2>] local_apic_timer_interrupt+0x32/0x60
 [<ffffffff94022d9f>] smp_apic_timer_interrupt+0x3f/0x60
 [<ffffffff94428c7b>] apic_timer_interrupt+0x6b/0x70
 <EOI>  [<ffffffff94427a0e>] ? _raw_spin_lock+0x1e/0x30
 [<ffffffff94425f47>] __mutex_unlock_slowpath+0x17/0x40
 [<ffffffff94425f8d>] mutex_unlock+0x1d/0x20
 [<ffffffffc0405cb9>] dbuf_clear+0xd9/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc0405ca2>] dbuf_clear+0xc2/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc04a8f70>] ? dsl_dataset_get_holds+0x17b0/0x2fe1e [zfs]
 [<ffffffffc0405bd1>] dmu_buf_rele+0x21/0x30 [zfs]
 [<ffffffffc0419f58>] dmu_tx_assign+0x8e8/0xc60 [zfs]
 [<ffffffffc041a30c>] dmu_tx_hold_write+0x3c/0x50 [zfs]
 [<ffffffffc04a27e8>] zrl_is_locked+0xa78/0x1880 [zfs]
 [<ffffffffc0292b66>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff9407bb10>] ? wake_up_state+0x20/0x20
 [<ffffffffc02929d0>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff940733e4>] kthread+0xc4/0xe0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
 [<ffffffff94427ec4>] ret_from_fork+0x74/0xa0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160


general protection fault: 0000 [#4] SMP
CPU: 3 PID: 2625 Comm: zvol/5 Tainted:
task: ffff88009c1b1080 ti: ffff88009c1b1608 task.ti: ffff88009c1b1608
RIP: 0010:[<ffffffff94425f54>]  [<ffffffff94425f54>] __mutex_unlock_slowpath+0x24/0x40
RSP: 0000:ffffc90015cb3b78  EFLAGS: 00010283
RAX: fefefefefefefefe RBX: ffff8801e52ab1b0 RCX: ffff880225a820c0
RDX: ffff8801e52ab1b8 RSI: ffff8800734d6b90 RDI: ffff8801e52ab1b4
RBP: ffffc90015cb3b80 R08: 00000000000823d1 R09: 0000000000000000
R10: ffff880225ac1818 R11: 000000000000000e R12: 0000000000000002
R13: ffff880226537930 R14: ffff880225feaa68 R15: ffff880226537948
FS:  00007137d8f89740(0000) GS:ffff88023ed80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00006ff3987db000 CR3: 00000001dc357000 CR4: 00000000000407f0
Stack:
 ffff8801e52ab1b0 ffffc90015cb3b98 ffffffff94425f8d ffff8801e52ab158
 ffffc90015cb3bb8 ffffffffc04059d9 ffff8800734d6b90 ffff8801e52ab158
 ffffc90015cb3be8 ffffffffc0405ca2 ffff8800734d6b90 0000000000000000
Call Trace:
 [<ffffffff94425f8d>] mutex_unlock+0x1d/0x20
 [<ffffffffc04059d9>] dbuf_rele_and_unlock+0x179/0x350 [zfs]
 [<ffffffffc0405ca2>] dbuf_clear+0xc2/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc04a8f70>] ? dsl_dataset_get_holds+0x17b0/0x2fe1e [zfs]
 [<ffffffffc0405bd1>] dmu_buf_rele+0x21/0x30 [zfs]
 [<ffffffffc0419f58>] dmu_tx_assign+0x8e8/0xc60 [zfs]
 [<ffffffffc041a30c>] dmu_tx_hold_write+0x3c/0x50 [zfs]
 [<ffffffffc04a27e8>] zrl_is_locked+0xa78/0x1880 [zfs]
 [<ffffffffc0292b66>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff9407bb10>] ? wake_up_state+0x20/0x20
 [<ffffffffc02929d0>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff940733e4>] kthread+0xc4/0xe0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
 [<ffffffff94427ec4>] ret_from_fork+0x74/0xa0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
Code: 1f 84 00 00 00 00 00 55 48 89 e5 53 48 89 fb 48 8d 7b 04 c7 03 01 00 00 00 e8 a9 1a 00 00 48 8b 43 08 48 8d 53 08 48 39 d0 74 09 <48> 8b 78 10 e8 53 5b c5 ff 80 43 04 01 5b 5d c3 66 66 66 2e 0f
RIP  [<ffffffff94425f54>] __mutex_unlock_slowpath+0x24/0x40
 RSP <ffffc90015cb3b78>
---[ end trace 8fc20d6e09e2d611 ]---



general protection fault: 0000 [#3] SMP
CPU: 1 PID: 2623 Comm: zvol/3 Tainted:
task: ffff88009c1b0000 ti: ffff88009c1b0588 task.ti: ffff88009c1b0588
RIP: 0010:[<ffffffff94425f54>]  [<ffffffff94425f54>] __mutex_unlock_slowpath+0x24/0x40
RSP: 0000:ffffc90015ca3b78  EFLAGS: 00010287
RAX: fefefefefefefefe RBX: ffff8800934e22a8 RCX: ffff880225a820c0
RDX: ffff8800934e22b0 RSI: ffff8801e268dbc0 RDI: ffff8800934e22ac
RBP: ffffc90015ca3b80 R08: 000000000007e19b R09: 0000000000000000
R10: ffff880225ac1818 R11: 000000000000000e R12: 0000000000000002
R13: ffff880226537930 R14: ffff880225feaa68 R15: ffff880226537948
FS:  00007137d8f89740(0000) GS:ffff88023ec80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00006ff3987db000 CR3: 00000001dc357000 CR4: 00000000000407f0
Stack:
 ffff8800934e22a8 ffffc90015ca3b98 ffffffff94425f8d ffff8800934e2250
 ffffc90015ca3bb8 ffffffffc04059d9 ffff8801e268dbc0 ffff8800934e2250
 ffffc90015ca3be8 ffffffffc0405ca2 ffff8801e268dbc0 0000000000000000
Call Trace:
 [<ffffffff94425f8d>] mutex_unlock+0x1d/0x20
 [<ffffffffc04059d9>] dbuf_rele_and_unlock+0x179/0x350 [zfs]
 [<ffffffffc0405ca2>] dbuf_clear+0xc2/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc04a8f70>] ? dsl_dataset_get_holds+0x17b0/0x2fe1e [zfs]
 [<ffffffffc0405bd1>] dmu_buf_rele+0x21/0x30 [zfs]
 [<ffffffffc0419f58>] dmu_tx_assign+0x8e8/0xc60 [zfs]
 [<ffffffffc041a30c>] dmu_tx_hold_write+0x3c/0x50 [zfs]
 [<ffffffffc04a27e8>] zrl_is_locked+0xa78/0x1880 [zfs]
 [<ffffffffc0292b66>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff9407bb10>] ? wake_up_state+0x20/0x20
 [<ffffffffc02929d0>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff940733e4>] kthread+0xc4/0xe0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
 [<ffffffff94427ec4>] ret_from_fork+0x74/0xa0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
Code: 1f 84 00 00 00 00 00 55 48 89 e5 53 48 89 fb 48 8d 7b 04 c7 03 01 00 00 00 e8 a9 1a 00 00 48 8b 43 08 48 8d 53 08 48 39 d0 74 09 <48> 8b 78 10 e8 53 5b c5 ff 80 43 04 01 5b 5d c3 66 66 66 2e 0f
RIP  [<ffffffff94425f54>] __mutex_unlock_slowpath+0x24/0x40
 RSP <ffffc90015ca3b78>
---[ end trace 8fc20d6e09e2d610 ]---



This is what happened:

Installed spl-0.6.3-r1/zfs-kmod-0.6.3-r1. Then upgraded to kernel 3.17.

During this upgrade I decided to forgo low-latency preemption and go with voluntary preemption or no preemption. Perhaps that doesn't matter, because kernel 3.16 was fully preemptible and never segfaulted or oopsed.

Upon seeing that scary fault and reading "rcu_sched self-detected stall on CPU" in dmesg, I backtracked as fast as possible.

I thought it had to be related to RCU and preemption, since the RCU options in the kernel change according to the preemption model.

But the problem persists )':

This zvol is formatted with ext4.

Anybody know if this is fixable with the proper RCU option? To be honest, I don't know how best to configure the RCU options for ZFS.

thanks