Gentoo Forums
Gentoo on ZFS

kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

PostPosted: Fri Dec 12, 2014 12:07 am    Post subject: Reply with quote

@WWWW:

that's a scheduler- and/or RCU-related problem in the 3.17, 3.18, and 3.19-rc* kernels.

I got it randomly for a short time upon migration from 3.16* to 3.17*,

but haven't seen it since switching back to BFS after a short stint on CFS.

It's not related to ZFSOnLinux.


http://marc.info/?l=linux-kernel&m=141825835003708&w=2


you could try the following patch: https://lkml.org/lkml/2014/3/30/7 ([PATCH] sched: update_rq_clock() must skip ONE update),

but I'm not sure that'll help on its own.
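
in case it helps, applying it goes roughly like this (a sketch - save the patch text from the LKML post to a file first; the filename here is made up):

Code:

cd /usr/src/linux
patch -p1 --dry-run < ~/sched-skip-one-update.patch   # check that it applies cleanly
patch -p1 < ~/sched-skip-one-update.patch             # apply for real, then rebuild the kernel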

I've meanwhile added several patches on top of my 3.17*-based kernel (rather randomly ;) ) as a prophylaxis, to prevent that issue from happening in the first place;

perhaps one of those prevents it (like I wrote, I luckily didn't see that message again)

repo is at: https://github.com/kernelOfTruth/linux/commits/linux-3.17.6-plus-v4_btrfs-old_14
if you're curious


one odd thing that also happened from time to time was that OpenRC services (e.g. dmcrypt/LUKS) wouldn't launch and it would plainly boot to a root prompt

other times it would boot properly but had issues during login (it would not log in and only a reboot would help) - that could have been related to the underlying Btrfs filesystem on root - or not

but I suspect that it might also be related to this racing/RCU-stall problem

it could be that ZFS (SPL) triggers it - during the short time I didn't have ZFS installed, I didn't see it, if I remember correctly


edit:

this is what I got:

zcat /proc/config.gz | grep -i RCU wrote:
kernel config

Code:

# RCU Subsystem
CONFIG_TREE_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_FANOUT=32
CONFIG_RCU_FANOUT_LEAF=4
# CONFIG_RCU_FANOUT_EXACT is not set
CONFIG_RCU_FAST_NO_HZ=y
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_RCU_BOOST=y
CONFIG_RCU_BOOST_PRIO=40
CONFIG_RCU_BOOST_DELAY=331
# RCU Debugging
# CONFIG_SPARSE_RCU_POINTER is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=32
CONFIG_RCU_CPU_STALL_VERBOSE=y
CONFIG_RCU_CPU_STALL_INFO=y
# CONFIG_RCU_TRACE is not set

_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Mon Dec 15, 2014 1:54 pm    Post subject: Reply with quote

Thank you, trying that now.

The freaking memory bug is really annoying; it defeats the purpose of using ZFS to begin with.

Apparently a fix is not scheduled until 0.6.4, which will also use Linux's memory management.

Changing topic: kernel 3.19 introduces blk-mq. This looks interesting in conjunction with a ZIL/L2ARC cache on SSD.

I haven't tried any caching yet with ZFS but I'd like to test it. There must be something to it, because even LVM is adopting a caching mechanism on SSD. And I believe Btrfs already has one.

Will blk-mq improve ZFS cache performance?

thanks.
ryao
Retired Dev


Joined: 27 Feb 2012
Posts: 132

PostPosted: Mon Feb 09, 2015 10:21 pm    Post subject: Reply with quote

WWWW wrote:
Oh man!! I got one even scarier!!

Code:

INFO: rcu_sched self-detected stall on CPU
    2: (84003 ticks this GP) idle=cbf/140000000000001/0 softirq=2798109/2798109
     (t=84004 jiffies g=1276523 c=1276522 q=18648)
Task dump for CPU 2:
zvol/26         R  running task    12768  2646      2 0x00000008
 06edf9e7cf55bbf4 ffff88009c204200 ffffffff945fa800 ffff88023ed03db8
 ffffffff9407b671 0000000000000002 ffffffff945fa800 ffff88023ed03dd0
 ffffffff9407dd04 0000000000000003 ffff88023ed03e00 ffffffff94098cc0
Call Trace:
 <IRQ>  [<ffffffff9407b671>] sched_show_task+0xc1/0x130
 [<ffffffff9407dd04>] dump_cpu_task+0x34/0x40
 [<ffffffff94098cc0>] rcu_dump_cpu_stacks+0x90/0xd0
 [<ffffffff9409c13c>] rcu_check_callbacks+0x44c/0x6d0
 [<ffffffff9407eaea>] ? account_system_time+0x8a/0x160
 [<ffffffff9409e883>] update_process_times+0x43/0x70
 [<ffffffff940ad331>] tick_sched_handle.isra.18+0x41/0x50
 [<ffffffff940ad379>] tick_sched_timer+0x39/0x60
 [<ffffffff9409eea1>] __run_hrtimer.isra.34+0x41/0xf0
 [<ffffffff9409f715>] hrtimer_interrupt+0xe5/0x220
 [<ffffffff940227e2>] local_apic_timer_interrupt+0x32/0x60
 [<ffffffff94022d9f>] smp_apic_timer_interrupt+0x3f/0x60
 [<ffffffff94428c7b>] apic_timer_interrupt+0x6b/0x70
 <EOI>  [<ffffffff94427a0e>] ? _raw_spin_lock+0x1e/0x30
 [<ffffffff94425f47>] __mutex_unlock_slowpath+0x17/0x40
 [<ffffffff94425f8d>] mutex_unlock+0x1d/0x20
 [<ffffffffc0405cb9>] dbuf_clear+0xd9/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc0405ca2>] dbuf_clear+0xc2/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc04a8f70>] ? dsl_dataset_get_holds+0x17b0/0x2fe1e [zfs]
 [<ffffffffc0405bd1>] dmu_buf_rele+0x21/0x30 [zfs]
 [<ffffffffc0419f58>] dmu_tx_assign+0x8e8/0xc60 [zfs]
 [<ffffffffc041a30c>] dmu_tx_hold_write+0x3c/0x50 [zfs]
 [<ffffffffc04a27e8>] zrl_is_locked+0xa78/0x1880 [zfs]
 [<ffffffffc0292b66>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff9407bb10>] ? wake_up_state+0x20/0x20
 [<ffffffffc02929d0>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff940733e4>] kthread+0xc4/0xe0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
 [<ffffffff94427ec4>] ret_from_fork+0x74/0xa0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160

INFO: rcu_sched self-detected stall on CPU
    2: (20999 ticks this GP) idle=cbf/140000000000001/0 softirq=2798109/2798109
     (t=21000 jiffies g=1276523 c=1276522 q=5491)
Task dump for CPU 2:
zvol/26         R  running task    12768  2646      2 0x00000008
 06edf9e7cf55bbf4 ffff88009c204200 ffffffff945fa800 ffff88023ed03db8
 ffffffff9407b671 0000000000000002 ffffffff945fa800 ffff88023ed03dd0
 ffffffff9407dd04 0000000000000003 ffff88023ed03e00 ffffffff94098cc0
Call Trace:
 <IRQ>  [<ffffffff9407b671>] sched_show_task+0xc1/0x130
 [<ffffffff9407dd04>] dump_cpu_task+0x34/0x40
 [<ffffffff94098cc0>] rcu_dump_cpu_stacks+0x90/0xd0
 [<ffffffff9409c13c>] rcu_check_callbacks+0x44c/0x6d0
 [<ffffffff9407eaea>] ? account_system_time+0x8a/0x160
 [<ffffffff9409e883>] update_process_times+0x43/0x70
 [<ffffffff940ad331>] tick_sched_handle.isra.18+0x41/0x50
 [<ffffffff940ad379>] tick_sched_timer+0x39/0x60
 [<ffffffff9409eea1>] __run_hrtimer.isra.34+0x41/0xf0
 [<ffffffff9409f715>] hrtimer_interrupt+0xe5/0x220
 [<ffffffff940227e2>] local_apic_timer_interrupt+0x32/0x60
 [<ffffffff94022d9f>] smp_apic_timer_interrupt+0x3f/0x60
 [<ffffffff94428c7b>] apic_timer_interrupt+0x6b/0x70
 <EOI>  [<ffffffff94427a0e>] ? _raw_spin_lock+0x1e/0x30
 [<ffffffff94425f47>] __mutex_unlock_slowpath+0x17/0x40
 [<ffffffff94425f8d>] mutex_unlock+0x1d/0x20
 [<ffffffffc0405cb9>] dbuf_clear+0xd9/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc0405ca2>] dbuf_clear+0xc2/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc04a8f70>] ? dsl_dataset_get_holds+0x17b0/0x2fe1e [zfs]
 [<ffffffffc0405bd1>] dmu_buf_rele+0x21/0x30 [zfs]
 [<ffffffffc0419f58>] dmu_tx_assign+0x8e8/0xc60 [zfs]
 [<ffffffffc041a30c>] dmu_tx_hold_write+0x3c/0x50 [zfs]
 [<ffffffffc04a27e8>] zrl_is_locked+0xa78/0x1880 [zfs]
 [<ffffffffc0292b66>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff9407bb10>] ? wake_up_state+0x20/0x20
 [<ffffffffc02929d0>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff940733e4>] kthread+0xc4/0xe0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
 [<ffffffff94427ec4>] ret_from_fork+0x74/0xa0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160


general protection fault: 0000 [#4] SMP
CPU: 3 PID: 2625 Comm: zvol/5 Tainted:
task: ffff88009c1b1080 ti: ffff88009c1b1608 task.ti: ffff88009c1b1608
RIP: 0010:[<ffffffff94425f54>]  [<ffffffff94425f54>] __mutex_unlock_slowpath+0x24/0x40
RSP: 0000:ffffc90015cb3b78  EFLAGS: 00010283
RAX: fefefefefefefefe RBX: ffff8801e52ab1b0 RCX: ffff880225a820c0
RDX: ffff8801e52ab1b8 RSI: ffff8800734d6b90 RDI: ffff8801e52ab1b4
RBP: ffffc90015cb3b80 R08: 00000000000823d1 R09: 0000000000000000
R10: ffff880225ac1818 R11: 000000000000000e R12: 0000000000000002
R13: ffff880226537930 R14: ffff880225feaa68 R15: ffff880226537948
FS:  00007137d8f89740(0000) GS:ffff88023ed80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00006ff3987db000 CR3: 00000001dc357000 CR4: 00000000000407f0
Stack:
 ffff8801e52ab1b0 ffffc90015cb3b98 ffffffff94425f8d ffff8801e52ab158
 ffffc90015cb3bb8 ffffffffc04059d9 ffff8800734d6b90 ffff8801e52ab158
 ffffc90015cb3be8 ffffffffc0405ca2 ffff8800734d6b90 0000000000000000
Call Trace:
 [<ffffffff94425f8d>] mutex_unlock+0x1d/0x20
 [<ffffffffc04059d9>] dbuf_rele_and_unlock+0x179/0x350 [zfs]
 [<ffffffffc0405ca2>] dbuf_clear+0xc2/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc04a8f70>] ? dsl_dataset_get_holds+0x17b0/0x2fe1e [zfs]
 [<ffffffffc0405bd1>] dmu_buf_rele+0x21/0x30 [zfs]
 [<ffffffffc0419f58>] dmu_tx_assign+0x8e8/0xc60 [zfs]
 [<ffffffffc041a30c>] dmu_tx_hold_write+0x3c/0x50 [zfs]
 [<ffffffffc04a27e8>] zrl_is_locked+0xa78/0x1880 [zfs]
 [<ffffffffc0292b66>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff9407bb10>] ? wake_up_state+0x20/0x20
 [<ffffffffc02929d0>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff940733e4>] kthread+0xc4/0xe0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
 [<ffffffff94427ec4>] ret_from_fork+0x74/0xa0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
Code: 1f 84 00 00 00 00 00 55 48 89 e5 53 48 89 fb 48 8d 7b 04 c7 03 01 00 00 00 e8 a9 1a 00 00 48 8b 43 08 48 8d 53 08 48 39 d0 74 09 <48> 8b 78 10 e8 53 5b c5 ff 80 43 04 01 5b 5d c3 66 66 66 2e 0f
RIP  [<ffffffff94425f54>] __mutex_unlock_slowpath+0x24/0x40
 RSP <ffffc90015cb3b78>
---[ end trace 8fc20d6e09e2d611 ]---



general protection fault: 0000 [#3] SMP
CPU: 1 PID: 2623 Comm: zvol/3 Tainted:
task: ffff88009c1b0000 ti: ffff88009c1b0588 task.ti: ffff88009c1b0588
RIP: 0010:[<ffffffff94425f54>]  [<ffffffff94425f54>] __mutex_unlock_slowpath+0x24/0x40
RSP: 0000:ffffc90015ca3b78  EFLAGS: 00010287
RAX: fefefefefefefefe RBX: ffff8800934e22a8 RCX: ffff880225a820c0
RDX: ffff8800934e22b0 RSI: ffff8801e268dbc0 RDI: ffff8800934e22ac
RBP: ffffc90015ca3b80 R08: 000000000007e19b R09: 0000000000000000
R10: ffff880225ac1818 R11: 000000000000000e R12: 0000000000000002
R13: ffff880226537930 R14: ffff880225feaa68 R15: ffff880226537948
FS:  00007137d8f89740(0000) GS:ffff88023ec80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00006ff3987db000 CR3: 00000001dc357000 CR4: 00000000000407f0
Stack:
 ffff8800934e22a8 ffffc90015ca3b98 ffffffff94425f8d ffff8800934e2250
 ffffc90015ca3bb8 ffffffffc04059d9 ffff8801e268dbc0 ffff8800934e2250
 ffffc90015ca3be8 ffffffffc0405ca2 ffff8801e268dbc0 0000000000000000
Call Trace:
 [<ffffffff94425f8d>] mutex_unlock+0x1d/0x20
 [<ffffffffc04059d9>] dbuf_rele_and_unlock+0x179/0x350 [zfs]
 [<ffffffffc0405ca2>] dbuf_clear+0xc2/0x160 [zfs]
 [<ffffffffc0405d50>] dbuf_evict+0x10/0x400 [zfs]
 [<ffffffffc0405911>] dbuf_rele_and_unlock+0xb1/0x350 [zfs]
 [<ffffffffc04a8f70>] ? dsl_dataset_get_holds+0x17b0/0x2fe1e [zfs]
 [<ffffffffc0405bd1>] dmu_buf_rele+0x21/0x30 [zfs]
 [<ffffffffc0419f58>] dmu_tx_assign+0x8e8/0xc60 [zfs]
 [<ffffffffc041a30c>] dmu_tx_hold_write+0x3c/0x50 [zfs]
 [<ffffffffc04a27e8>] zrl_is_locked+0xa78/0x1880 [zfs]
 [<ffffffffc0292b66>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff9407bb10>] ? wake_up_state+0x20/0x20
 [<ffffffffc02929d0>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff940733e4>] kthread+0xc4/0xe0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
 [<ffffffff94427ec4>] ret_from_fork+0x74/0xa0
 [<ffffffff94073320>] ? kthread_create_on_node+0x160/0x160
Code: 1f 84 00 00 00 00 00 55 48 89 e5 53 48 89 fb 48 8d 7b 04 c7 03 01 00 00 00 e8 a9 1a 00 00 48 8b 43 08 48 8d 53 08 48 39 d0 74 09 <48> 8b 78 10 e8 53 5b c5 ff 80 43 04 01 5b 5d c3 66 66 66 2e 0f
RIP  [<ffffffff94425f54>] __mutex_unlock_slowpath+0x24/0x40
 RSP <ffffc90015ca3b78>
---[ end trace 8fc20d6e09e2d610 ]---



This is what happened:

Installed spl-0.6.3-r1/zfs-kmod-0.6.3-r1. Then upgraded to kernel 3.17.

During this upgrade I decided to forgo low-latency pre-emption and went with voluntary pre-emption or no pre-emption. Perhaps it doesn't matter, because kernel 3.16 was fully pre-emptible and never segfaulted or oopsed.

Upon seeing that scary segfault and reading "rcu_sched self-detected stall on CPU" in dmesg, I backtracked as fast as possible.

I thought it had to be related to RCU and pre-emption, since the RCU options in the kernel change according to the pre-emption model.

But the problem persists )':

This zvol is formatted with ext4.

Does anybody know if this is fixable with the proper RCU options? To be honest, I don't know how best to configure the RCU options for ZFS.

thanks


Is this a multisocket SMP system? If so, this is likely the Linux kernel bug described here:

https://github.com/zfsonlinux/zfs/issues/3091

An earlier version of this post included a proposed patch based on a theory that I had. Unfortunately, I am told that the proposed patch doesn't fix this problem.
taozhijiang
n00b


Joined: 26 Mar 2015
Posts: 5

PostPosted: Fri Mar 27, 2015 2:14 am    Post subject: Reply with quote

ryao wrote:
Is this a multisocket SMP system? If so, this is likely the Linux kernel bug described here:

https://github.com/zfsonlinux/zfs/issues/3091

An earlier version of this post included a proposed patch based on a theory that I had. Unfortunately, I am told that the proposed patch doesn't fix this problem.


Hi, ryao

I can only use the 3.17 kernel, because SPL only supports up to 3.17.
Is SPL still under active development? Is there an upgrade schedule?

In addition, why do you encourage using vanilla-sources instead of gentoo-sources? Does gentoo-sources introduce some unstable factors?
kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

PostPosted: Fri Mar 27, 2015 2:28 am    Post subject: Reply with quote

taozhijiang,

if you can - what stops you from using the live ebuilds (9999)?


There's active development in both SPL & ZFS:

https://github.com/zfsonlinux/spl/commits/master

https://github.com/zfsonlinux/zfs/commits/master


and there are open pull requests being worked on:

https://github.com/zfsonlinux/spl/pulls

https://github.com/zfsonlinux/zfs/pulls
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
taozhijiang
n00b


Joined: 26 Mar 2015
Posts: 5

PostPosted: Fri Mar 27, 2015 6:28 am    Post subject: Reply with quote

kernelOfTruth wrote:
taozhijiang,

if you can - what stops you from using the live ebuilds (9999)?


There's active development in both SPL & ZFS:

https://github.com/zfsonlinux/spl/commits/master

https://github.com/zfsonlinux/zfs/commits/master


and there are open pull requests being worked on:

https://github.com/zfsonlinux/spl/pulls

https://github.com/zfsonlinux/zfs/pulls


Because what I care about most is stability.
Besides, a file system is much more important than ordinary applications; I cannot bear the loss of data.

Can I tell you I don't even enable "~amd64" globally? :D :D :D
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Sat Mar 28, 2015 7:53 pm    Post subject: Reply with quote

I have a question:

When ZFS memory management is finally merged with Linux's, will ZFS be able to keep up in tandem with kernel releases?

The last stable release is stuck at kernel 3.17 support while the current kernel is 4.0.

I believe the memory-management fixes are in the ZFS 0.6.4 release, which looks very close to release.

So the BFS scheduler seems to be a fix? Too bad that hardened-sources doesn't have BFS )-:

ryao wrote:
Is this a multisocket SMP system? If so, this is likely the Linux kernel bug described here:

https://github.com/zfsonlinux/zfs/issues/3091

An earlier version of this post included a proposed patch based on a theory that I had. Unfortunately, I am told that the proposed patch doesn't fix this problem.

No, it's not a multi-socket system.


Oh, I see - so the ZFS/kernel memory-management work is the kmem-rework merge? By the looks of the discussion it seems like a tough nut to crack.

I wish I could help fix things; it looks like fun. What kind of skills are needed for ZFS in particular?

thanks.
kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

PostPosted: Sun Mar 29, 2015 5:21 pm    Post subject: Reply with quote

it really seems to be a PITA:

https://github.com/zfsonlinux/zfs/pull/3225#issuecomment-87340779


I'd say:

if you're somewhat experienced - start by taking a look at issues and/or pull requests and help cross-reference related problems.



I was a mere ZFS user for some time (using new features & pull requests); then suddenly - during experimentation (testing new stuff, looking into things) - I wound up merging two of the new building blocks together

(ABD: linear/scatter dual typed buffer for ARC https://github.com/zfsonlinux/zfs/pull/2129 - Lock contention on arcs mtx https://github.com/zfsonlinux/zfs/pull/3115)

and it worked out nicely :D


Just start at whatever level you're currently comfortable with, and you'll gradually realize that you can contribute more and more.
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Fri Apr 10, 2015 4:54 pm    Post subject: Reply with quote

wowowow, ZFS 0.6.4 released today!!

gentoo devs sleeping again??

c'mon - memory-management integration, latest kernel support, etc...

how long until it hits Portage???
peje
Tux's lil' helper


Joined: 11 Jan 2003
Posts: 100

PostPosted: Fri Apr 10, 2015 7:11 pm    Post subject: Reply with quote

@WWWW please think about your words:
Quote:
wowowow, ZFS 0.6.4 released today!!

gentoo devs sleeping again??

you are not a customer who pays for a product...
cu peje
kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

PostPosted: Fri Apr 10, 2015 8:28 pm    Post subject: Reply with quote

https://bugs.gentoo.org/show_bug.cgi?id=546112


Give Richard some leeway - he lives in a different timezone and works on ZFS all the time anyway.

Do you pay or donate to him?

This distribution maintainership is on a voluntary basis, after all.


I'm sure you've already tried modifying the ebuilds in your local overlay?

Or if you're proficient with dev-vcs/git - just edit the live ebuilds to point to the 0.6.4 branches
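
roughly something like this (an untested sketch - the overlay path and branch name are placeholders, and sys-fs/spl needs the same treatment):

Code:

mkdir -p /usr/local/portage/sys-fs/zfs
cp /usr/portage/sys-fs/zfs/zfs-9999.ebuild /usr/local/portage/sys-fs/zfs/
# point the live ebuild at a 0.6.4 branch (the variable name depends on the git eclass in use)
sed -i 's|^EGIT_BRANCH=.*|EGIT_BRANCH="my-0.6.4-branch"|' /usr/local/portage/sys-fs/zfs/zfs-9999.ebuild
ebuild /usr/local/portage/sys-fs/zfs/zfs-9999.ebuild manifest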

edit:

just take a look at the current state of the zfs & spl master repositories:

as long as they're at the "Tag zfs-0.6.4" or "Tag spl-0.6.4" commits, you're fine to emerge via the live ebuild:

https://github.com/zfsonlinux/zfs/commits/master
https://github.com/zfsonlinux/spl/commits/master
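
if you haven't keyworded the live ebuilds yet, it's something like this (a sketch - the "**" keyword is what accepts live ebuilds):

Code:

echo "sys-fs/spl **"      >> /etc/portage/package.accept_keywords
echo "sys-fs/zfs-kmod **" >> /etc/portage/package.accept_keywords
echo "sys-fs/zfs **"      >> /etc/portage/package.accept_keywords
emerge -1av sys-fs/spl sys-fs/zfs-kmod sys-fs/zfs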


I've also created non-moving 0.6.4 branches in my repository, in case you need them:

https://github.com/kernelOfTruth/zfs/commits/zfs_master_09.04.2015_0.6.4
https://github.com/kernelOfTruth/spl/commits/spl_master_09.04.2015_0.6.4
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Sun Apr 12, 2015 1:05 pm    Post subject: Reply with quote

Available in bliss-overlay.

Thank you fearedbliss.
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Sun Apr 12, 2015 1:23 pm    Post subject: Reply with quote

I've got / on zfs. Is zpool upgrade going to break my system?
peje
Tux's lil' helper


Joined: 11 Jan 2003
Posts: 100

PostPosted: Sun Apr 12, 2015 2:20 pm    Post subject: Reply with quote

@mrbassie if you want to be 100% sure that you don't lose anything, take a recursive snapshot of your root pool. (And store it elsewhere.)
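
Something like this (a sketch - the pool names are placeholders):

Code:

zfs snapshot -r rpool@pre-zpool-upgrade
# send the whole snapshot tree somewhere else, e.g. to another pool or machine
zfs send -R rpool@pre-zpool-upgrade | zfs recv -d backup/rpool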
cu Peje
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Sun Apr 12, 2015 2:30 pm    Post subject: Reply with quote

peje wrote:
@mrbassie if you want to be 100% sure that you don't lose anything, take a recursive snapshot of your root pool. (And store it elsewhere.)
cu Peje

I'm not really bothered too much if it breaks; I guess what I'm really asking is whether zpool upgrade is supported in the Linux port yet(?)

If there's any chance it'll go wrong, I'll make a stage4. The pool was created with 0.6.2-r5.

EDIT: Upgraded the pool last night; it worked fine. Maybe it's a placebo effect, but my system feels a little snappier. Hopefully the occasional lockups I was experiencing disappear.


kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

PostPosted: Mon Apr 13, 2015 11:07 pm    Post subject: Reply with quote

:idea:

New SRM modules (v0.6.4) for SystemRescueCD 4.5.2 are available,

in case you need to do maintenance on your ZFS zpools:



https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Tue Apr 21, 2015 9:12 am    Post subject: Reply with quote

hello,

I am running out of patience already. Is the 0.6.4 version in the bliss overlay SAFE to use??

thanks!
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Tue Apr 21, 2015 9:58 am    Post subject: Reply with quote

WWWW wrote:
hello,

I am running out of patience already. Is the 0.6.4 version in the bliss overlay SAFE to use??

thanks!


It seems at least as stable (so far) as previous versions I've used; I only have it on home PCs, though.
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Mon Apr 27, 2015 10:00 am    Post subject: Reply with quote

Has anybody got a recommendation for an inexpensive SSD to use as a ZFS cache/dedup device?

I've got a lower-end SanDisk, but I'm not sure whether it's recommended, and it isn't AF (4K format).

thanks.
ryao
Retired Dev


Joined: 27 Feb 2012
Posts: 132

PostPosted: Mon Apr 27, 2015 2:05 pm    Post subject: Reply with quote

WWWW wrote:
hello,

I am running out of patience already. Is the 0.6.4 version in the bliss overlay SAFE to use??

thanks!


My apologies. I had held back 0.6.4 because I wanted GRUB2 support to be updated before it went into the tree, but time constraints on my end kept me from doing that in a timely fashion. I have decided to relax my requirement that GRUB2 be ready by going with an ewarn instead. 0.6.4 is now in the tree and people should be able to fetch it from the mirrors.

That said, my updates have not been as frequent as they used to be. I took a new job in August and have been working on a stable /dev/zfs API that I believe will eventually allow us to mix and match kernel modules with userland tools. Much of my time for ZFS has been spent on that. An initial pull request for review became public last week:

https://github.com/zfsonlinux/zfs/pull/3299

I highly recommend against deploying it in production at this time. It is meant solely for developer review, and more needs to be done to ensure that the userland tools and kernel modules can be freely mixed. However, I am confident that this will be in place later this year, and consequently my time should become more available for shorter-term priorities in the packaging, e.g. supporting newer kernels as soon as they become available.

WWWW wrote:
Has anybody got a recommendation for an inexpensive SSD to use as a ZFS cache/dedup device?

I've got a lower-end SanDisk, but I'm not sure whether it's recommended, and it isn't AF (4K format).

thanks.


For L2ARC, anything recent should be fine.
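
Attaching one is a single command - a sketch with placeholder pool/device names (use stable /dev/disk/by-id/ paths):

Code:

zpool add tank cache /dev/disk/by-id/ata-SanDisk_SSD_XXXXXXXX
# a separate ZIL (SLOG) device would be:
zpool add tank log /dev/disk/by-id/ata-Other_SSD_YYYYYYYY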
steveL
Watchman


Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

PostPosted: Mon Apr 27, 2015 3:14 pm    Post subject: Reply with quote

ryao wrote:
My apologies. I had held back 0.6.4 because I wanted GRUB2 support to be updated before it went into the tree, but time constraints on my end kept me from doing that in a timely fashion. I have decided to relax my requirement that GRUB2 be ready by going with an ewarn instead. 0.6.4 is now in the tree and people should be able to fetch it from the mirrors.

That said, my updates have not been as frequent as they used to be. I took a new job in August and have been working on a stable /dev/zfs API that I believe will eventually allow us to mix and match kernel modules with userland tools. Much of my time for ZFS has been spent on that. An initial pull request for review became public last week:

https://github.com/zfsonlinux/zfs/pull/3299

I highly recommend against deploying it in production at this time. It is meant solely for developer review, and more needs to be done to ensure that the userland tools and kernel modules can be freely mixed. However, I am confident that this will be in place later this year, and consequently my time should become more available for shorter-term priorities in the packaging, e.g. supporting newer kernels as soon as they become available.

Thank you for showing such professionalism.

WRT packaging, users like KoT can help (he's done it before.. ;) ) with others doing the bug-wrangling.
Yamakuzure
Advocate


Joined: 21 Jun 2006
Posts: 2280
Location: Adendorf, Germany

PostPosted: Tue Apr 28, 2015 1:02 pm    Post subject: Reply with quote

@ryao : Thank you very much! And don't worry, your work on ZFS is still awesome!
_________________
Important German:
  1. "Aha" - German reaction to pretend that you are really interested while giving no f*ck.
  2. "Tja" - German reaction to the apocalypse, nuclear war, an alien invasion or no bread in the house.
Yamakuzure
Advocate


Joined: 21 Jun 2006
Posts: 2280
Location: Adendorf, Germany

PostPosted: Tue Apr 28, 2015 2:39 pm    Post subject: Reply with quote

ryao wrote:
I have decided to relax my requirement that GRUB2 be ready by going with an ewarn instead. 0.6.4 is now in the tree and people should be able to fetch it from the mirrors.
I have just upgraded and have a question about that ewarn:
Code:
 * Messages for package sys-fs/zfs-kmod-0.6.4:

 * This version of ZFSOnLinux includes support for new feature flags
 * that are incompatible with ZFSOnLinux 0.6.3 and GRUB2 support for
 * /boot with the new feature flags is not yet available.
 * Do *NOT* upgrade root pools to use the new feature flags.
 * Any new pools will be created with the new feature flags by default
 * and will not be compatible with older versions of ZFSOnLinux. To
 * create a new pool that is backward compatible, use
 * zpool create -o version=28 ...
 * Then explicitly enable older features. Note that the LZ4 feature has
 * been upgraded to support metadata compression and has not been
 * tested against the older GRUB2 code base. GRUB2 support will be
 * updated as soon as the GRUB2 developers and Open ZFS community write
 * GRUB2 patches that pass mutual review.
This means that GRUB2 will not be able to boot a system if /boot lies on a ZFS pool, right? So if I have /boot on a separate ext4 partition, I should be safe to upgrade, even if it is mounted under a root file system on ZFS?

Or does it affect anything ZFS-related in booting, so that I should not upgrade yet if my grub.cfg reads something like:
Code:
linux   /kernel-3.17.8-geek root=ZFS=gpool/ROOT/system ro dozfs (...)
?
_________________
Important German:
  1. "Aha" - German reaction to pretend that you are really interested while giving no f*ck.
  2. "Tja" - German reaction to the apocalypse, nuclear war, an alien invasion or no bread in the house.
mrbassie
l33t


Joined: 31 May 2013
Posts: 771
Location: over here

PostPosted: Tue Apr 28, 2015 4:07 pm    Post subject: Reply with quote

Yamakuzure wrote:
This means that GRUB2 will not be able to boot a system if /boot lies on a ZFS pool, right? So if I have /boot on a separate ext4 partition, I should be safe to upgrade, even if it is mounted under a root file system on ZFS?

Or does it affect anything ZFS-related in booting, so that I should not upgrade yet if my grub.cfg reads something like:
Code:
linux   /kernel-3.17.8-geek root=ZFS=gpool/ROOT/system ro dozfs (...)
?


I have /boot on a separate partition (ext2); I upgraded to 0.6.4, ran zpool upgrade, and it's all fine.
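
If you want to check which feature flags are enabled or active on a pool before deciding, something like this works (the pool name is a placeholder):

Code:

zpool get all rpool | grep feature@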
WWWW
Tux's lil' helper


Joined: 30 Nov 2014
Posts: 143

PostPosted: Tue Apr 28, 2015 4:27 pm    Post subject: Reply with quote

Thank you for the work.

Is the official 0.6.4 too far behind 0.6.4.1?

Should I try 0.6.4.1 from the bliss overlay?

I went through all the trouble of setting up layman and such. Now I can choose the official 0.6.4 or the unofficial 0.6.4.1.

There seem to be important fixes:

Code:

-  Fixed io-spare.sh script for ZED.
-  Fixed multiple deadlocks which might occur when reclaiming memory.
-  Fixed excessive CPU usage for meta data heavy workloads when reclaiming the ARC.


Deadlocking with memory seems like a delicate issue.

thanks again.

P.S.: Oh, packaging - that looks like an easy task. Is it possible to help with packaging? How do I start helping?
Goto page Previous  1, 2, 3, 4, 5, 6, 7, 8  Next
Page 5 of 8