Gentoo Forums
ZFS issues on my host
mattlqx
n00b


Joined: 28 Oct 2012
Posts: 3
Location: San Jose, CA

Posted: Sun Oct 28, 2012 5:06 pm    Post subject: ZFS issues on my host

I have a pool with a fair number of filesystems and snapshots on it (about 15 filesystems with about 20 snapshots each). At some point, the pool seems to have gotten itself into a funky state. Yesterday, I powered down the system to install a hard drive (I didn't touch any devices in the zpool), and now any zfs or zpool command I run blocks indefinitely until the host freezes. It takes a few hours for the host to crash. The odd part is that as soon as the pool gets touched, there is a trickle of IOPS that seemingly pushes the drive to 100% utilization according to iostat.

Code:
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00  158.00    0.00   349.00     0.00     4.42     2.94   18.68   18.68    0.00   6.33 100.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-4              0.00     0.00  158.00    0.00   349.50     0.00     4.42     2.94   18.68   18.68    0.00   6.33 100.00
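
As a rough sanity check on the iostat row for sda above, the arithmetic can be sketched as follows. The numbers are copied from the output; the 15000 kB/s figure is the throughput the poster reports as normal for this device, and the 512-byte sector size used to convert avgrq-sz is the unit iostat documents for that column. This is only an illustration of how to read the numbers, not a diagnosis.

```python
# Figures copied from the sda row of the iostat output in the post.
reads_per_sec = 158.00            # r/s
read_kb_per_sec = 349.00          # rkB/s
avg_request_sectors = 4.42        # avgrq-sz, in 512-byte sectors
normal_read_kb_per_sec = 15000.0  # throughput the poster calls normal

# Average size of one read, derived two independent ways.
avg_read_kb = read_kb_per_sec / reads_per_sec
avg_request_kb = avg_request_sectors * 512 / 1024

# Share of the device's usual sequential throughput actually achieved.
fraction_of_normal = read_kb_per_sec / normal_read_kb_per_sec

print(f"average read size (rkB/s / r/s): {avg_read_kb:.2f} kB")
print(f"average request size (avgrq-sz): {avg_request_kb:.2f} kB")
print(f"fraction of normal throughput:   {fraction_of_normal:.1%}")
```

Both derivations give reads of roughly 2 kB each, which suggests the device is busy with many small reads rather than streaming data; that would be consistent with 100% utilization at a small fraction of the usual throughput.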


The device can normally read at about 15000 rkB/s, so I'm not sure what zfs is doing here, and I don't know how to get more information about it. Processes in top look like this:

Code:
4173 root      39  19     0    0    0 D    1  0.0   0:41.25 z_fr_iss/1
4172 root      39  19     0    0    0 R    1  0.0   0:41.05 z_fr_iss/0
4174 root      39  19     0    0    0 D    1  0.0   0:37.24 z_fr_iss/2
4140 root      39  19     0    0    0 S    1  0.0   0:23.73 z_rd_int/0
4141 root      39  19     0    0    0 R    1  0.0   0:23.82 z_rd_int/1
4142 root      39  19     0    0    0 S    1  0.0   0:22.21 z_rd_int/2
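
One way to pull more information out of a listing like the one above is to pick out the threads in state D (uninterruptible sleep, usually waiting on I/O). The sketch below assumes the default top column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND), which the post does not show explicitly; the function name `blocked_tasks` is made up for this example.

```python
# Rows copied from the top output in the post.
TOP_OUTPUT = """\
4173 root      39  19     0    0    0 D    1  0.0   0:41.25 z_fr_iss/1
4172 root      39  19     0    0    0 R    1  0.0   0:41.05 z_fr_iss/0
4174 root      39  19     0    0    0 D    1  0.0   0:37.24 z_fr_iss/2
4140 root      39  19     0    0    0 S    1  0.0   0:23.73 z_rd_int/0
4141 root      39  19     0    0    0 R    1  0.0   0:23.82 z_rd_int/1
4142 root      39  19     0    0    0 S    1  0.0   0:22.21 z_rd_int/2
"""


def blocked_tasks(top_text):
    """Return (pid, command) pairs for rows whose state column is 'D'.

    Assumes the default top layout, where the state is the 8th field
    and the command name is the 12th.
    """
    blocked = []
    for line in top_text.splitlines():
        fields = line.split()
        if len(fields) < 12:
            continue  # skip blank or truncated rows
        pid, state, command = fields[0], fields[7], fields[11]
        if state == "D":
            blocked.append((int(pid), command))
    return blocked


for pid, command in blocked_tasks(TOP_OUTPUT):
    print(pid, command)
```

On kernels built with stack tracing support, reading /proc/<pid>/stack as root for each PID found this way may show where the thread is blocked; whether that file is available depends on the kernel configuration.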


I've tried the git versions of spl, zfs, and zfs-kmod, with the same results. I've also upgraded the kernel (3.5.7) and the driver for my RAID card. It's also worth noting that there are other logical volumes on sda using XFS and they work fine, so this is definitely something odd with zfs.

So far I've tried to scrub the pool, moved my zpool.cache and tried to reimport, tried to prune a snapshot, and tried to create a snapshot. All of these commands block indefinitely and appear not to do anything. Can this pool be saved? Any help is appreciated.
mattlqx
n00b


Joined: 28 Oct 2012
Posts: 3
Location: San Jose, CA

Posted: Mon Oct 29, 2012 12:09 am

So it looks like after about two hours a couple more processes kick in...

Code:
 2442 root       0 -20     0    0    0 D    4  0.0   0:43.07 txg_sync
 2301 root       0 -20     0    0    0 S    2  0.0   0:29.21 arc_adapt


And slowly start consuming memory. I have 4 GB of RAM on this host and the following ARC limits set in modprobe.d:

Code:
options zfs zfs_arc_max=2147483647 zfs_vdev_cache_size=536870912
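
To make the raw byte values in the options line above easier to read, here is a small sketch that parses the line and prints each value in MiB and as a share of RAM. It assumes that the "4gb" mentioned in the post means 4 GiB; the helper name `parse_options` is made up for this example, and the script only does unit conversion, it says nothing about how ZFS actually uses these limits.

```python
import shlex

# The modprobe.d line exactly as quoted in the post.
MODPROBE_LINE = "options zfs zfs_arc_max=2147483647 zfs_vdev_cache_size=536870912"

# Assumption: the host's "4gb" of RAM is 4 GiB.
TOTAL_RAM_BYTES = 4 * 2**30


def parse_options(line):
    """Split an 'options <module> key=value ...' line into (module, {key: int})."""
    tokens = shlex.split(line)
    if len(tokens) < 2 or tokens[0] != "options":
        raise ValueError("not a modprobe 'options' line")
    module = tokens[1]
    params = {}
    for token in tokens[2:]:
        key, value = token.split("=", 1)
        params[key] = int(value)
    return module, params


module, params = parse_options(MODPROBE_LINE)
for name, size in params.items():
    mib = size / 2**20
    share = size / TOTAL_RAM_BYTES
    print(f"{module}.{name}: {mib:.2f} MiB ({share:.1%} of assumed RAM)")
```

For reference, 2147483647 is one byte short of 2 GiB (2**31 - 1), and 536870912 is exactly 512 MiB.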


From the looks of the trend, this is what eventually runs the host out of memory and crashes it.
mattlqx
n00b


Joined: 28 Oct 2012
Posts: 3
Location: San Jose, CA

Posted: Sat Nov 24, 2012 7:06 am

I added more RAM to my host and it still dies with plenty free, so I'm not entirely sure what the actual issue is, but it consistently freezes when my nightly rsync jobs fire and try to back up data to the pool.
ryao
Developer


Joined: 27 Feb 2012
Posts: 122

Posted: Thu Nov 29, 2012 8:37 pm

I am afraid that the forum is not a great place to get my attention. It is far better to send an email. That email would preferably be to the ZFSOnLinux mailing list, so that everyone involved can see it. The address is zfs-discuss@zfsonlinux.org.

With that said, it is hard to make sense of your issue without knowing your pool configuration. That means what drives you have, how you partitioned them, how ZFS is laid out on those partitions and how the vdevs are organized. It would also be useful to know if these are advanced format drives, if you created your pool with ashift=12, if this pool stores / and what versions of sys-kernel/spl, sys-fs/zfs-kmod and sys-fs/zfs you are using.
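
A minimal sketch of how that information might be gathered in one pass is shown below. The pool name "tank" is a placeholder, the helper names are made up for this example, and the use of `qlist` assumes app-portage/portage-utils is installed. Each command runs with a timeout because the original poster reports that zfs and zpool commands hang.

```python
import subprocess


def diagnostic_commands(pool):
    """Return the argv lists for the details requested above."""
    return [
        ["uname", "-r"],                                   # kernel version
        ["qlist", "-Iv", "sys-kernel/spl",
         "sys-fs/zfs-kmod", "sys-fs/zfs"],                 # installed package versions
        ["lsblk", "-o", "NAME,SIZE,PHY-SEC,LOG-SEC"],      # drives, partitions, sector sizes
        ["zpool", "status", "-v", pool],                   # vdev layout and pool health
        ["zdb", "-C", pool],                               # cached pool config, including ashift
    ]


def run_with_timeout(argv, seconds=30):
    """Run one command and return its combined output, or a short note on failure."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=seconds)
        return done.stdout + done.stderr
    except subprocess.TimeoutExpired:
        return f"(timed out after {seconds}s)"
    except FileNotFoundError:
        return "(command not found)"
```

Usage would be along the lines of `for argv in diagnostic_commands("tank"): print(" ".join(argv)); print(run_with_timeout(argv))`. Note that a command stuck in uninterruptible sleep may not respond to the kill that follows a timeout, so the timeout is a best-effort guard rather than a guarantee.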

It would also be useful to file a bug report in the upstream issue tracker:

https://github.com/zfsonlinux/zfs/issues/new

By the way, I was preoccupied with the aftermath of Hurricane Sandy when you tried to contact me through the forums, so I was in no position to help.
All times are GMT