Gentoo Forums
[SOLVED] ZFS ARC and two zpools on a computer
as.gentoo
Apprentice


Joined: 07 Aug 2004
Posts: 284

Posted: Thu Aug 10, 2017 9:47 pm    Post subject: [SOLVED] ZFS ARC and two zpools on a computer

Hi.

I tried to find out whether memory usage differs when two zpools are in use on one computer.
So I wrote a script that creates two pools, copies data, and prints the output of 'free -h' and 'cat /proc/spl/kstat/zfs/arcstats'.

The copied data is 3.8 GB of video files.
L1ARC = level 1 adaptive replacement cache (cache in memory), FS = filesystem, DS = ZFS dataset
When the L1ARC grows or shrinks, the "used" column of 'free -h' changes, not the "buff/cache" column.

Code:
### after pool creation
#               total        used        free      shared  buff/cache   available
# Mem:            31G        2.6G         28G         83M        559M         28G
# c                               4    34277888
# c_max                           4    16819320832
#
### difference from before zpool creation: "used" +100 MB, buff/cache +59 MB => not relevant

### after 'cp /none_zfs/*.mp4 /zp1'
#                total        used        free      shared  buff/cache   available
#  Mem:            31G        6.4G         20G         85M        4.4G         24G
# c                               4    4103057408
# c_max                           4    16819320832
#
### L1ARC +4 GB, "used" +3.8 GB, "free" -8 GB, buff/cache +3.8 GB
### ===== Looks like the data was put into the ARC _and_ the kernel buffer/cache! Sounds logical, but is not desired here.

### after 'cp /zp1/* /zp2/'
#               total        used        free      shared  buff/cache   available
# Mem:            31G         10G         16G         82M        4.7G         20G
# c                               4    8167341056
# c_max                           4    16819320832
#
### L1ARC +3.8 GB, "used" +3.6 GB, "free" -4 GB, buff/cache +0.3 GB
### ===== This indicates that the ARC is not shared between the pools. It grew although the data is already in the (source zpool's) ARC!

When I do some more copying with the same data, the _L1ARC_grows_again_, although by less than 3.8 GB. :-(
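A note on the measurement itself, with a minimal sketch: as far as I know, the `c` row in arcstats is the ARC's target size, while the `size` row is what the ARC currently occupies, so it may be worth printing both next to `c_max`. The helper name `arc_summary` is made up for illustration; the default path is the one used in this thread.

```shell
#!/bin/sh
# arc_summary [FILE]: print the size, c and c_max rows of an arcstats file in MiB.
# FILE defaults to the ZFS-on-Linux kstat path used in this thread.
# Assumption: arcstats rows have the form "name type value" (value in bytes).
arc_summary() {
    awk '$1 == "size" || $1 == "c" || $1 == "c_max" {
             printf "%-6s %10.1f MiB\n", $1, $3 / 1048576
         }' "${1:-/proc/spl/kstat/zfs/arcstats}"
}

# Usage on a machine with ZFS loaded:
#   arc_summary
```

Calling it before and after each copy step gives the current footprint and the target side by side.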


If I didn't make a mistake, the conclusion is that the L1ARC is not shared by the zpools on a system. :( Is there a way to set up a shared L1ARC?
The background is that I want to use a single drive for gaming and still have the benefits of ZFS, such as snapshots (backup, undelete), compression, the L1ARC (data kept based on usage rather than on "keep latest"), and dynamic DS size.

Also, if data is copied between a pool and another FS, that data is kept twice: in the ZFS L1ARC and in the kernel's FS cache.
Is there a way to prevent data from being put into the kernel cache when it is copied to a zpool?

Apart from 'echo 3 > /proc/sys/vm/drop_caches', I didn't find anything about flushing the L1ARC (only). That command flushes the kernel FS cache too. Is there documentation that specifically covers the ARC settings and how the ARC works? 'man zfs-module-parameters' doesn't seem to show anything like that. I could set zfs_arc_max to the value of zfs_arc_min and then back, but that doesn't feel like how it should be done.
It also flushes the L1ARC of both zpools. If I had to live with 'one L1ARC per zpool', how would I address the L1ARC of a specific zpool, e.g. to set different min and max memory sizes for zpool1 and zpool2?

Thanks in advance!


Last edited by as.gentoo on Sat Aug 19, 2017 12:11 am; edited 1 time in total
bunder
Bodhisattva


Joined: 10 Apr 2004
Posts: 5277

Posted: Sat Aug 12, 2017 6:48 am

ARC is shared by all pools attached to the system.

If you don't want a dataset cached you can set

Code:
zfs set primarycache=none mypool/mydataset
zfs set secondarycache=none mypool/mydataset   # only if you're using an L2ARC
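To check what a dataset is currently set to, something like the following sketch should work. `show_cache_policy` and the `ZFS_CMD` override are names invented here for illustration; the `zfs get` options are the standard ones. As far as I know, the valid values for both properties are all, none and metadata.

```shell
#!/bin/sh
# ZFS_CMD is a hypothetical override (handy for dry runs); it defaults to zfs.
ZFS_CMD=${ZFS_CMD:-zfs}

# show_cache_policy DATASET: print name, property and value for both cache properties.
show_cache_policy() {
    "$ZFS_CMD" get -H -o name,property,value primarycache,secondarycache "$1"
}

# Usage on a machine with ZFS loaded:
#   show_cache_policy mypool/mydataset
```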


Echoing 1 to drop_caches will clear the Linux cache (see buffers and cache) and leave the ZFS ARC (which shows up as "used") intact, whereas 2 or 3 will clear the ARC as well. However, if you find yourself doing this a lot, you might want to lower your zfs_arc_max, upgrade to ZFS 0.7.x (along with genkernel 3.5.1.1), or both.

Feel free to drop by #zfsonlinux on freenode if you need more specific pointers. :D
as.gentoo
Apprentice


Joined: 07 Aug 2004
Posts: 284

Posted: Sat Aug 12, 2017 10:23 am

bunder wrote:
ARC is shared by all pools attached to the system.

That's good news! Do you have a clue why my tests indicate that the L1ARC is not shared?
The ARC max size (c_max) is 17179869184 bytes (16 GiB).

Here is what the script (basically) does:
Code:
# start from empty caches
echo 3 > /proc/sys/vm/drop_caches
grep "^c " /proc/spl/kstat/zfs/arcstats
# c                               4    33554432

# create the two pools (-f forces use of the devices)
zpool create -f zp1 /dev/sda /dev/sde
zpool create -f zp2 /dev/sdb
grep "^c " /proc/spl/kstat/zfs/arcstats
# c                               4    34277888

# ~~ cp from ext to zpool (find -exec copes with spaces in file names)
find /home/ohmy/videos/ -type f -exec cp --target-directory=/zp1 {} +
grep "^c " /proc/spl/kstat/zfs/arcstats
# c                               4    4103057408             ### L1ARC = +3.8 GB

# ~~ cp zpool to other zpool
find /zp1 -type f -exec cp --target-directory=/zp2 {} +
grep "^c " /proc/spl/kstat/zfs/arcstats
# c                               4    8167341056             ### L1ARC again +3.8 GB … Isn't that data in the L1ARC already?

# ~~ cp the same files from ext to zp1 once more
find /home/ohmy/videos/ -type f -exec cp --target-directory=/zp1 {} +
grep "^c " /proc/spl/kstat/zfs/arcstats
# c                               4    11001876448            ### L1ARC = +2.6 GB … What is put into the cache here, and why?
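For reference, the growth figures in the comments above can be recomputed from the raw `c` values. This is only a sketch: `arc_deltas` is a helper name made up here, and it assumes the GB figures are binary (1 GiB = 1073741824 bytes), which matches the numbers.

```shell
#!/bin/sh
# arc_deltas: read one byte count per line on stdin and print the growth
# between consecutive samples in GiB.
arc_deltas() {
    awk 'NR > 1 { printf "%.2f GiB\n", ($1 - prev) / 1073741824 }
         { prev = $1 }'
}

# The four c values after pool creation and after each copy step:
printf '%s\n' 34277888 4103057408 8167341056 11001876448 | arc_deltas
# prints:
# 3.79 GiB
# 3.79 GiB
# 2.64 GiB
```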

Quote:
If you don't want a dataset cached you can set
Code:
zfs set primarycache=none mypool/mydataset
zfs set secondarycache=none mypool/mydataset
(if you're using an L2ARC). Echoing 1 to drop_caches will clear the linux cache (see buffers and cache) and leave the ZFS ARC (which shows up as "used") intact, whereas 2 or 3 will clear the ARC as well.
That helps!
Can I drop the L1ARC but not the kernel cache? How about setting it up so that the data is not cached twice? I do not want to drop the kernel cache again and again - that would drop everything else that is held there, not only what's transferred between ZFS and another FS.

Can I tell the L1ARC - once it is saturated - to hold data on a strictly most-frequently-used basis only? AFAIK a mix of MFU (frequently) + MRU (recently) is the default behavior.
Quote:
Feel free to drop by #zfsonlinux on freenode if you need more specific pointers. :D
I haven't had good experiences with IRC… I guess I'll try anyway.
bunder
Bodhisattva


Joined: 10 Apr 2004
Posts: 5277

Posted: Sun Aug 13, 2017 6:43 am

as.gentoo wrote:
Do you have a clue why my tests indicate that the L1ARC is not shared?


I'm afraid I don't off hand.

Quote:
Can I drop the L1ARC but not the kernel cache?


If you set zfs_arc_max to something small (the minimum is 32 MB), it should evict all objects in the cache.
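A rough sketch of that idea, assuming the module parameter is writable at runtime under /sys/module/zfs/parameters/zfs_arc_max (needs root). The function name `flush_arc`, the `ARC_FLUSH_WAIT` variable and the 10 second default wait are made up for illustration; how long eviction actually takes is not something I can promise.

```shell
#!/bin/sh
# flush_arc [PARAM_FILE] [FLOOR_BYTES]
# Temporarily lower zfs_arc_max so the ARC shrinks, then restore the old value.
flush_arc() {
    param=${1:-/sys/module/zfs/parameters/zfs_arc_max}
    floor=${2:-33554432}                 # 32 MiB, the minimum mentioned above
    old=$(cat "$param") || return 1      # remember the current limit
    echo "$floor" > "$param" || return 1
    echo "zfs_arc_max lowered from $old to $floor"
    sleep "${ARC_FLUSH_WAIT:-10}"        # assumption: give eviction some time
    echo "$old" > "$param" || return 1
    echo "zfs_arc_max restored to $old"
}

# Usage (as root):
#   flush_arc
```

The kernel page cache is not touched by this, which is the point of doing it this way instead of writing to drop_caches.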

Quote:
How about setting it up that the data is not cached twice?


If the data originated from a ZFS filesystem and is being written to a ZFS filesystem, it shouldn't get cached twice. Some Linux utilities do make use of the traditional cache, however. I don't think there is much you can do to get around that.

Quote:
Can I tell the L1ARC - once it is saturated - to hold data on a strictly most-frequently-used basis only?


Possibly, not 100% on that. There are a whole whack of tunables that can be adjusted, but it's usually best to leave them alone and let ZFS do its thing. If you're worried about things getting discarded from the ARC, you can add an SSD as an L2ARC, which might help, but it's usually best to just add more memory to the system.
as.gentoo
Apprentice


Joined: 07 Aug 2004
Posts: 284

Posted: Sun Aug 13, 2017 5:49 pm

bunder wrote:
as.gentoo wrote:
Do you have a clue why my tests indicate that the L1ARC is not shared?
I'm afraid I don't off hand.

Let's see what the devs say: https://github.com/zfsonlinux/zfs/issues/6505
as.gentoo
Apprentice


Joined: 07 Aug 2004
Posts: 284

Posted: Fri Aug 18, 2017 11:11 pm

Let's see.

The ARC can contain data from all pools. It does not share cached data between pools, but it does serve whatever it holds for reads.
Data is put into the ARC when read and write operations are executed - obviously only "new" or changed data. What's in the ARC can be supplied much faster than reading it from HDD/SSD because it is held in memory (at least the L1ARC).

Because pools can have datasets (DS) with the same name, ZFS needs to know which data belongs to which dataset in which pool. Obviously zpool-1/backup and zpool-2/backup can hold different data.

So the same data can be in the ARC twice, but it doesn't have to be. You could say that copying file.bak from pool-1 to pool-2 is a special case.

Code:
For simplicity I used a file instead of blocks here. Of course the internals are a "bit" more complicated but I hope it helps to understand the concept.

metadata in the ARC: content of file x from zpool-1 is at position 34
metadata in the ARC: content of file x from zpool-2 is at position 99

the "payload" data of the ARC in the memory would look like [...34:ABC...99:123...]

request: cp file x from /zp1/backup1 to /zp2/backup2
action:  is file x from zpool1 in ARC? Yes: Take it from the ARC and write "ABC" to zpool2.
                                       No:  Read data from zpool1, write it into the ARC and to zpool2.
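The same idea as a toy script, in case that is easier to follow. This is only a model of the explanation above, not of the real ARC, which caches blocks rather than files. All names (`ARC_KEYS`, `arc_has`, `arc_put`, `arc_count`, `copy_file`) are made up, and keys must not contain spaces.

```shell
#!/bin/sh
# Toy model: one cache for all pools, with every entry keyed by "pool/path".
ARC_KEYS=""    # space-separated keys that are currently cached

arc_has() {    # succeed if key $1 is cached
    case " $ARC_KEYS " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

arc_put() {    # cache key $1 unless it is already present
    arc_has "$1" || ARC_KEYS="$ARC_KEYS $1"
}

arc_count() {  # print the number of cached entries
    set -- $ARC_KEYS
    echo "$#"
}

copy_file() {  # copy_file SRC_KEY DST_KEY
    if arc_has "$1"; then
        echo "read  $1: served from the ARC"
    else
        echo "read  $1: fetched from disk, then cached"
        arc_put "$1"
    fi
    arc_put "$2"   # the written copy gets its own entry
    echo "write $2: cached under its own key"
}

copy_file zp1/backup1/x zp2/backup2/x
copy_file zp1/backup1/x zp2/backup2/x
echo "entries in the toy cache: $(arc_count)"
# last line printed: entries in the toy cache: 2
```

The second copy reads from the cache, yet the cache still holds two entries with identical content, one per pool.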



EDIT: I was told on #zfsonlinux (IRC) that this applies not only to DS but also to files in different directories. Actually, most of this information is from there. Thanks guys!
All times are GMT