Gentoo Forums
Btrfs nas?
The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1343
Location: Fayetteville, NC, USA

Posted: Sat Jun 18, 2016 12:51 am    Post subject: Btrfs nas?

Before mentioning Rockstor, don't. I like Rockstor, but due to some oddball design where it sees entire disks as block devices which cannot be partitioned, you need three disks for a RAID1 (OS, then the array). That is plain dumb and wasteful. As such, I want to build my OWN NAS using Gentoo and BTRFS.

I am thinking of doing a BTRFS RAID10 array using four 500GB disks I have lying around. This would give me 1TB of space with increased performance. My goal is to install Gentoo in as small a configuration as possible. It will be shell-only, with no Apache or anything: Samba for being a domain member and hosting shares to my gaming rigs, smartmontools on a cron job to monitor disk health, and probably cron jobs for maintaining BTRFS.
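
The 1TB figure follows from RAID10 keeping two copies of every block, so usable space is roughly half the raw total. A quick sanity check in shell, as a rough sketch only (raid10_usable_gb is a made-up helper name, and real btrfs usage will differ a little because of metadata and chunk allocation):

```shell
#!/bin/sh
# Rough usable capacity of a two-copy RAID10 array.
# raid10_usable_gb is a hypothetical helper, not a btrfs tool.
raid10_usable_gb() {
    disks=$1      # number of member disks
    size_gb=$2    # size of each disk in GB
    echo $(( disks * size_gb / 2 ))
}

raid10_usable_gb 4 500   # four 500GB disks, prints 1000
```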

I want to control disk spin-downs. I really don't want them to spin down. Spin-downs kill desktop disks much faster than laptop disks, and if I go on vacation, I'll just shut it down myself!
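
For the spin-down side, hdparm is the usual tool. A hedged sketch, assuming the four disks are /dev/sda through /dev/sdd and that the drives honour these settings (not all do): -S 0 turns off the standby timer and -B 255 asks the drive to disable its own power management. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print (or run, with DRY_RUN=0) hdparm commands that keep disks spinning.
# Device names are assumptions; adjust them to the actual system.
DRY_RUN=${DRY_RUN:-1}

keep_spinning() {
    for dev in "$@"; do
        for cmd in "hdparm -S 0 $dev" "hdparm -B 255 $dev"; do
            if [ "$DRY_RUN" = 1 ]; then
                echo "$cmd"
            else
                $cmd
            fi
        done
    done
}

keep_spinning /dev/sda /dev/sdb /dev/sdc /dev/sdd
```

These settings are generally not persistent across reboots, so something like this would need to run at boot.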

So what advice can you give me before I begin this project? The OS partitions will be on BTRFS also, in case a drive dies the OS can still function. Oh and that reminds me, how about hot-swap in the event of a failed disk?
_________________
Ever picture systemd as what runs "The Borg"?
vaxbrat
l33t

Joined: 05 Oct 2005
Posts: 731
Location: DC Burbs

Posted: Sat Jun 18, 2016 5:11 pm    Post subject: It's very doable

I would do the same thing myself if I only had one box. Instead I use 6 in a ceph cluster, but my object stores are built on btrfs mirror sets. If you want to consider this, you would want a minimum of three boxes, two for monitors and object stores, and a single one as the quorum monitor, metadata server and the host for sharing samba and nfs.

Based on your comments, I assume you intend to put your system boot and root on the same 4 drives as the mirror set. Use btrfs as your system root along with grub2 (or another boot loader that is btrfs savvy) and a kernel that has btrfs support built in natively and not as a module. I did a writeup on the wiki a few years back about btrfs system mirrors that may still be useful:

https://wiki.gentoo.org/wiki/Btrfs/Native_System_Root_Guide

Make sure you enable raid10 for both metadata and data. Also use a label to make things more readable in the fstab:

Code:
mkfs -t btrfs -L NASMIRROR -m raid10 -d raid10 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3


(I assume your boots and roots will be /dev/sda1, /dev/sda2, etc.)

There are two chores you want to stay on top of at a minimum, defragmentation and scrubbing. For defragging, I use the autodefrag option when mounting the sets. For scrubbing I would have a crontab entry, but ceph has its own concept of scrubbing in the background that runs through each of its placement groups (PGs) every two weeks and effectively does the btrfs scrubs for me.
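
A scrub crontab entry could be as simple as a monthly call to a small wrapper. This is only a sketch: the /raid mount point matches the fstab further down, while the script path and the schedule are placeholders. -B keeps the scrub in the foreground so cron sees the exit status and mails any output.

```shell
#!/bin/sh
# Hypothetical /usr/local/sbin/btrfs-scrub.sh, called from root's crontab, e.g.
#   0 3 1 * * DRY_RUN=0 /usr/local/sbin/btrfs-scrub.sh /raid
# Prints the command instead of running it unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

scrub() {
    mnt=${1:-/raid}
    if [ "$DRY_RUN" = 1 ]; then
        echo "btrfs scrub start -B $mnt"
    else
        btrfs scrub start -B "$mnt"
    fi
}

scrub "$@"
```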

Use on the fly compression in btrfs. At the worst it's a wash if most of your files are compressed images and video, etc. If your files can shrink to half the size or less, any modern CPU with SSE support can do the compress/decompress with only a few percent of a single core. If the amount of data transferred on each read/write is cut in half or more, that can be a huge increase in effective disk performance. The added latency of the compression instructions will be a drop in the bucket compared to the amount of time it takes to bring data in and out from disk.

After running the array for a while you might consider doing balances, but that would probably be a manual task that takes a day or so each time you run it. It can run in the background, but it will slow the array down while it is running. I've only done it once or twice in my cluster on a single object store at a time to see whether it caused any trouble. I decided that it wasn't worth the extra work.
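
If a full balance is too heavy, a filtered balance rewrites only partly filled chunks and usually finishes much sooner. A sketch, assuming the array is mounted at /raid; the 50% threshold is just an illustrative value, not a tested recommendation:

```shell
#!/bin/sh
# Filtered balance: rewrite only data/metadata chunks that are at most
# <usage> percent full. Prints the command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

balance() {
    mnt=$1
    usage=${2:-50}
    set -- btrfs balance start "-dusage=$usage" "-musage=$usage" "$mnt"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

balance /raid 50
# Progress can be checked from another shell with: btrfs balance status /raid
```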

Use subvolumes so that you can do snapshots and have the option of doing individual shares for nfs and samba. At work I helped out an area with a ReadyNAS box that was running a version of debian new enough to have btrfs support. However their management interface still defaults to doing things with ext4 or xfs on top of mdadm only. I rooted in and set up the data partitions as a btrfs array by hand and then showed their admin how to take snapshots after each daily backup. In the past, he was doing a full image of its share each night and could only fit about a week or two of backups on the ReadyNAS. When I was done, the on the fly lzo compression had shrunk the space taken by the share down to less than 10% of the drive space. Then I had him do a btrfs snapshot of the share each night instead of the full copy. The last time I checked, I think he had gone a full year of effectively doing a full image of the filesystem each night in the same amount of space that would have been filled after two weeks. The admin's face looked something like this once the realization began kicking in 8O
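
The nightly snapshot itself is a one-liner. A minimal sketch, assuming the layout from the fstab below (top-level volume mounted at /raid, with nas and snapshots subvolumes under it); the naming scheme is only a suggestion. -r makes the snapshot read-only. The script prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Nightly read-only snapshot of the share subvolume.
# Paths are assumptions based on the fstab in this post.
DRY_RUN=${DRY_RUN:-1}

snapshot_nas() {
    stamp=${1:-$(date +%F)}   # defaults to today's date, e.g. 2016-06-18
    set -- btrfs subvolume snapshot -r /raid/nas "/raid/snapshots/nas-$stamp"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_nas
```

Because snapshots share unchanged extents with the live subvolume, each one only costs the space of whatever changed since the previous night.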

Your fstab will probably look something like this, with /nas being shared read/write and /backups being shared readonly over nfs and samba:

Code:

LABEL=root           /                btrfs            defaults,noatime,autodefrag,compress=lzo     0   0
LABEL=NASMIRROR      /raid            btrfs            defaults,noatime,autodefrag,compress=lzo     0   0
LABEL=NASMIRROR      /nas             btrfs            defaults,noatime,autodefrag,compress=lzo,subvol=nas   0   0
LABEL=NASMIRROR      /backups         btrfs            defaults,noatime,autodefrag,compress=lzo,subvol=snapshots   0   0
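
For those subvol= mounts to work, the nas and snapshots subvolumes have to exist on the top-level volume first. A minimal sketch, assuming the mount points from the fstab above; it prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# One-time setup of the subvolumes referenced by the fstab entries.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

setup_subvolumes() {
    run mkdir -p /raid /nas /backups
    run mount /raid                            # top-level volume, per fstab
    run btrfs subvolume create /raid/nas
    run btrfs subvolume create /raid/snapshots
    run mount /nas
    run mount /backups
}

setup_subvolumes
```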
The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1343
Location: Fayetteville, NC, USA

Posted: Tue Jun 21, 2016 6:35 pm

I would use UUIDs, but I am not familiar with snapshots yet. I did some basic reading on them, but nothing major. I do intend on learning them and using them though. I WILL be using an eSATA disk as backup as well.

You gave me a BUNCH of info and I believe that I am ready to move on this. Step one, however, is to create my new AD DC using Gentoo, then move on to the snapshots and backup. I am at work and have to keep this quick, but I wanted to thank you for your informative reply.
alexcortes
Apprentice

Joined: 18 Dec 2011
Posts: 205
Location: Rio de Janeiro, Brazil

Posted: Tue Jun 21, 2016 9:20 pm

Unless the objective IS to use BTRFS, I would go with ZFS instead. Also, for NAS installations I usually prefer to have the system on a separate small disk, even on a flash device.

https://wiki.gentoo.org/wiki/ZFS
vaxbrat
l33t

Joined: 05 Oct 2005
Posts: 731
Location: DC Burbs

Posted: Wed Jun 22, 2016 1:06 am    Post subject: btrfs versus zfs

I think the telling thing about this is that while Oracle owns the IP for both zfs and btrfs (inherited from the Sun buyout), they use btrfs for their dbms backend on their bastardized version of RHEL. Also Facebook is a major proponent of btrfs. I've personally been using it for about 4 years now either as standalone raids or as the underlying filesystem to ceph (for about 2 years since Firefly). The only rare hiccups I've ever had with it were due to bad memory on consumer hardware (ECC would fix that) or from hard lockups due to hardware or power failure. In those situations I've been able to copy just about everything off of the bad filesystem since it mounts readonly for all but the most screwed up of cases. When you run a clustered filesystem like ceph on top, you just drop the Object Store Daemon (OSD) that runs the bad filesystem, re-initialize and then re-add the OSD. The cluster does all of the repair work necessary in the background to rebuild the replicas.

I might consider using zfs if I were on a bsd setup, but btrfs has a few more advantages including the ability to both grow and shrink the volume pool on the fly.
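
Growing and shrinking look roughly like this. A sketch only: /dev/sde3 is a made-up fifth partition and /raid is an assumed mount point. Note that raid10 needs at least four devices, so a device can only be removed while more than four are present. The functions print the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Add or remove a member device on a mounted btrfs filesystem.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

grow_pool() {
    # $1 = new device, $2 = mount point
    run btrfs device add "$1" "$2"
    run btrfs balance start "$2"     # spread existing data onto the new device
}

shrink_pool() {
    # $1 = device to remove, $2 = mount point; data is migrated off first
    run btrfs device delete "$1" "$2"
}

grow_pool /dev/sde3 /raid
shrink_pool /dev/sde3 /raid
```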
The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1343
Location: Fayetteville, NC, USA

Posted: Wed Jun 22, 2016 7:05 pm

ZFS requires too much overhead. Loads of RAM, for example. I like BTRFS and use it on a few laptops here that run Gentoo. We use it on the home partition with zlib compression (save more space since data is duplicated) and it works fine.
alexcortes
Apprentice

Joined: 18 Dec 2011
Posts: 205
Location: Rio de Janeiro, Brazil

Posted: Wed Jun 22, 2016 9:58 pm

Just to point out: ZFS only uses a lot of RAM if you use dedupe.
The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1343
Location: Fayetteville, NC, USA

Posted: Wed Jun 22, 2016 10:11 pm

Yes, and BTRFS supports deduplication also, but at much less cost from what I have read. I believe that combining deduplication with zlib compression should offer a lot of space saving on a RAID array used for file storage.
The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1343
Location: Fayetteville, NC, USA

Posted: Thu Jun 23, 2016 2:17 am

Vaxbrat, I am new to CoW on this level. I have used systems with CoW before, but never configured or fully understood it. Can you help point me in the right direction here? I want to understand how it works and how to set it up. I have read reports that using CoW and compression results in less compression or no compression.
brownandsticky
n00b

Joined: 19 Nov 2014
Posts: 4

Posted: Thu Jun 30, 2016 1:16 pm

If you suspect the effort and cost will get too high, I'll give a nod to the Netgear RN100 series of NASes.
They use BTRFS on top of md RAID.
A dual-bay unit is serving me well; initially as a NAS and now as shared storage for Xenserver. Admittedly the throughput limits its usefulness as a Storage Repository.
The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1343
Location: Fayetteville, NC, USA

Posted: Fri Jul 01, 2016 1:08 am

I just came upon a question. How do I set up fstab with this? Each partition has a unique UUID and the RAID has a UUID. For example, sda2 and sdb2 have UUIDs, but when formatted with BTRFS in RAID1, the RAID1 virtual device has a UUID. Can I use the RAID UUID in fstab? If not, what happens if I use the UUID for sda2 and the entire disk fails? Does it know to mount sdb2 instead? Does it not mount at all and crap itself? What?
vaxbrat
l33t

Joined: 05 Oct 2005
Posts: 731
Location: DC Burbs

Posted: Fri Jul 01, 2016 3:03 am    Post subject: Uuid

You probably want to use the UUID that you get back from btrfs fi show:

Code:
btrfs fi show
Label: 'cephosd0'  uuid: 87a86762-05f6-44fa-860b-f96df085d967
        Total devices 3 FS bytes used 4.00TiB
        devid    1 size 3.64TiB used 2.69TiB path /dev/sdc
        devid    2 size 3.64TiB used 2.69TiB path /dev/sdd
        devid    3 size 3.64TiB used 2.69TiB path /dev/sde

Label: 'thufirraid'  uuid: 5f6e51a3-d8e7-41e1-bdb9-3cd9be0bf7fe
        Total devices 1 FS bytes used 2.63TiB
        devid    1 size 3.64TiB used 2.64TiB path /dev/sdb


But as you saw in my example, I just set a LABEL and use that instead. It's also smart enough that you can just use one of the member devices or partitions.
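
The UUID reported by btrfs fi show belongs to the filesystem as a whole rather than to one member, so it can go straight into fstab. A small sketch of pulling it out for a given label and printing a matching fstab line; the helper names are made up, labels are assumed to contain no spaces, and the mount options are simply copied from the earlier fstab example:

```shell
#!/bin/sh
# Extract the filesystem UUID for a label from `btrfs filesystem show`
# output (read on stdin) and format an fstab line from it.
uuid_for_label() {
    # $1 = label; the tool prints labels wrapped in single quotes
    awk -v want="'$1'" '$1 == "Label:" && $2 == want { print $4 }'
}

fstab_line() {
    # $1 = UUID, $2 = mount point
    printf 'UUID=%s  %s  btrfs  defaults,noatime,autodefrag,compress=lzo  0  0\n' "$1" "$2"
}

# Typical use (needs root and a real btrfs filesystem):
#   fstab_line "$(btrfs filesystem show | uuid_for_label NASMIRROR)" /raid
```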
vaxbrat
l33t

Joined: 05 Oct 2005
Posts: 731
Location: DC Burbs

Posted: Fri Jul 01, 2016 3:20 am    Post subject: compression

It's safe to turn on compression and not worry whether it works out or not. When writing an extent, the btrfs worker runs the compressor on the buffer before sending it out. If it compresses down, the result is written out. If the result is the same or larger, the original buffer is simply stored. Basically it's the same approach taken if you try to run 7zip, jar or the like on a directory with contents that are already compressed, such as jpg images or compressed video.
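
The "keep the compressed copy only if it is smaller" rule is easy to imitate from the shell. This is purely an illustration of the idea: gzip stands in for the filesystem's compressor, the function name is invented, and it works on whole files rather than extents.

```shell
#!/bin/sh
# Decide whether a file would be stored compressed or as-is, using the
# same "only if it shrinks" rule. gzip is a stand-in, not what btrfs runs.
choose_encoding() {
    file=$1
    raw=$(( $(wc -c < "$file") ))               # original size in bytes
    packed=$(( $(gzip -c "$file" | wc -c) ))    # size after compression
    if [ "$packed" -lt "$raw" ]; then
        echo compressed
    else
        echo raw
    fi
}

# Example: choose_encoding /path/to/some/file
```

Text and other repetitive data come out as "compressed", while already-compressed media typically comes out as "raw".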

So the resulting filesystem size really depends on what you are putting on it. Resulting i/o performance may be substantially faster than expected if the compression is on the order of 2:1 or more; otherwise it will be on par with whatever bandwidth you have on the hardware. For a rule of thumb, the sequential write performance of a spinning disk pretty much maxes out at 100 MB/sec.
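
As back-of-the-envelope arithmetic, the effective rate is just the disk rate multiplied by the compression ratio. A sketch that ignores CPU cost and assumes the disk is the bottleneck (the function name is made up for illustration):

```shell
#!/bin/sh
# Effective throughput = disk throughput * compression ratio.
# The ratio is given as numerator and optional denominator to stay in
# integer arithmetic (3 2 means 1.5:1).
effective_mb_per_s() {
    disk=$1
    ratio_num=$2
    ratio_den=${3:-1}
    echo $(( disk * ratio_num / ratio_den ))
}

effective_mb_per_s 100 2      # 2:1 on a 100 MB/sec disk, prints 200
effective_mb_per_s 100 3 2    # 1.5:1, prints 150
effective_mb_per_s 100 1      # incompressible data, prints 100
```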

Because my cluster spreads out its I/O, it's not uncommon for me to see copies of 10 GB video files going out to the object stores hitting over 200 MB/sec even when they don't compress well, but that's what you get when you run with the big kids :D
pjp
Administrator

Joined: 16 Apr 2002
Posts: 17769

Posted: Sat Jul 02, 2016 1:14 am    Post subject: Re: btrfs versus zfs

Sorry for the interruption, I'll keep it brief...

vaxbrat wrote:
I think the telling thing about this is that while Oracle owns the IP for both zfs and btrfs (inherited from the Sun buyout), they use btrfs for their dbms backend on their bastardized version of RHEL.
Do you have any references for the dbms stuff (searching isn't returning relevant results)? I'm curious how it's used. I know they use and push ZFS a lot, so it may only be telling wrt the license incompatibility.
_________________
I honestly think you ought to sit down calmly, take a stress pill, and think things over.
vaxbrat
l33t

Joined: 05 Oct 2005
Posts: 731
Location: DC Burbs

Posted: Wed Jul 06, 2016 1:52 am    Post subject: Don't have a cite

Quote:
Do you have any references for the dbms stuff (searching isn't returning relevant results)? I'm curious how it's used. I know they use and push ZFS a lot, so it may only be telling wrt the license incompatibility.


I don't remember where I heard that but it would make sense. Since the Oracle Enterprise Linux is a ripoff of RHEL, btrfs would be packaged while zfs would not by default.