Gentoo Forums :: Kernel & Hardware

What would it take to build an active/passive NAS?
DingbatCA
Guru
Joined: 07 Jul 2004
Posts: 382
Location: Portland Or

Posted: Thu Apr 12, 2018 6:01 pm    Post subject: What would it take to build an active/passive NAS?

I have wondered about this for a long time. I have a SAS disk shelf (16 drives) with 4X paths. I can plumb them all into a single system, but I can also split those paths and send two paths to each of two systems. Is it possible, with some management layer, to build an active/passive cluster sharing the same disks? In this case, all 16 disks in a RAID 6 configuration?

Thoughts?
szatox
Veteran
Joined: 27 Aug 2013
Posts: 1707

Posted: Thu Apr 12, 2018 6:57 pm

2-headed storage servers do exist, and you can even buy enterprise-grade* stuff like that.
Active-passive is relatively easy. You need some way to negotiate which node is active (say, corosync or keepalived; a floating IP makes traffic redirection easy) and split-brain detection: you want to make sure the disconnected node is actually dead and not corrupting your data. This can be achieved by means of a third machine on the network that helps with voting.
Some solutions (e.g. VMware) use a storage heartbeat. They don't explain how it works, but you can imagine writing to some special area and reading it back on a short interval.
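You could imagine it looking roughly like this; the device and sector here are pure assumptions, just some reserved area nothing else ever writes to:

Code:
#!/bin/sh
# Storage-heartbeat sketch. /dev/sdq and sector 2048 are assumptions:
# a small shared LUN (or reserved area) kept out of the RAID.
DEV=/dev/sdq
SECTOR=2048

# Active head, on a short interval: stamp the reserved sector.
# O_DIRECT bypasses the page cache; the printf pads the stamp to a
# full 512-byte block so the direct write stays aligned.
printf '%-512s' "$(date +%s)" | \
    dd of="$DEV" bs=512 seek="$SECTOR" count=1 \
       iflag=fullblock oflag=direct conv=notrunc 2>/dev/null

# Standby: read the stamp back; if it stops advancing, call the peer dead.
STAMP=$(dd if="$DEV" bs=512 skip="$SECTOR" count=1 iflag=direct 2>/dev/null | tr -d ' ')
[ $(( $(date +%s) - STAMP )) -gt 10 ] && echo "peer heartbeat is stale"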


*it takes a whole enterprise to pay for that stuff
DingbatCA
Guru
Joined: 07 Jul 2004
Posts: 382
Location: Portland Or

Posted: Thu Apr 12, 2018 7:02 pm

I am fine with the IP-level stuff. Dealing with the split-brain problem is rather easy when using something external, like a Raspberry Pi.
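For instance, a takeover check that consults the Pi as a third vote could be as simple as this (the hostnames are made up for the sketch):

Code:
#!/bin/sh
# Run on the standby before taking over. system1 and rpi-witness
# are placeholder hostnames.
PEER=system1
WITNESS=rpi-witness

# Peer still answers us directly: it is not dead, do not take over.
ping -c3 -W1 "$PEER" >/dev/null 2>&1 && exit 1

# The witness can still reach the peer: we are the isolated node.
ssh "$WITNESS" "ping -c3 -W1 $PEER" >/dev/null 2>&1 && exit 1

# Two independent views agree the peer is gone: safe to proceed.
exit 0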

What I am more wondering about is the disk-level stuff.
szatox
Veteran
Joined: 27 Aug 2013
Posts: 1707


Posted: Fri Apr 13, 2018 6:44 pm

What sort of problems with the disks do you expect?
If you have 4 links and they work in multipath mode, this part should be handled for you. You just want to make sure you don't corrupt the filesystem through some race condition.
So, only one head may be active at any time.
The switch must be done in a clean manner. The head that activates must start with empty buffers (filesystem not mounted, buffers dropped), and the filesystem should be able to recover from interrupted writes in no time (a head going down before a write completes), so you don't have to wait for a full scan.
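To confirm multipath is actually doing its job before trusting it, something along these lines should work (package name per current Gentoo, I believe):

Code:
# One-time setup on each head: coalesce both HBA links into one device.
emerge --ask sys-fs/multipath-tools
rc-service multipathd start

# Each drive in the shelf should now list its two paths.
multipath -ll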
DingbatCA
Guru
Joined: 07 Jul 2004
Posts: 382
Location: Portland Or

Posted: Fri Apr 13, 2018 9:12 pm

The disk caching layer is what scares me.

So let's just run a thought experiment:

Config:
SAS disk shelf with 16 drives, 4 paths, software (MD) RAID 6, XFS.
Two systems attached with 2 paths each.
Something running a heartbeat, like corosync or keepalived, with a floating IP.
2X heartbeats, one over the network, one over a serial console.
Let's go with NFSv3, but in all honesty I don't care very much about the higher-level protocol for this thought experiment.

Failure state: I ripped the power cords out of system1, which was active. A client was/is uploading a 100 GB file.

Now what? corosync/keepalived says system1 is down, so system2 should take over. The RAID is spun up with 'mdadm --assemble --scan' and mounted with 'mount /dev/disk/by-uuid/12345678 /data'. The NFS service is started. The floating IP moves over to system2. The upload just keeps on going?
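Spelled out as a script, I'm imagining the takeover on system2 looking roughly like this (the IP, interface, and UUID are placeholders):

Code:
#!/bin/sh
# Takeover sequence on system2; addresses and UUID are placeholders.
set -e
mdadm --assemble --scan                      # spin up the shared RAID 6
mount /dev/disk/by-uuid/12345678 /data       # XFS replays its log here
rc-service nfs start                         # OpenRC NFS service
ip addr add 192.168.1.100/24 dev eth0        # claim the floating IP
arping -c3 -U -I eth0 192.168.1.100 || true  # gratuitous ARP so clients follow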

All the data that was in the system cache and RAID cache on system1 is just gone? I would guess the 100 GB file is missing a chunk in the middle?
szatox
Veteran
Joined: 27 Aug 2013
Posts: 1707


Posted: Sat Apr 14, 2018 2:37 am

Well, if you remove the active head during a write operation, that operation will fail. The client's operating system will not receive confirmation and will report failure to the application. The application should check the exit code of the write. Some applications don't do that, but this issue is not exclusive to multi-headed setups. So, the write fails, the application gets notified, and it can retry or report the problem to the user.
In the meantime, the second head hits the heartbeat timeout and takes over. It checks for interrupted operations, and if it finds any, it rolls them back (or replays entries from the journal, if possible, to complete those writes). Once the filesystem is self-consistent, it can be mounted and exposed to the client via NFS. This means that any NFS access to the bare mountpoint should fail. Once you mount your shelf on top of that mountpoint, the permissions of the filesystem from your shelf will shadow the mountpoint's own permissions.
NFS assumes that a client attempting any operation has previously mounted the exposed share, so it is able to pick up a session interrupted e.g. by a server restart.
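The shadowing trick is worth spelling out, since it doubles as cheap protection against a client racing the takeover:

Code:
# With nothing mounted, make the bare mountpoint unreadable, so a client
# that gets in before the mount sees a clean permission error instead of
# silently writing into the head's root filesystem:
chmod 000 /data

# Once the shelf is mounted, the permissions of its root directory
# shadow the 000 underneath:
mount /dev/disk/by-uuid/12345678 /data
ls -ld /data   # now shows the mounted filesystem's own permissions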


You may use synchronous mode to be on the safe side: no confirmations will be sent until the data is actually written.
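In /etc/exports that is just the sync option (path and network assumed):

Code:
# /etc/exports: reply to clients only after data actually hits the disk.
/data 192.168.1.0/24(rw,sync,no_subtree_check)

# apply without restarting the NFS server
exportfs -ra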