Gentoo Forums
Unsynchronized NFS despite 'sync'
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Wed Jun 26, 2019 5:21 am    Post subject: Unsynchronized NFS despite 'sync'

Hi Folks,

I'm having some trouble setting up synchronous reads and writes on an NFS folder. The situation is as follows:

client1 reads a file in the NFS folder
client1 commits changes to the file
several seconds pass
client2 opens the file and discovers that the contents differ from what client1 wrote; in some cases, the contents are what client1 originally read

The server is exporting the folder as such:
/export/cluster *(insecure,rw,sync,no_subtree_check,root_squash,insecure,all_squash,anonuid=1001,anongid=1001,no_wdelay)

And from the clients' /etc/fstab:
<server's IP>:/export/cluster /cluster nfs rw,hard,_netdev 0 0

I've also got an NTP daemon running to keep the clocks synced, and I've manually checked that both the clients and the server have the same time.

I've also coded the program accessing the file to open and close the parent directory before and after opening the file; I've read that this can flush an NFS cache.

I am at a loss as to why this would be an issue with this configuration; if anyone has any ideas, please let me know!
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Wed Jun 26, 2019 5:48 am

Just an update: I've identified at least one case where the contents of the file that client2 receives match what client2 saw the last time it had access to the file, roughly 15 seconds before client1 accesses it.
mike155
Veteran
Joined: 17 Sep 2010
Posts: 1959
Location: Frankfurt, Germany
Posted: Wed Jun 26, 2019 11:29 am

Quote:
And from the clients' /etc/fstab:
<server's IP>:/export/cluster /cluster nfs rw,hard,_netdev 0 0

Why don't you specify the 'sync' option?

man nfs wrote:
The NFS client treats the sync mount option differently than some other file systems (refer to mount(8) for a description of the generic sync and async mount options). If neither sync nor async is specified (or if the async option is specified), the NFS client delays sending application writes to the server until any of these events occur:
  • Memory pressure forces reclamation of system memory resources.

  • An application flushes file data explicitly with sync(2), msync(2), or fsync(3).

  • An application closes a file with close(2).

  • The file is locked/unlocked via fcntl(2).

In other words, under normal circumstances, data written by an application may not immediately appear on the server that hosts the file.

If the sync option is specified on a mount point, any system call that writes data to files on that mount point causes that data to be flushed to the server before the system call returns control to user space. This provides greater data cache coherence among clients, but at a significant performance cost.
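
To make the excerpt concrete, here is a minimal Python sketch of a writer that hits two of the flush points listed above (an explicit fsync and then close) without needing 'sync' in the client's fstab. The helper name write_and_flush is made up for this example; note that it only addresses the writing side, not what another client has cached.

```python
import os


def write_and_flush(path: str, data: bytes) -> None:
    """Write data to path and ask the kernel to push it out before returning.

    Hypothetical helper: on an NFS mount, fsync() and close() are both
    listed in nfs(5) as events that send delayed writes to the server.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # explicit flush of file data
    finally:
        os.close(fd)  # close() is a flush point as well
```

Usage would be something like write_and_flush("/cluster/state.dat", b"..."), where the path is only an illustration.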
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Wed Jun 26, 2019 3:43 pm

I'm trying it with the sync option on the clients now; I'll report back shortly. However, I was already opening and closing the file before and after the critical data was read and written to trigger a flush.
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Wed Jun 26, 2019 3:52 pm

In my earlier test, there were some cases where multiple clients would read the file and find results consistent with the first client's write, and then the second client would find results that were consistent with its own last write, but not the most recent write. That seems like client-side caching, as if the second client isn't bothering to get an updated version of the file. Is that possible?
mike155
Veteran
Joined: 17 Sep 2010
Posts: 1959
Location: Frankfurt, Germany
Posted: Wed Jun 26, 2019 4:03 pm

Quote:
then the second client would find results that were consistent with its own last write, but not the most recent write

I would expect exactly this if the NFS share was mounted on the second client without the 'sync' option.
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Wed Jun 26, 2019 4:18 pm

The code is running with the sync option on all clients, but it's glacially slow. This may not be practical. With async in the clients' fstab, is there any way to trigger a client-side flush before a read, the way close() triggers a flush after a write?
mike155
Veteran
Joined: 17 Sep 2010
Posts: 1959
Location: Frankfurt, Germany
Posted: Wed Jun 26, 2019 4:25 pm

Quote:
is there any way to trigger a client-side flush before a read, the way close() triggers a flush after a write?

Sure! Please read the snippet from the man page I posted above. :-)
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Wed Jun 26, 2019 4:36 pm

Okay, I think something might be going over my head. The snippet *seems* to just refer to cases where the client is writing to the server, and strategies to ensure that the client's write is complete. My situation seems to be a case where one client's write is successful and complete, but a second client doesn't pick up the new data.
mike155
Veteran
Joined: 17 Sep 2010
Posts: 1959
Location: Frankfurt, Germany
Posted: Wed Jun 26, 2019 4:59 pm

I'm sorry. I probably misunderstood your question. So you are talking about
Quote:
then the second client would find results that were consistent with its own last write, but not the most recent write.

So you want to make the second client look for newer data on the server although it has not transferred data of the last write to the server?

I don't think that this is possible. You could try to use record locking, but then again: it will be slow.

It seems that NFS is not the right technology to solve your problem.
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Wed Jun 26, 2019 5:34 pm

Not quite. I *think* that the first client successfully transfers its data to the server, but the second client is picking up old data anyway. The reason I think that the first client's write is successful is that other clients are able to read the first client's write without issue.
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Thu Jun 27, 2019 12:54 am

I think I have a solution. Previously, I was using a custom file locking mechanism using link/unlink to lock a file, and open/close to flush the NFS cache. I encapsulated my code in fcntl calls to lock and unlock the file, and there's no evidence of inconsistencies after a fairly rigorous test. This works even without 'sync' in the clients' fstab (but with 'sync' in the server's /etc/exports). In the next day or so I'll deploy this solution on a larger scale and report back.
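
For anyone finding this later, a rough sketch of what "encapsulating the code in fcntl calls" can look like, written in Python rather than the original program's language. The names locked_file and increment_counter are invented for the example, and the counter file is just a stand-in for the real data; the only claim taken from the thread and the man page excerpt is that locking and unlocking via fcntl are points where the NFS client synchronizes with the server.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked_file(path: str):
    """Yield a descriptor for path held under an exclusive POSIX lock.

    Hypothetical helper. The read-modify-write happens on the same
    descriptor that holds the lock, so nothing else in this process
    opens or closes the file while the lock is held.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            yield fd
        finally:
            os.fsync(fd)                    # push our writes out first
            fcntl.lockf(fd, fcntl.LOCK_UN)  # then release the lock
    finally:
        os.close(fd)


def increment_counter(path: str) -> int:
    """Example critical section: read an integer, add one, write it back."""
    with locked_file(path) as fd:
        raw = os.pread(fd, 64, 0)
        value = int(raw.decode() or "0") + 1
        os.ftruncate(fd, 0)
        os.pwrite(fd, str(value).encode(), 0)
        return value
```

The design choice here is to do all I/O through the locked descriptor; that matters for the close() behaviour discussed further down the thread.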

This is very strange. The link/unlink + open/close scheme should have worked, but didn't even with 'sync' in the clients' fstab. Also, the data in the logs strongly implies that the custom file locks were being honored, since no client reported access to the file that overlapped in time with another client's access. It seems like fcntl does a much more rigorous flush than open/close, although that shouldn't have mattered with 'sync' in the clients' fstab.
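
The original link/unlink code was never posted, so the following is only a guess at the general shape of such a scheme, sketched in Python with invented names (acquire_link_lock, release_link_lock). It shows why this kind of lock can give mutual exclusion while doing nothing about cached file data: it only ever touches the lock name, never the data file's contents.

```python
import os
import time


def acquire_link_lock(target: str, lock_path: str, timeout: float = 10.0) -> bool:
    """Try to take a lock by hard-linking target to lock_path.

    Hypothetical reconstruction of a link()-based lock. link() fails
    with EEXIST if lock_path already exists, so only one caller wins.
    Returns True on success, False if the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.link(target, lock_path)
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.05)  # somebody else holds the lock; retry


def release_link_lock(lock_path: str) -> None:
    """Release the lock by removing the link."""
    os.unlink(lock_path)
```

Nothing in this sketch tells the NFS client that its cached copy of the data file might be stale.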
Hu
Moderator
Joined: 06 Mar 2007
Posts: 14922
Posted: Thu Jun 27, 2019 1:50 am

As I read your earlier reports, the observed results are perfectly reasonable. With your ad-hoc locking scheme, the second client had no reason to know that its locally cached data was stale, so it had no reason to reread the data from the server. The sync mount option guarantees timely writes to the server, but is not documented to guarantee no client-side caching of previously read data. If I were to speculate, although the documentation is silent on this point, it also seems reasonable that acquiring a read lock on the file might encourage the kernel to revalidate with the server.
jyoung
Apprentice
Joined: 20 Mar 2007
Posts: 225
Posted: Fri Jun 28, 2019 6:24 pm

I think we can mark this thread as SOLVED, but I wanted to leave a few notes and solicit any opinions. Thanks for the comment, Hu, that explains why the custom locks with link/unlink were giving me that issue.

When I first reported back a successful test with fcntl, I'd just added fcntl calls to my code, without removing the custom locks. When I deployed the code on a larger scale, I removed the custom locks, and the rate of data overwrites became far worse. So, the custom locks seem to have ensured file ownership successfully, but weren't flushing the client-side cache. The fcntl locks ensured the cache was flushed, but weren't ensuring ownership.

This seemed strange since this is what fcntl is meant for, but then I found this article:

www.0pointer.de/blog/projects/locking.html

"...POSIX locks are automatically released if a process calls close() on any (!) of its open file descriptors for that file." Between locking and unlocking the file I'm opening it using a library (cfitsio, if anyone's interested). The library opens it by name; I can't just pass it a file descriptor. That means that when I subsequently call the library function to close the file, it's internally calling close() on a file descriptor pointing to the same file, invalidating the lock created by fcntl.

I just changed the code to call fcntl after the file is opened by the library, and then again to unlock the file before the library closes it. That works; no signs of overwrites. But there's no guarantee that a library won't internally open and close the file. In fact, there are some cfitsio functions which seem to do exactly that. I'm not totally sure what a portable solution is. And it seems like this isn't that weird of a situation: an application needs exclusive access to a file while library functions open and close it.
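
The lock-dropping behaviour is easy to reproduce outside of cfitsio. Below is a small Python sketch, assuming a Linux box and an ordinary local file; all function names are invented for the demo. A second process probes whether it could take the lock, once while we hold it and once after a stand-in "library" has opened and closed the same file by name.

```python
import fcntl
import os
import subprocess
import sys

# Child script: exit 0 if a non-blocking exclusive POSIX lock can be
# taken on the file named in argv[1], exit 1 otherwise.
_PROBE = """
import fcntl, os, sys
fd = os.open(sys.argv[1], os.O_RDWR)
try:
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
except OSError:
    sys.exit(1)
sys.exit(0)
"""


def other_process_can_lock(path: str) -> bool:
    """Ask a separate process whether the lock on path is free."""
    result = subprocess.run([sys.executable, "-c", _PROBE, path])
    return result.returncode == 0


def demo_lock_lost_on_close(path: str):
    """Return (free_before, free_after) around an unrelated open/close."""
    fd = os.open(path, os.O_RDWR)
    fcntl.lockf(fd, fcntl.LOCK_EX)
    free_before = other_process_can_lock(path)  # we hold the lock here

    # Stand-in for a library that opens and closes the file by name.
    os.close(os.open(path, os.O_RDONLY))
    free_after = other_process_can_lock(path)   # lock silently released

    os.close(fd)
    return free_before, free_after
```

On a local Linux filesystem this should report the lock as taken before the extra open/close and free afterwards, even though the descriptor that took the lock was never closed or unlocked.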
Hu
Moderator
Joined: 06 Mar 2007
Posts: 14922
Posted: Sat Jun 29, 2019 12:07 am

As I read the documentation, flock may behave a bit better in the presence of file closure - but the documentation also suggests that it interacts poorly with NFS, which is a major requirement for you. I think it is a bit of an odd use case to say you want to repeatedly open and close a file, but retain a lock on it the whole time. I disagree with the decision to make fcntl drop locks like that, but it is much too late to fix that now.
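
A quick way to see the flock difference on a local filesystem, sketched in Python with invented function names. The caveat is the one already mentioned: per flock(2), Linux NFS clients since 2.6.12 emulate flock() with fcntl byte-range locks, so this result should not be assumed to carry over to an NFS mount.

```python
import fcntl
import os
import subprocess
import sys

# Child script: exit 0 if a non-blocking exclusive flock() succeeds on
# the file named in argv[1], exit 1 otherwise.
_FLOCK_PROBE = """
import fcntl, os, sys
fd = os.open(sys.argv[1], os.O_RDWR)
try:
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
except OSError:
    sys.exit(1)
sys.exit(0)
"""


def other_process_can_flock(path: str) -> bool:
    """Ask a separate process whether it could flock() path right now."""
    result = subprocess.run([sys.executable, "-c", _FLOCK_PROBE, path])
    return result.returncode == 0


def demo_flock_survives_close(path: str) -> bool:
    """Return True if our flock() is still held after an unrelated close."""
    fd = os.open(path, os.O_RDWR)
    fcntl.flock(fd, fcntl.LOCK_EX)

    # Stand-in for a library opening and closing the file by name.
    os.close(os.open(path, os.O_RDONLY))

    still_held = not other_process_can_flock(path)
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
    return still_held
```

flock() locks belong to the open file description rather than to the process and file pair, which is why closing a different descriptor does not release them here.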

The most portable solution in my opinion would be to fix the library to accept a prepared file descriptor that your code manages, and get it out of opening and closing the file other than as convenience wrappers for simple programs that don't need to keep the file open.
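
As a sketch of that shape (in Python, with made-up names; cfitsio itself is a C library and this is not its API), the library routine takes a descriptor the caller already owns and never opens or closes the file on its own:

```python
import fcntl
import os


def read_first_bytes(fd: int, count: int) -> bytes:
    """Hypothetical library routine working on a caller-owned descriptor.

    It never opens the file by name and never calls close(), so it
    cannot release a POSIX lock that the caller holds on the file.
    """
    return os.pread(fd, count, 0)


def read_under_lock(path: str, count: int) -> bytes:
    """Caller side: open once, lock, hand the descriptor to the library."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            return read_first_bytes(fd, count)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)  # the only close() on this file in the process
```

A by-name convenience wrapper can still exist on top of this for simple programs that do not need locking.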
All times are GMT