Gentoo Forums
WAN performance with Samba: Why does it suck so bad?

 
jesnow
l33t


Joined: 26 Apr 2006
Posts: 856

PostPosted: Sun Feb 26, 2023 5:22 am    Post subject: WAN performance with Samba: Why does it suck so bad?

Previously I was having trouble working because of single-digit MB/s WAN file throughput, even though I have GbE all the way from work to home. Learn from my mistakes: I started trying all kinds of things before I looked at my network hardware. That was dumb:

https://forums.gentoo.org/viewtopic-t-1161123-highlight-.html
https://forums.gentoo.org/viewtopic-t-1161386-highlight-.html
https://forums.gentoo.org/viewtopic-t-1161818-highlight-.html

Then there was the setup of a 2.5GbE network card in gentoo (yes, redundant and unnecessary at the moment), which was a whole saga. Finally it worked at near enough to theoretical throughput. The hardware setup is all in the above links so I won't repeat it all. It's 10-year-old Core i7 gentoo -- WAN -- 10-year-old Core i7 gentoo.

I had followed the advice "just set up wireguard", and wow, that "just" was a doozy. Wireguard is hard. This, I think, is why so many businesses have sprung up offering to manage wireguard for you with their proprietary software and servers. But *I* of course had to do it the hard way. Just set up wireguard they said, it will be easy they said. OK, well, I did it. Works great now.
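For anyone attempting the same: the end result is much less scary than the journey. The working setup boils down to something of this shape (a minimal sketch; keys, addresses and hostname here are placeholders, not my real config):

Code:

# /etc/wireguard/wg0.conf on the home server (all values are placeholders)
[Interface]
Address    = 10.10.0.1/24
ListenPort = 51820
PrivateKey = <server-private-key>

[Peer]
PublicKey  = <client-public-key>
AllowedIPs = 10.10.0.2/32

# and the matching /etc/wireguard/wg0.conf on the work client:
[Interface]
Address    = 10.10.0.2/24
PrivateKey = <client-private-key>

[Peer]
PublicKey           = <server-public-key>
Endpoint            = home.example.net:51820
AllowedIPs          = 10.10.0.1/32
PersistentKeepalive = 25

# bring each side up with: wg-quick up wg0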

So, having gotten near enough to theoretical raw WAN throughput, I set out to benchmark file performance. Still horrible. Felt like dialup! What I had noticed and wanted to figure out is this:

1) Transferring large files was OK, not perfect, maybe 30-40MB/s. BUT it was tricky to benchmark, because besides ssh or wireguard plus network transport (which we now think are working OK) you have at least three things in the way: the disks on both sides, the OS (just gentoo) and the application. I noticed that rsync and cp often had quite different speeds. And what I mostly use is Dolphin, and who knows what protocol that uses or how to make a benchmark script for it.

2) Small single files have lower throughput, maybe half. No surprise there, this is well known.

3) Certain applications take *FOREVER* on the WAN to do simple things. The File|Open and File|Save dialogs in libreoffice, for example, take ~10 seconds to populate the directory you're trying to save to, and you can sit there and watch each directory item load one by one at morse code speeds. Directory traversal.

4) Copying folders with even a few files drops to near-zero throughput. A kernel tree that transfers in a few seconds as a tarball may take half an hour.

So I set about testing samba with cp and rsync (avoiding the obvious advantages rsync brings, just doing file transfers). I transferred four files of different sizes in both directions: 1GB, 100MB, 20MB and 1MB. The last of those was a subfolder from the kernel source tree with ~295 files in 32 directories. This test was in samba with every combination of wireguard and ssh tunnel transport. So for each one of a dozen or so configurations I had 32 speed measurements. I'll try to summarize them now. Anybody who wants the data is welcome to them.
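For anyone who wants to replicate this: the harness was nothing fancier than timing cp and rsync against the mounted share. A rough sketch (mountpoint and file names are placeholders):

Code:

#!/bin/sh
# time cp and rsync of each test file onto the mounted share
# (run as root so the cache drop works; paths are placeholders)
MNT=/mnt/test
for f in 1GB.bin 100MB.bin 20MB.bin kernel-subdir; do
    sync; echo 3 > /proc/sys/vm/drop_caches    # avoid measuring cache hits
    /usr/bin/time -f "cp    $f: %e s" cp -r "$f" "$MNT/" && rm -rf "$MNT/$f"
    /usr/bin/time -f "rsync $f: %e s" rsync -a "$f" "$MNT/" && rm -rf "$MNT/$f"
done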

** LAN performance: figuring that's the best I could ever expect. Recall I was now getting ~930Mb/s netperf LAN speed in both directions (0.1ms latency), so I should expect a maximum of about 120MB/s file throughput. Indeed I got 109MB/s download and 64MB/s upload on a single 1GB file. OK. BUT immediately I noticed that the folder of many files (just 1MB total, for heaven's sake!) was super slow even on the LAN. Transferring that folder was >500 times slower than an equivalent-sized single file, on the *LAN*! And I get perfectly adequate responsiveness on the LAN. You maybe notice that getting Dolphin thumbnails for image folders is a little slow, but so what. That was a surprise: 500 times slower! It was taking nearly 30ms per file just in per-file overhead. But wait, there's more.

** Starting with the default WAN configuration, I was now getting maximum file transfer speeds of about 40MB/s in both directions, 1/3 of wire speed. Wireguard was again the clear winner in tunnel speed over ssh, to the point that I'm going to stop using ssh for tunneling. The kernel samples (small files) folder, though, was now *5300* times slower than an equivalent single-file WAN transfer. Holy cow. Per-file overhead is somewhere around 400msec: nearly half a second to open and close any file.

** I switched to an insanely faster machine at work, with a much higher clock rate, core count and disk speed. Strangely, all the speeds went down very slightly. No explanation for that; same OS and applications. The faster client cpu/disk made not one bit of difference. But the new machine is vastly more responsive at the user interface, I think because it loads applications much faster.

** I did a bunch of Samba optimizations (like "socket options = TCP_NODELAY"), which people promise can up to double throughput. Nope. I tried ksmbd: nope. I tried moving the server test folder to an ssd (a staid 300MB/s) from the 140MB/s rotating disk: no difference. The standard Samba optimizations did zero for me.

** Things that maybe did help: add "noserverino" to the mount command (see the mount line below). That brought the per-file overhead down to 0.3s up to the server and 0.13s down from the server. Close random open files! I'm trying to work during all of this, so I have a few dozen open files, Dolphin instances and whatnot. Closing all of them made a big improvement.
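For reference, the option just goes in with the rest of the mount options. Something like this (server, share and credentials path are placeholders, not my exact line):

Code:

mount -t cifs //server/share /mnt/share -o credentials=/root/.cred,vers=3.11,noserverino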

After all of that my performance is now up to ~46MB/s in the upload direction and ~51MB/s in the download direction for large single files, and the perceived responsiveness is much better. The per-file transfer overhead is down to 330ms (1/3 second!), and in the upload direction 130ms. While that is a massive improvement, it's still slow and awful to work with.

I don't think there's much left to get out of the total transfer rate, as it doesn't seem to be affected by much of anything I do. But the killer is clearly the per-file overhead. The ~100x drop in transfer rate from LAN to WAN does seem to be governed by the total latency, but I think there are latencies built into samba and into cp and rsync that could be improved. Samba is not a WAN protocol despite the "Common Internet" in the name. That was/is marketing b&^%$&t I bought for years.

My questions to you now:

What's next to try? I was thinking:
** Test NFS vs samba. I will drop ssh tunnels from my testing though, just wireguard. Linux-NFS-wireguard-NFS-Linux is a stack people claim works well.
** Test NFS and Samba with their built-in encryption instead of wireguard: Linux-Samba-Linux. Both are now secure. Use a nonstandard port though.
** Turn off oplocks. Those things have to be expensive.
** There is an industry of companies offering "WAN optimization" to solve this exact problem. They employ a number of promising-sounding techniques to reduce the number of round trips needed to read a single file. It's a combination of caching, deduplication, request bundling, and lots of other cool-sounding stuff, all built on top of these same FOSS network technologies, but extended for the enterprise deep-pockets world.

https://www.gartner.com/reviews/market/wan-optimization

Is there anything like that you don't have to pay big bucks for, or is that layer a 100% proprietary harvesting of FOSS IP?

** Lastly: Syncthing, which appears to be FOSS, and Resilio (a monetization of the bittorrent protocol), which is very much not free. It's like caching your entire volume locally and letting the daemon synchronize as it has time and bandwidth.
** To be clear, anything involving an external cloud server is out. I won't even consider it. I own my data, I own everything it exists on, period.

Does anybody have experience with those things? Any help gratefully received. Thank you for even reading this far.

Cheers,
Jon.
pingtoo
l33t


Joined: 10 Sep 2021
Posts: 918
Location: Richmond Hill, Canada

PostPosted: Sun Feb 26, 2023 4:29 pm

jesnow,

I appreciate your sharing your experience with us. However, I still don't get what it is that you want :oops:
  1. Are you trying to find the best tuning parameter(s) for your connection over the WAN?
  2. Are you expecting to browse the file system on the remote end as fast (or as nearly as fast) as on local SAN storage?
  3. Are you wanting to run a database application over the WAN?
  4. Or are you working at both the remote and the local end, and you want to be able to edit files no matter where you are?


For point 1, I interpret this as a learning opportunity. I will be happy to explore it with you.

For point 2, as you already stated, due to WAN latency there is very little you can expect compared to LAN. However, there are different network file systems that can help reduce the apparent latency by caching data locally. As you have noticed, SMB (CIFS) does not work very well over WAN because it is a chatty protocol, meaning it needs multiple round trips to get a simple action done.

On Linux there are two well-established network file systems, NFS and AFS (Andrew File System).

AFS (or OpenAFS) was designed with Wide Area Networks in mind, so it may be a good starting point for your use case. However, I have never used it before, so I can offer little advice. But if you are interested, I'd be happy to explore it with you.

On Linux, NFS plus CacheFS (FS-Cache) may be able to meet your expectation. I have never used CacheFS before either, so we can try it if you want.

Another possibility is using FUSE with ssh or some other transport (e.g. ftp) for ad hoc remote file access.
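For example, sshfs does this with ssh as the transport. A quick sketch (host and paths are examples only):

Code:

# ad hoc FUSE mount of a remote directory over ssh
sshfs user@remotehost:/home/user /mnt/remote -o reconnect,ServerAliveInterval=15

# and to unmount:
fusermount -u /mnt/remote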

For point 3, this is more of an application design issue than a network issue. Network latency will always be a key factor for data transmission between the presentation layer and the storage layer. The best design moves the application's calculation logic as close to the data as possible, to reduce data transport.

For point 4, instead of using a networked file system to provide transparency over the remote nature, it may be better to use a file synchronisation application, to reduce the risk of system management headaches. A WAN not only has latency, it is also unreliable; with a networked file system you may experience system hangs due to network interruptions.

On the other hand, with a file synchronisation application you are always in control of when to synchronise, so your workflow will not have unexpected delays.

There are two well-known file synchronisation apps on Linux: one is rsync and the other is net-misc/unison. I think you are familiar with rsync, so I will focus on unison. Compared to rsync, unison is by design two-way synchronisation, and it has both a CLI tool and a GUI. I think net-misc/unison is better suited to your usage.

I am also curious how you tested rsync over the WAN. Did you test rsync as a local copy from the local file system to the networked file system mountpoint? Or did you set up an rsync server on the remote end and have the local rsync connect to the remote rsync daemon? For the rsync/rsyncd setup, did you use ssh as transport? Have you tried rsync/rsyncd using its native protocol over Wireguard?
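To illustrate the difference, the three setups look like this (host, module and paths are examples only):

Code:

# 1. local copy onto a mounted network file system
rsync -av ./data/ /mnt/share/data/

# 2. rsync over ssh transport
rsync -av ./data/ user@remotehost:/home/user/data/

# 3. rsync native protocol against a remote rsync daemon
#    (needs a [data] module in rsyncd.conf on the remote end)
rsync -av ./data/ rsync://remotehost/data/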
jesnow
l33t


Joined: 26 Apr 2006
Posts: 856

PostPosted: Sun Mar 05, 2023 9:25 pm

Hi Everybody:

My goal is to squeeze maximum performance out of WAN connections in order to be able to work remotely from my office. In the meantime (links at the top of the thread) I:

- Fixed my physical hardware using iperf, ifconfig and a whole lot of head scratching. (Do this before anything else!!)
- Tuned my linux network parameters to better serve WAN connections. (TL;DR: Don't bother. Linux tunes itself better than you can.)
- Benchmarked samba over wireguard and samba over ssh. (100% win for wireguard)
- Re-did the whole thing using different file sizes, from 1GB to 100MB to 2MB to a directory tree of 295 kernel source files.
- Benchmarked cp vs rsync, both using only samba transport.

Today I have:

- Tuned Samba for WAN using every possible tweak. There were surprises.
- Benchmarked tuned samba vs out-of-the-box nfs. Again there were surprises.

pingtoo wrote:
jesnow,

I appreciate your sharing your experience with us. However, I still don't get what it is that you want :oops:
  1. Are you trying to find the best tuning parameter(s) for your connection over the WAN?
  2. Are you expecting to browse the file system on the remote end as fast (or as nearly as fast) as on local SAN storage?
  3. Are you wanting to run a database application over the WAN?
  4. Or are you working at both the remote and the local end, and you want to be able to edit files no matter where you are?



1) Yes
2) Yes
3) No
4) Yes

Previously I found that single-file transfers of large files stuck stubbornly around 30MB/s in both directions. GB ethernet the whole way. Makes no sense. What's worse, browsing and saving files took forever, and that's distracting when you're trying to work. That part makes some sense, as latency is obviously much worse on the WAN than on the LAN.

Samba tuning:

I'm not going to go back and recount everything I tried: basically everything anybody ever posted anywhere under "speed up Samba with this neat trick". As with the network hardware parameters, nearly *every* Samba tip and trick has by now been made a default. Things like "raw read = Yes" you don't need to bother with. In fact, in the end I ended up with nearly my stock smb.conf, with one or two exceptions and caveats:


Code:

socket options = TCP_NODELAY IPTOS_LOWDELAY


These appear not to be defaults, and they do make a modest difference.

The following in /etc/samba/smb.conf either don't work or actually slow you down:
Code:

;       use sendfile = Yes # slow
;       fake oplocks = Yes # in some situations no change, in others much slower
;       socket options = TCP_NODELAY IPTOS_LOWDELAY IPTOS_THROUGHPUT SO_RCVBUF=131072 SO_SNDBUF=131072 # bad


testparm will actually warn you about tweaking the network buffers; these are autotuned on the fly far better than you can do by hand. Getting rotten network performance is easy, as I found out. Oplocks in particular are counterintuitive: having them on (they are on by default) in their default settings enables client-side caching. You do want this, it makes a huge difference, so don't mess with it.

Mount.cifs tuning

Unlike samba itself, the samba client mount.cifs appears to be tunable for better performance. These two things, which I discovered in the bowels of the forums, made a big difference, in particular for the many-small-files case:

Create a file /etc/modprobe.d/cifs.conf:
Code:

# Maximum write buffer
#
options cifs CIFSMaxBufSize=130048
#options cifs CIFSMaxBufSize=262144


The default buffer size is 16K or something, and it can be much larger. I tried several different sizes, but 128K went the fastest. Do this in conjunction with the following mount option:

Code:

vanaert# mount -t cifs //merckxw/jesnow /mnt/test -o credentials=/root/smb-merckx/.cred,vers=3.11,uid=jesnow,gid=users,cache=loose


Doing these things together makes downloading smaller files from the server machine go like lightning. More importantly, directory browsing now goes much faster.

NFS vs Samba

Ah, the age-old question. I've used samba for over a decade for this, and was always a bit hesitant about nfs. That was a mistake. Setting up nfs was easy: just follow the gentoo wiki entry for nfs-utils.
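The server side is basically one line in /etc/exports plus starting the service. A minimal sketch (the path and the subnet are placeholders for whatever you export):

Code:

# /etc/exports on the server (path and subnet are placeholders)
/home/jesnow 10.10.0.0/24(rw,sync,no_subtree_check)

# then activate the export and start the service:
# exportfs -ra
# rc-service nfs start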

Transfer speeds (upload to server):
Code:

             NFS     Samba
1GB file     77.5    48.1    (MB/s)
100MB file   143.7   153.7   (MB/s)
2MB file     23.6    22.6    (MB/s)
295 files    2.4     4.1     (files/s)

So it was interesting! It was not the case that nfs beat samba on small-file performance. They were about the same in "ordinary" file performance, and with many tiny files Samba won! I think it was the client-side caching that did it. I think part of the very slow performance on small files is the seek time on my server's hdd; it's time to change that out. I can actually hear the ticking as it seeks. Interesting that the 100MB file went up faster than wire speed; I suspect caching. Also, maybe a 1-2 second transfer can crowd out other network traffic, whereas with larger files the router has to let some other traffic through, while smaller files are seek-time limited.

In the download direction, the results were *much* faster and wildly divergent. The first large file download went at 5MB/s (!), but every subsequent download from nfs or samba went at insane speeds, faster than the wire (from 100-2000 MB/s!). Cache hits. NFS and samba are both caching, so that complicates things. I'm curious to see how that works out in the real world.

A really big difference was the small-file upload overhead, which went to 11 files/s over NFS and 6 files/s over samba. I chalk that up to the very low seek time on the NVMe in my client machine.

I think that most of the responsiveness of the machine in file browsing comes down to exactly these small downloads (i.e. directories): it's asking for a lot of information over and over again (like walking directory structures). I'm hopeful that NFS over wireguard is going to be really usable without having to resort to remote mirroring yet.

Thanks to everyone who contributed ideas, and to all the people who posted their smb.conf over the years. I'm here to say that smb.conf tuning doesn't do a whole lot.

Cheers,
Jon.
jesnow
l33t


Joined: 26 Apr 2006
Posts: 856

PostPosted: Sun Mar 12, 2023 9:59 pm

I've spent the week since getting the numbers above using both NFS (out of the box) and tuned Samba over Wireguard on the WAN. I'm subjectively finding that both now work OK. You can mostly work with them, especially using Dolphin (linux) and Finder (mac), though of course I use linux much more.

Here are the problem applications I continue to be annoyed by:

Okular. It starts very slowly for what it does (it's just a pdf reader, after all!) even on the LAN or on a local machine. But on the WAN it takes for effing ever to open a file. I mean, for Pete's sake. I wonder if it's doing individual seeks for each page or something, but even single-page documents take a very long time. Adobe pdf reader (mac) and browser plugins on linux don't take several seconds to open a file, so I think Okular just sucks.

Libreoffice. It starts a little slower (it must be reading tons of plugins and modules), but the thing that really, really sucks is the File|Save and File|Open dialogs. Holy crap, that is slow! You can tell they know it's slow, because it puts up a wait icon while it populates the save directory.

Dolphin. OK, I just said it's OK, which it is. But when you hover over a directory, it highlights the directory entry, which is fine on LAN and local disks but lags annoyingly and distractingly on WAN. I think it's updating the info it shows on the status line at the bottom of the window, and I think it's querying the file system every time the mouse moves from one entry to the other. Otherwise it's great though.

The next thing to do there is fire up wireshark and see when and what accesses are being made by these applications. But it's pretty clear that these three applications' developers basically never imagined that they would be run with >1ms ping time, so they never gave a thought to querying the filesystem ten or more times a second. Currently I'm at 30ms ping time, but I assume it will get worse when I'm in Japan. Last time I was there I was able to navigate in dolphin (slowly) and copy files up and down, but normal work was impossible. We shall see.
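Something like this on the wireguard interface should show the chatter (a sketch; the interface name is whatever yours is called):

Code:

# watch SMB and NFS round trips live while an application browses files
tshark -i wg0 -Y "smb2 || nfs"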

Cheers,
Jon.
fw23
n00b


Joined: 15 Apr 2023
Posts: 1

PostPosted: Sat Apr 15, 2023 8:54 am

jesnow wrote:
My goal is to squeeze maximum performance out of WAN connections in order to be able to work remotely from my office.


Thanks for your input! :-) I have also done extensive smb/nfs4 (+/- wireguard) testing over the past 3 years, due to the same remote working conditions, and it feels like a never-ending story to me. In my case the additional ipv4-to-ipv6 problem came into play. I moved away from wireguard (in the end) to socat (with its internal OPENSSL support) because I needed the gender-changer functionality (a story in itself). I also moved to nfs4.2 (from samba) without encryption, because the caching (no need for the extra fs-cache stage) and the WAN resilience are just great. I can also confirm that it boils down, to some extent, to (WAN) latency. In my 1Gbit<->10Gbit WAN setup, moving from a 2.3ms RTT to an 11ms RTT reduced 4k performance (the hardest WAN test) from (2ms: 500Mbit, 8100 iops) to (11ms: 50Mbit, 880 iops), as expected.

I usually test my links/mounts using this:

Code:
fio --name=mine --rw=randread --size=500M --direct=1 --bs=4k --iodepth=8 --ioengine=libaio --numjobs=4


BTW: Why did I end up with socat+openssl? In fact I could not get any better than 700Mbit over wireguard in the UDP iperf3 test (I suspect some router limitations in my WAN scenario), plus CPU limitations ... but then maybe wireguard is just too much overhead/overkill in a "simple" samba/nfs-over-WAN scenario ;-)
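In case it is useful to anyone, the tunnel has roughly this shape (ports, hosts and cert paths are examples, not my exact setup):

Code:

# server end: accept TLS, forward to the local NFS port
socat OPENSSL-LISTEN:4433,reuseaddr,fork,cert=server.pem,cafile=client.crt TCP:localhost:2049

# client end: expose a local port and wrap it in TLS towards the server
socat TCP-LISTEN:2049,bind=127.0.0.1,reuseaddr,fork OPENSSL:server.example.net:4433,cert=client.pem,cafile=server.crt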

It would be great to share and discuss details of findings, because there is so much to talk about in this thread. One aspect which made some difference, I feel, is the TCP window sizes and the scaling/congestion algorithms. Thanks again.
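E.g. the congestion control algorithm is a one-line experiment (bbr needs the tcp_bbr module or the corresponding kernel option):

Code:

# see what the kernel offers, then switch for a test
sysctl net.ipv4.tcp_available_congestion_control
sysctl -w net.ipv4.tcp_congestion_control=bbr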
jesnow
l33t


Joined: 26 Apr 2006
Posts: 856

PostPosted: Sat Aug 05, 2023 5:13 pm

I'm just returning to this thread after months on the road. The whole idea was that I should be able to have global access to my data across the various clients I might use, and I was going to some really weird places that few road warriors go.

Like to the bottom of the ocean in a submersible (I came back).

Well, with all that work put in, I had fixed some real problems in my hardware setup and in my configuration. I have a main server machine that has all 4TB of my data, and a gaggle of heterogeneous other machines that need to be accessible for such things as distcc (this is gentoo after all) and as workstations. I have a laptop I take lord knows where (like Japan), I have desktops at work behind a NAT firewall that lets nothing in, and suchlike. My home server has 1GbE bandwidth to the world and an ipv4 address that hasn't changed in five years. That's why my work data are all on my home server; I don't even keep a copy of it all at work, as they don't allow incoming connections.

Openssh works. From work or wherever, I can ssh to my home server with -R and a reverse tunnel to services like distcc I may wish to offer on my home network. It's pretty fun to see seven machines scattered across the globe working on a compile. The main thing is file-based networking. Over openssh, samba, nfs and sshfs all work acceptably. Samba seemed to have an edge in that its windows client was the least bad, and I had to use windows in my role at the time. So I did that for about three years. It worked, but was slow. Still, it was nice to be able to attach a home drive over the internet and keep working as if it were local.
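The reverse tunnel part is the standard -R trick, roughly like this (hostname is a placeholder; 3632 is distcc's default port):

Code:

# from the work machine: ssh home and offer this machine's distccd
# back to the home network
ssh -R 3632:localhost:3632 jesnow@home.example.net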

Wireguard really works. Once you get it set up, it is super powerful. Especially point to point between two machines or pairs of machines, it really is insane. You can start up a session and go away for months and it's still live. After all the testing, I have dropped ssh for making tunnels and just use wireguard.

Nfs unencrypted over wireguard works. There is no point in double-encrypting a data stream; it doesn't seem to make much of a speed difference, but I turned off nfs encryption. I now no longer need to use windows, and the mac and obviously the linux nfs clients are plenty fast and reliable. It was all working. After running both NFS and Samba side by side for a while on their own mount points, I finally decided that NFS *feels* faster.
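For concreteness, the client side is just a mount against the server's wireguard address; something like this (address and paths are placeholders):

Code:

mount -t nfs4 -o vers=4.2 10.10.0.1:/home/jesnow /mnt/home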

So I got all of this set up. I could ssh into my home server from a satellite connection in the middle of the ocean. I could see all my data. I could work on anything. Until the day I got back on land and was ready to start work on my 1-month visiting professorship in Nagoya, Japan: I got "connection refused". It's OK, I thought, my wife is still at home, and painfully (with the time zone difference) I walked her through resetting the server, and then the router. No luck: "connection refused". So it was damn lucky I had taken a complete backup of the 4TB of data on an external usb hdd. What a lot of work I had done, all for nothing. And the external hard drive wasn't really any faster than the remote net connection had been!

When I got home (for three days, then headed out again for another month) it was no dice: my home server was working fine but *could not* receive incoming connections from the internet. I finally discovered that right around the day I lost connection to it (while in Japan), AT&T had pushed out a firmware update to all the routers of that make and model. But it made no sense! I had cold-started the router umpteen times. Until, in an obscure user forum, I got the advice: after this firmware update you may need to factory reset the router and re-create all your port forwarding rules.

So that's what it was: the router had had a firmware update. It remembered all of its configuration over the update, *except* the port forwarding. That was just dead, but *pretending* to work correctly. You could open and close ports all day long, but they wouldn't work from the outside. The fix was a factory reset, re-creating all of my passwords, re-registering all of my devices and redoing all the firewall rules from scratch. What a pain, but it worked. So today I have everything working. My ping from the outside world, at about 30ms, isn't that great, but I can live with it.

Final summary: Wireguard/NFS is the way to go for most use cases. Make a big backup and take it with you on the road anyway.

Cheers,
Jon.