jesnow l33t
Joined: 26 Apr 2006 Posts: 856
|
Posted: Sun Dec 04, 2022 7:43 pm Post subject: Openssl performance issues |
|
|
I use ssh a lot these days, mostly through ssh tunnels, but the performance is pretty bad. My new rabbit hole is to make sure that I have a correct ssl setup and that it's performing optimally. I'm writing down what I found in case it is useful later. Please criticize me in this thread if I make any mistakes, and please help if you have any ideas about how to troubleshoot my rotten throughput. I'm tunneling from my place of work (workstations vanaert and pogacar) to my home (bartali and merckx). All have wired ethernet and get ~1Gb/s connection speeds to the internet. It turns out it's not that easy to find out what's really going on, because there aren't many introductory guides for people like me. Skip to the end for a description of the problems I've been getting.
Introduction
Openssl is a robust, full-featured Open Source toolkit for the Transport Layer Security (TLS) protocol and for general-purpose cryptography. It implements the ciphers, key exchange and certificate handling that applications use to establish encrypted connections, both between machines and within a single operating system instance.
https://www.openssl.org/
https://packages.gentoo.org/packages/dev-libs/openssl
The number of gentoo packages that depend on openssl is impressive, and it's sometimes surprising what all uses it. It's really a fundamental part of the linux operating system and gentoo in particular. Fortunately, it's pretty trouble-free. The default implementation included in gentoo "just works" and if you follow the gentoo installation guide, you have a solid engine and plenty of available ciphers of excellent quality to work with. Most people can stop reading right here.
Openssl vs libressl
Those of us old enough remember that there was a controversy, a fork, many flame wars, and a confusing situation for users. For a while there was a choice of which TLS implementation to use, openssl or libressl. This is a feature of open source software, not a bug. Many electrons have been spilled over this, but this particular competition is now resolved in favor of openssl. It's a long story:
https://wiki.gentoo.org/wiki/LibreSSL
ssl vs ssh
Most of us only knowingly touch openssl when we use ssh to log into another machine. Strictly speaking, ssh does not speak SSL/TLS; it has its own protocol. But OpenSSH is normally built against openssl's libcrypto and takes much of its cryptography from there, so ssh is still the best way to test it in the real world. Openssl does much, much more, and is the underlying technology for a lot of other communication protocols. You should have public key authentication set up in ssh, at the very least generated keys installed for localhost. How to do that is beyond the scope of this article; there are a million guides out there.
https://wiki.gentoo.org/wiki/SSH
Versions
Openssl is under steady development, but the changes are intended to be transparent to the user, and only really relevant to developers. Currently in tree are:
Code: |
jesnow@bartali ~ $ equery list openssl -p
* Searching for openssl ...
[-P-] [M ] dev-libs/openssl-1.0.2u-r1:0
[IP-] [ ] dev-libs/openssl-1.1.1q:0/1.1
[-P-] [ ~] dev-libs/openssl-1.1.1s:0/1.1
[-P-] [M~] dev-libs/openssl-3.0.7:0/3
|
1.0.2 is deprecated, there are two versions of 1.1.1, and there is a 3.0 version that incorporates some big under-the-hood changes, so I'm going to limit myself to 1.1.1 until 3.0 goes stable in gentoo. There are constant security updates.
openssl command
Openssl's user interface is the openssl command, which has an excellent man page. It has its own internal cli shell, kind of like sftp does, where you can give commands and see their results. Most things can be done from the system shell by using command line arguments. This seems more convenient to me because it gives you access to history, which the internal cli does not.
Code: |
jesnow@bartali ~ $ openssl version
OpenSSL 1.1.1q 5 Jul 2022
jesnow@bartali ~ $ openssl
OpenSSL> version
OpenSSL 1.1.1q 5 Jul 2022
OpenSSL> quit
jesnow@bartali ~ $
|
Openssl performance
The "openssl speed" command gives extensive benchmarking of the internal performance of the openssl stack using all the available ciphers and a variety of block sizes. Even the summary of the output is a bit daunting:
Code: |
OpenSSL 1.1.1q 5 Jul 2022
built on: Sun Jul 17 19:44:42 2022 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -march=x86-64 -pipe -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
md2 0.00 0.00 0.00 0.00 0.00 0.00
mdc2 18974.29k 20271.81k 20910.41k 20977.66k 21020.67k 21129.90k
md4 105874.85k 321972.84k 744771.81k 1116778.51k 1315785.39k 1329971.20k
md5 146403.54k 339441.17k 595976.02k 728138.41k 783141.50k 787111.47k
hmac(md5) 60493.44k 189497.54k 439870.50k 656708.32k 764900.69k 782448.33k
sha1 162097.83k 351767.01k 702521.45k 993588.57k 1093025.79k 1102555.53k
rmd160 53346.66k 126231.69k 229794.42k 288154.97k 311227.51k 311077.55k
rc4 482920.52k 797182.25k 919254.53k 976198.66k 981281.45k 982870.70k
des cbc 81204.94k 83484.01k 83802.62k 83806.55k 83992.58k 84262.91k
des ede3 30962.74k 31396.61k 31408.04k 31390.38k 31358.98k 31490.05k
idea cbc 106584.92k 110611.41k 111297.45k 111853.57k 111894.53k 111869.95k
seed cbc 90919.63k 95345.56k 95699.63k 96067.93k 95783.59k 95890.09k
rc2 cbc 59871.08k 61316.80k 61841.72k 62059.88k 61952.34k 61773.14k
rc5-32/12 cbc 295779.62k 328290.77k 336580.01k 338123.43k 336980.65k 338231.30k
blowfish cbc 133783.65k 141040.42k 142769.83k 143025.83k 143431.23k 143527.13k
cast cbc 118728.76k 127222.69k 129503.46k 129676.63k 130129.51k 130061.65k
aes-128 cbc 253446.09k 261655.32k 262920.28k 263661.23k 263495.68k 264525.14k
aes-192 cbc 217894.45k 224816.81k 225606.14k 226702.68k 223349.42k 225836.18k
aes-256 cbc 186522.45k 195011.20k 196768.68k 198703.95k 195739.32k 194035.71k
camellia-128 cbc 112228.90k 181183.55k 207081.39k 215213.74k 217022.46k 217366.53k
camellia-192 cbc 101708.66k 141470.12k 156160.34k 161113.43k 162922.50k 162725.89k
camellia-256 cbc 100172.22k 141594.88k 156356.27k 161319.25k 163190.10k 163015.34k
sha256 88805.05k 196290.47k 366010.37k 461756.07k 494441.81k 500230.83k
sha512 58311.16k 233388.71k 402951.77k 602200.06k 702980.10k 726947.16k
whirlpool 38657.68k 82188.91k 135896.49k 164015.00k 173129.73k 174344.39k
aes-128 ige 232342.22k 248786.02k 253756.07k 257386.84k 258916.35k 258790.74k
aes-192 ige 205817.59k 214868.31k 215990.10k 221858.13k 217251.84k 218054.66k
aes-256 ige 182099.57k 190099.82k 193063.42k 194466.47k 194874.03k 194707.46k
ghash 1395582.57k 4997389.55k 7899978.41k 8950806.19k 9575333.89k 9473845.93k
rand 15550.39k 62755.20k 242432.81k 825508.39k 2768350.84k 3304924.92k
|
What's obvious is that there's a pretty big variation in speed between the different ciphers. Whether that makes a difference or not in the real world is another matter, because modern processors are fast compared to the width of network pipes they are trying to push the encrypted bytes through. Or are they?
Hardware acceleration
Both Intel and AMD feature on-die hardware acceleration for cryptographic calculations. Intel introduced its AES instructions under the name AES-NI, and AMD processors implement the same instructions; AMD additionally has a separate cryptographic coprocessor called ccp. Until pretty recently openssl was not configured to use these features out of the box. The present situation on that is cloudy. Previously there were two different approaches to hardware acceleration: af_alg and cryptodev. It looks like neither is installed by default. So for example:
Code: |
jesnow@bartali ~ $ openssl engine -t -c
(rdrand) Intel RDRAND engine
[RAND]
[ available ]
(dynamic) Dynamic engine loading support
[ unavailable ]
|
That was supposed to show the hardware encryption engine, if it is available, but doesn't. But apparently, some hardware encryption is going on. I'm encouraged by the compiler flag -DAESNI_ASM in the output above. Based on this documentation from the OpenWRT project, you can turn the hardware encryption on and off using an environment variable:
https://openwrt.org/docs/techref/hardware/cryptographic.hardware.accelerators
Like this:
Stock (AES-NI on?):
Code: |
jesnow@pogacar ~ $ openssl speed -elapsed -evp aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 151067326 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 40611140 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 10344060 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 2596655 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 324284 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 16384 size blocks: 161880 aes-128-cbc's in 3.00s
OpenSSL 1.1.1q 5 Jul 2022
built on: Sun Jul 17 19:44:42 2022 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -march=x86-64 -pipe -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-128-cbc 805692.41k 866370.99k 882693.12k 886324.91k 885511.51k 884080.64k
|
Hardware AES-NI switched off:
Code: |
jesnow@pogacar ~ $ OPENSSL_ia32cap="~0x200000200000000" openssl speed -elapsed -evp aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 60032317 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 18128897 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 4796872 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 1196739 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 150431 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 16384 size blocks: 75582 aes-128-cbc's in 3.00s
OpenSSL 1.1.1q 5 Jul 2022
built on: Sun Jul 17 19:44:42 2022 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -march=x86-64 -pipe -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-128-cbc 320172.36k 386749.80k 409333.08k 408486.91k 410776.92k 412778.50k
|
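For reference, the value handed to OPENSSL_ia32cap in the run above is a bit mask, and the leading ~ asks openssl to clear those capability bits rather than set them. A minimal sketch for decoding such a mask follows; mask_bits is a made-up helper name, not an openssl tool. According to the OPENSSL_ia32cap manual page, bit 57 denotes AES-NI and bit 33 denotes PCLMULQDQ (the carry-less multiply used for GHASH).

```shell
# List which bits are set in a 64-bit capability mask (hypothetical helper).
# Only bits 0..62 are scanned, which is enough for the mask used above.
mask_bits() {
    mask=$(( $1 ))
    bit=0
    out=""
    while [ "$bit" -lt 63 ]; do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            out="${out:+$out }$bit"
        fi
        bit=$((bit + 1))
    done
    printf '%s\n' "$out"
}

mask_bits 0x200000200000000   # prints: 33 57
```

So the mask switches off both the AES instructions and the carry-less multiply, while other code paths compiled into the library stay available.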
So the stock run (i.e. using the crypto hardware) is roughly 2 to 2.5x faster than the run with AES-NI masked off (this is a 10yo Intel i7): 884080.64k vs 412778.50k at 16384-byte blocks, and 805692.41k vs 320172.36k at 16-byte blocks. That suggests openssl indeed uses hardware encryption for some ciphers. Interestingly this worked the same on my AMD machine as well.
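The summary figures can be checked against the raw block counts printed by openssl speed, since each figure is just blocks times block size divided by elapsed time, in thousands of bytes per second. The sketch below does that arithmetic with awk; speed_k and speedup are illustrative helper names, not part of openssl.

```shell
# Throughput in "k" (1000s of bytes per second), the unit openssl speed reports.
speed_k() {   # usage: speed_k BLOCKS BLOCK_SIZE SECONDS
    awk -v n="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.2f\n", n * size / secs / 1000 }'
}

# Ratio between two throughput figures.
speedup() {   # usage: speedup FAST SLOW
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

speed_k 161880 16384 3.00      # stock run, 16384-byte blocks: 884080.64
speed_k 75582 16384 3.00       # AES-NI masked off:            412778.50
speedup 884080.64 412778.50    # 16384-byte blocks: 2.14
speedup 805692.41 320172.36    # 16-byte blocks:    2.52
```

Both counts reproduce the figures in the two tables, and the ratio between the runs comes out a little above 2 at every block size shown.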
Crypto Throughput:
All of this is just numerology unless you can measure bytes moved per second by the application, in this case ssh. What seems to be a standard way of doing this is the following script:
Code: |
jesnow@bartali ~ $ for i in `ssh -Q cipher`; do dd if=/dev/zero bs=1M count=100 2> /dev/null | ssh -c $i localhost "(time -p cat) > /dev/null" 2>&1 | grep real | awk '{print "'$i': "100 / $2" MB/s" }'; done
aes128-ctr: 526.316 MB/s
aes192-ctr: 526.316 MB/s
aes256-ctr: 555.556 MB/s
aes128-gcm@openssh.com: 555.556 MB/s
aes256-gcm@openssh.com: 555.556 MB/s
chacha20-poly1305@openssh.com: 588.235 MB/s
jesnow@bartali ~ $
|
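The one-liner is easier to reuse when split into two small functions. This is only a sketch of the same technique (pipe zeros through ssh, time cat on the far side), not a different benchmark; bench_ciphers and rate_from_time are names invented here, and the default of 100 MB simply mirrors the command above.

```shell
# Turn `time -p` output on stdin into a rate for a transfer of $1 MB.
rate_from_time() {
    awk -v mb="$1" '$1 == "real" && $2 > 0 { printf "%.1f MB/s\n", mb / $2 }'
}

# Send $2 MB of zeros to host $1 once for every cipher the local ssh client offers.
bench_ciphers() {
    host=$1
    mb=${2:-100}
    for cipher in $(ssh -Q cipher); do
        rate=$(dd if=/dev/zero bs=1M count="$mb" 2>/dev/null |
            ssh -T -c "$cipher" "$host" "(time -p cat) > /dev/null" 2>&1 |
            rate_from_time "$mb")
        # An empty rate usually means the connection or the cipher was refused.
        printf '%s: %s\n' "$cipher" "${rate:-no result}"
    done
}
```

Usage would be something like `bench_ciphers localhost 100` or `bench_ciphers merckx 200`. Keeping the size as a parameter makes it harder to compare a 1 MB run against a 200 MB run by accident.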
I have seen this used on multiple sites to gauge ssh performance. On this machine (AMD Ryzen 3600) the throughput to itself is well above what can fit through a 1GBE network pipe, so if you're getting that kind of throughput to localhost, the network pipe is going to be the limiting factor. Here's the same local throughput test on an intel 11K at work:
Code: |
jesnow@vanaert ~ $ for i in `ssh -Q cipher`; do dd if=/dev/zero bs=1M count=100 2> /dev/null | ssh -c $i localhost "(time -p cat) > /dev/null" 2>&1 | grep real | awk '{print "'$i': "100 / $2" MB/s" }'; done
aes128-ctr: 625 MB/s
aes192-ctr: 833.333 MB/s
aes256-ctr: 769.231 MB/s
aes128-gcm@openssh.com: 769.231 MB/s
aes256-gcm@openssh.com: 769.231 MB/s
chacha20-poly1305@openssh.com: 769.231 MB/s
jesnow@vanaert ~ $
|
On an older intel i7 (merckx, that happens to be my home file and external sshd server):
Code: |
jesnow@merckx ~ $ for i in `ssh -Q cipher`; do dd if=/dev/zero bs=1M count=100 2> /dev/null | ssh -c $i localhost -p 2224 "(time -p cat) > /dev/null" 2>&1 | grep real | awk '{print "'$i': "100 / $2" MB/s" }'; done
aes128-ctr: 100 MB/s
aes192-ctr: 106.383 MB/s
aes256-ctr: 97.0874 MB/s
aes128-gcm@openssh.com: 106.383 MB/s
aes256-gcm@openssh.com: 104.167 MB/s
chacha20-poly1305@openssh.com: 196.078 MB/s
jesnow@merckx ~ $
|
This is significant, since we're now below the theoretical network throughput of ~116MB/s, meaning that even with 100GBE I could never do better than that, and with the samba protocol on top, the cryptographic overhead for any tunneled connection is probably a significant part of the overall workload. This is probably a very real bottleneck in my system, and probably accounts for the big performance hit I take when using my home server while at work.
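For a rough idea of where a figure like that comes from, the sketch below scales the raw gigabit line rate by the share of each frame that is TCP payload. The frame sizes are assumptions, not measurements from these machines: a standard 1500-byte MTU, 1448 bytes of TCP payload per segment (TCP timestamps enabled), and 1538 bytes on the wire once Ethernet header, checksum, preamble and inter-frame gap are counted. payload_rate is a made-up helper name.

```shell
# Usable payload rate in MB/s for a link of LINK_MBIT megabits per second.
payload_rate() {   # usage: payload_rate LINK_MBIT PAYLOAD_BYTES WIRE_BYTES
    awk -v mbit="$1" -v payload="$2" -v wire="$3" \
        'BEGIN { printf "%.1f\n", mbit / 8 * payload / wire }'
}

payload_rate 1000 1448 1538   # prints: 117.7
```

Under those assumptions the ceiling is within a couple of MB/s of the ~116MB/s quoted above, and the roughly 100 MB/s that merckx manages to itself sits below it.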
Using ssh between machines:
From one machine to the other in the local network, I get performance close to the theoretical maximum. In the local net at work I get:
Code: |
jesnow@vanaert ~ $ for i in `ssh -Q cipher`; do dd if=/dev/zero bs=1M count=200 2> /dev/null | ssh -T -c $i pogacar "(time -p cat) > /dev/null" 2>&1 | grep real | awk '{print "'$i': "200 / $2" MB/s" }'; done
3des-cbc: 108.108 MB/s
aes128-cbc: 111.111 MB/s
aes192-cbc: 111.111 MB/s
aes256-cbc: 111.111 MB/s
aes128-ctr: 109.89 MB/s
aes192-ctr: 109.89 MB/s
aes256-ctr: 111.111 MB/s
aes128-gcm@openssh.com: 109.89 MB/s
aes256-gcm@openssh.com: 111.111 MB/s
chacha20-poly1305@openssh.com: 111.111 MB/s
jesnow@vanaert ~ $
|
And in my local net at home I get:
Code: |
jesnow@bartali ~ $ for i in `ssh -Q cipher`; do dd if=/dev/zero bs=1M count=200 2> /dev/null | ssh -T -c $i merckx "(time -p cat) > /dev/null" 2>&1 | grep real | awk '{print "'$i': "200 / $2" MB/s" }'; done
3des-cbc: 81.3008 MB/s
aes128-cbc: 83.3333 MB/s
aes192-cbc: 108.108 MB/s
aes256-cbc: 104.712 MB/s
aes128-ctr: 107.527 MB/s
aes192-ctr: 99.0099 MB/s
aes256-ctr: 104.167 MB/s
aes128-gcm@openssh.com: 104.167 MB/s
aes256-gcm@openssh.com: 100 MB/s
chacha20-poly1305@openssh.com: 102.041 MB/s
jesnow@bartali ~ $
|
This performance in the local net is about the same as merckx got just talking to itself.
Into the tunnel: Throughput problems!
Finally, I have ssh tunnels (with connection sharing) running between each of my work machines (vanaert and pogacar) and my home server, merckx. Here's where things break down and I really stop understanding. I get much worse performance tunneling over the work to home ethernet connection (>1GBE all the way) than I do in either local network. In the download direction (from the point of view of work) I get:
Code: |
jesnow@bartali ~ $ for i in `ssh -Q cipher`; do dd if=/dev/zero bs=1M count=1 2> /dev/null | ssh -T -c $i vanaert "(time -p cat) > /dev/null" 2>&1 | grep real | awk '{print "'$i': "1 / $2" MB/s" }'; done
3des-cbc: 3.7037 MB/s
aes128-cbc: 11.1111 MB/s
aes192-cbc: 11.1111 MB/s
aes256-cbc: 11.1111 MB/s
aes128-ctr: 5.88235 MB/s
aes192-ctr: 7.69231 MB/s
aes256-ctr: 7.14286 MB/s
aes128-gcm@openssh.com: 7.14286 MB/s
aes256-gcm@openssh.com: 11.1111 MB/s
chacha20-poly1305@openssh.com: 7.14286 MB/s
|
I wasn't expecting to get the same performance I get in my local net, but a factor of 10 seems to be a really big hit to take for going through the ssh tunnel. But it gets worse! In the upload direction (from the point of view of work), it's even slower:
Code: |
jesnow@vanaert ~ $ for i in `ssh -Q cipher`; do dd if=/dev/zero bs=1M count=1 2> /dev/null | ssh -T -c $i merckx "(time -p cat) > /dev/null" 2>&1 | grep real | awk '{print "'$i': "1 / $2" MB/s" }'; done
3des-cbc: 1.26582 MB/s
aes128-cbc: 1.44928 MB/s
aes192-cbc: 0.877193 MB/s
aes256-cbc: 1.42857 MB/s
aes128-ctr: 0.970874 MB/s
aes192-ctr: 1.81818 MB/s
aes256-ctr: 2 MB/s
aes128-gcm@openssh.com: 1.40845 MB/s
aes256-gcm@openssh.com: 1.49254 MB/s
chacha20-poly1305@openssh.com: 1.07527 MB/s
|
I have been getting a similar pattern using iperf3 through the tunnel, but that's a story for another day. All of the work and home computers get ~800 Mb/s connection to the internet.
Conclusion:
The ssh tunnel seems to be costing me a huge performance penalty: On a gigabit ethernet connection I'm getting 10MB/s in one direction and 1MB/s in the other! But it's not clear why. I went into this thinking a misconfig of openssl was maybe to blame, or at least that I could improve the situation by choosing a better cipher or enabling the hardware crypto. In fact what I seem to have found is big differences in network performance that have some other explanation, despite the fact that all four machines have a super fast cabled ethernet connection. |
|
|
|
Genone Retired Dev
Joined: 14 Mar 2003 Posts: 9532 Location: beyond the rim
|
Posted: Wed Dec 14, 2022 12:42 pm Post subject: |
|
|
First, your tunnel test is only using a 1MB sample size, so any delay caused by connection setup/teardown will have a much greater effect than in your other tests using a 100MB / 200MB sample size.
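The effect of a fixed setup cost on a small sample can be put into numbers. In the sketch below the 0.1 s setup time and the 100 MB/s link speed are purely hypothetical values picked for illustration, and apparent_rate is a made-up helper: the measured rate is the sample size divided by setup time plus actual transfer time.

```shell
# Rate a benchmark would report if a fixed setup delay is included in the timing.
apparent_rate() {   # usage: apparent_rate SIZE_MB SETUP_SECONDS LINK_MB_PER_S
    awk -v mb="$1" -v setup="$2" -v bw="$3" \
        'BEGIN { printf "%.2f\n", mb / (setup + mb / bw) }'
}

apparent_rate 1 0.1 100     # 1 MB sample:   9.09
apparent_rate 200 0.1 100   # 200 MB sample: 95.24
```

With those assumed values the same link looks ten times slower on a 1 MB sample than on a 200 MB one, so the 1 MB tunnel results cannot be compared directly with the 100 MB and 200 MB runs.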
Also your tests show that the bottleneck is not openssl itself, but rather an issue specific to the tunneling setup, so the title is a bit misleading. More so given that you've omitted any information about how you set up your tunnels. For example, using an ssh connection over an ssh tunnel obviously won't work very well, as you're encrypting/decrypting the data twice on both ends for no reason (though I'm not an expert on ssh tunnels, so that could be wrong).
What is really odd is that the 256 bit ciphers in your tunnel test seem to perform much better than the 128 bit variants. Should really retest with a much larger sample to ensure you're actually looking at bandwidth rather than (random) latency. As a general rule: high-level performance tests should run for more than just milliseconds, esp. if network connections are involved.
Or just drop the |grep|awk part and look at the actual times. Also look at actual CPU usage during your tests, esp. if it changes significantly between different tests.
Last but not least, ssh has a -v option for diagnostics that might be useful to you. |
|
|
|
no101 n00b
Joined: 10 Oct 2022 Posts: 11 Location: Piney Woods
|
Posted: Wed Dec 14, 2022 4:29 pm Post subject: |
|
|
You might be experiencing "TCP Meltdown". Running TCP over TCP can cause problems. Wireguard uses UDP for exactly this reason and OpenVPN suggests using UDP as well. VPN is an inversion of what you're doing but the issue is the same.
I suspect tcp meltdown because the localhost network is perfect: you never get dropped packets or congestion. Since you never get errors, localhost is pretty useless for testing network code performance. There are other differences too: for example, you will likely get zero-copy network transmission because the kernel knows it can simply pass the existing buffer.
I don't really have a good reference but OpenVPN has a simple explanation in their FAQ.
https://openvpn.net/faq/what-is-tcp-meltdown/ |
|
|
|
szatox Advocate
Joined: 27 Aug 2013 Posts: 3137
|
Posted: Wed Dec 14, 2022 10:59 pm Post subject: |
|
|
I just want to add that in my experience SSH makes a very slow data pipe. Even an otherwise idle system wouldn't transfer more than 20MBps per connection.
Fortunately, in my particular case I could simply work around this by splitting a job into a few thousand chunks and sending up to 20 of those in parallel for a total bandwidth of 400MBps but I still do not consider it a good general-purpose solution.
Just saying. If you do have full control over both ends, you'll be better off simply shoving your data down netcat wrapped in wireguard.
Anyway, what do you need it for? I wonder what other options are available. |
|
|
|
jesnow l33t
Joined: 26 Apr 2006 Posts: 856
|
Posted: Fri Jan 27, 2023 4:36 pm Post subject: |
|
|
To summarize:
I've got networking running very reliably on multiple machines outside my home network using dynamic dns, a pinhole in my router firewall on a random port, ssh reverse tunneling (using openssh) and cifs. The nice thing is it all works about the same as it does at home, and my home systems can connect easily to my remote systems at work through the tunnel. It's like a simple vpn.
Here's my remote machine .ssh/config:
Code: |
Host *
ForwardX11 yes
ForwardX11Trusted yes
controlmaster auto
controlpath /tmp/ssh-%r@%h:%p
ServerAliveInterval 60
ServerAliveCountMax 10
ConnectTimeout 300
Host merckx
Hostname merckx.*****.***
User ******
RemoteForward 42223 localhost:22
RemoteForward 43632 localhost:3632
RemoteForward 44000 localhost:4000
LocalForward 44445 merckx:445
LocalForward 44000 bartali:4000
Port *****
host vanaert
user jesnow
|
It's a little cumbersome in that I have to bring up the tunnel by hand on the remote machine with
Code: |
jesnow@pogacar ~ $ autossh -M 0 -f -T -N merckx
|
but since passwordless login is all set up, this takes one second and I can see error messages if something goes wrong. Up comes the tunnel, and I can mount a samba share easily with
Code: |
mount -t cifs //localhost/jesnow /mnt/merckx-jesnow -o port=44445,credentials=/root/smb-merckx/.cred,vers=3.11,uid=jesnow,gid=users
|
It all works fine on the download side. I can copy files *from* my home server to my work machines at near-network speeds over samba, but copying them in the reverse direction, *to* the server, is extremely slow, as documented above. I'm not using samba in the tests above, so that's not the issue; samba is doing its job fine. It's the tunnel that's slowing everything down, but only in one direction. There is no visible load on the server at home during transfers, and the net connection is nowhere near saturated.
Responding to comments above:
I don't think the TCP over TCP issue is what's wrong. That would slow transfers in both directions. Because I get acceptable speed in one direction it just doesn't make sense that it won't go both ways. But after a few months of messing with it, I'm still no further than I was when I posted.
@Genone: My tunnel setup is above.
@szatox: I'm surprised by that too. Yes, ssh is a slow data pipe, but I think this is way too slow, and I'm just missing one detail for it to be only 80% (which would be enough) instead of 1%, which really doesn't work.
@no101: I agree that localhost and the local net aren't that interesting, I just wanted to show that it all works. "TCP meltdown" probably is why I only get about 80% speed from vanaert (remote).
Well, so I was hoping someone would say "ah we get this all the time, be sure you set X in .ssh/config"; useful tips like that are what got me as far as I've gotten. I thought it was an inefficiency in the crypto setup, so in the post above I set out to test that proposition. I now don't think that's the case -- yes it's not as fast as it might be, but that doesn't explain the up/down disparity.
I think I'm at a dead end in getting this particular setup to work. Probably the "just use wireguard" option is the way to go. But there's a lot of work behind that "just": I have firewalls at both ends and a heterogeneous gaggle of 6 machines.
Anybody who has seen this one-way slowdown before, please let me know.
Cheers,
Jon |
|