Gentoo Forums
rsync always grinds to a halt ..
dirtbag
Guru
Joined: 18 Feb 2003
Posts: 508
Location: NC

Posted: Thu Jun 21, 2012 11:51 pm    Post subject: rsync always grinds to a halt ..

When I try to transfer a lot of digital camera pictures from my Gentoo server, which is running

3.2.12-gentoo #1 SMP Sat May 12 17:06:27 EDT 2012 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 5200+ AuthenticAMD GNU/Linux

up to another Linux box via rsync, the transfer always grinds to a halt after about a minute. I've upgraded rsync on both sides to the latest version, but the same thing happens.
The only thing I can find is

https://bugzilla.samba.org/show_bug.cgi?id=5478 which seems to be the same problem I'm having. Do we know of any funky TCP bugs in this kernel?

I never had this problem before; previously, I was running kernel 3.0.6.

-db
krinn
Watchman
Joined: 02 May 2003
Posts: 7457

Posted: Fri Jun 22, 2012 12:10 am

Try adding -e ssh
so your rsync will be done through ssh. That could bypass your transport trouble, unless your ssh is also affected :D (but at least it should give you another clue that it's not really rsync, but your connection or kernel).
If it works and you don't consider that a fix, or if you cannot use ssh, bugs.gentoo.org is where to report such a bug with the software.
dirtbag
Guru
Joined: 18 Feb 2003
Posts: 508
Location: NC

Posted: Fri Jun 22, 2012 12:40 am

Yeah, it's hanging with -e ssh as well.

I guess I'll start lookin fer kernel bugs.. :(

-db
Hu
Moderator
Joined: 06 Mar 2007
Posts: 16043

Posted: Fri Jun 22, 2012 2:06 am

dirtbag wrote:
Yeah, it's hanging with -e ssh as well.

I guess I'll start lookin fer kernel bugs.. :(
Have you confirmed that booting the older kernel allows rsync to work correctly? What is the output of emerge --info net-misc/rsync net-misc/openssh?
khayyam
Watchman
Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Fri Jun 22, 2012 8:38 am

dirtbag wrote:
I guess I'll start lookin fer kernel bugs.

dirtbag ... I don't think it relates to 3.2.12-gentoo, as I've used it and had no problems with rsync. Nonetheless, I would follow Hu's advice, as it may be something other than the TCP stack.

I seem to remember there being some recent issue with ~arch versions of openssh and TcpRcvBufPoll ... if you're running ~arch then you might try adding 'TcpRcvBufPoll no' to /etc/ssh/sshd_config and see if the problem persists (after restarting sshd, of course).

best ... khay
dirtbag
Guru
Joined: 18 Feb 2003
Posts: 508
Location: NC

Posted: Sat Jun 23, 2012 3:40 pm

OK, so I tried on my 3.0.6 kernel and got the same issue.
I upgraded to openssh-5.9_p1-r4 and still have the same issue.
I put

TcpRcvBufPoll no

in my /etc/ssh/sshd_config
and that didn't seem to help either.

-db
khayyam
Watchman
Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Sat Jun 23, 2012 4:27 pm

dirtbag wrote:
OK, so I tried on my 3.0.6 kernel and got the same issue.

OK, which suggests it's not the kernel (at least not 3.2.12-gentoo specifically).

dirtbag wrote:
I upgraded to openssh-5.9_p1-r4 and still have the same issue.

5.9_p1-r4 is arch ... the bug (which you'll find by searching the forums) relates to ~arch (so 6.0_pX).

dirtbag wrote:
I put the TcpRcvBufPoll no in my /etc/ssh/sshd_config and that didn't seem to help either.

Well, this was what I remember to be the fix for the issue with 6.0. I did say, "if you are running ~arch".

Can this be isolated to rsync? Does pushing data with scp produce a similar result? Do you have any kind of TOS (traffic shaping) in place? Have you run 'mtr', or similar, to see if there is any bottleneck between you and the remote host? Basically, the more you can isolate the problem, the easier it'll be to pinpoint exactly which kernel and/or app is the cause. Right now it could be your DSL modem/gateway, so before we start thinking that ssh, rsync, or the kernel is at issue, we should exclude other, perhaps more likely, possibilities.

best ... khay
dirtbag
Guru
Joined: 18 Feb 2003
Posts: 508
Location: NC

Posted: Thu Jun 28, 2012 1:28 am

I just tried to scp a whole directory and it was going well, until..



24.html 100% 1821 1.8KB/s 352.1KB/s 00:00
IMAG0056.jpg 100% 136KB 135.8KB/s 352.1KB/s 00:00
IMAG0040.jpg 100% 181KB 180.9KB/s 352.1KB/s 00:00
igal2.css 100% 741 0.7KB/s 352.1KB/s 00:00
.indextemplate2.html 100% 1274 1.2KB/s 352.1KB/s 00:00
.tile.png 100% 237 0.2KB/s 352.1KB/s 00:00
25.html 100% 1821 1.8KB/s 352.1KB/s 00:00
.thumb_IMAG0082.jpg 100% 21KB 20.9KB/s 352.1KB/s 00:00
IMAG0055.jpg 100% 190KB 189.9KB/s 352.1KB/s 00:00
7.html 100% 1814 1.8KB/s 352.1KB/s 00:00
.thumb_IMAG0076.jpg 100% 32KB 31.7KB/s 352.1KB/s 00:00
.slidetemplate2.html 100% 1759 1.7KB/s 352.1KB/s 00:00
.thumb_IMAG0064.jpg 100% 28KB 28.0KB/s 352.1KB/s 00:00
IMAG0070.jpg 100% 115KB 115.4KB/s 352.1KB/s 00:00
.thumb_IMG_0554.jpg 100% 15KB 15.2KB/s 352.1KB/s 00:00
IMAG0050.jpg 100% 1164KB 581.9KB/s 992.0KB/s 00:02
IMAG0054.jpg 100% 2060KB 294.2KB/s 1.1MB/s 00:07
IMAG0067.jpg 100% 1792KB 358.4KB/s 1.1MB/s 00:05
IMAG0036.jpg 100% 1476KB 492.1KB/s 1.1MB/s 00:03
IMAG0030.jpg 60% 1472KB 7.3KB/s 0.0KB/s - stalled -

I'm not doing any TOS stuff.

Here's the mtr output:

jason@beast ~ $ sudo /usr/sbin/mtr --report xx.xx.xx.xx
HOST: beast                       Loss%  Snt   Last    Avg   Best   Wrst  StDev
1.|--  ???                        100.0   10    0.0    0.0    0.0    0.0    0.0
2.|--  99-3-168-2.lightspeed.rlg  60.0%   10   21.6   21.8   21.4   22.8    0.7
3.|--  99.134.77.24               40.0%   10   23.7   24.3   23.7   25.9    0.8
4.|--  99.134.77.14               90.0%   10   26.5   26.5   26.5   26.5    0.0
5.|--  ???                        100.0   10    0.0    0.0    0.0    0.0    0.0
6.|--  72.157.44.114               0.0%   10   23.3   23.2   22.9   23.4    0.1
7.|--  12.81.56.26                 0.0%   10   28.0   26.0   24.6   28.0    1.2
8.|--  12.81.56.13                 0.0%   10   61.7   28.6   24.7   61.7   11.6
9.|--  74.175.192.58               0.0%   10   23.4   26.1   23.2   37.3    5.1
10.|-- cr2.rlgnc.ip.att.net        0.0%   10   34.8   39.5   34.7   44.7    3.9
11.|-- cr1.wswdc.ip.att.net        0.0%   10   36.5   35.6   34.2   37.4    1.3
12.|-- wswdc03jt.ip.att.net        0.0%   10   32.6   41.2   32.3  118.5   27.1
13.|-- 192.205.34.246              0.0%   10   35.3   35.3   34.9   35.7    0.2
14.|-- if-2-2.tcore1.AEQ-Ashburn   0.0%   10   35.0   35.2   34.8   35.6    0.2
15.|-- 66.198.154.2                0.0%   10   34.2   34.4   34.2   34.8    0.2
16.|-- 107.14.19.132               0.0%   10   38.1   40.6   38.1   43.8    1.7
17.|-- 107.14.19.21                0.0%   10   43.8   43.6   43.1   44.3    0.3
18.|-- ae19.rlghnca-rtr1.nc.rr.c  20.0%   10   43.8   43.6   43.3   44.0    0.2
19.|-- gig14-1.rlghncj-ar42.nc.r  10.0%   10  405.8  128.1   55.8  405.8  124.3
20.|-- thehost.somewhere.com      10.0%   10  106.4   86.1   55.1  110.3   23.5
jason@beast ~ $


regards,
db
khayyam
Watchman
Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Thu Jun 28, 2012 2:00 am

dirtbag wrote:
2.|-- 99-3-168-2.lightspeed.rlg 60.0% 10 21.6 21.8 21.4 22.8 0.7
3.|-- 99.134.77.24 40.0% 10 23.7 24.3 23.7 25.9 0.8
[...]
18.|-- ae19.rlghnca-rtr1.nc.rr.c 20.0% 10 43.8 43.6 43.3 44.0 0.2
19.|-- gig14-1.rlghncj-ar42.nc.r 10.0% 10 405.8 128.1 55.8 405.8 124.3
20.|-- thehost.somewhere.com 10.0% 10 106.4 86.1 55.1 110.3 23.5

Well, that's 60% packet loss on the second hop, a further 40% on the third hop, and more by the time you reach your destination. Basically, the issue is with your network.

best ... khay
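
For anyone reading a report like the one quoted above, here is a minimal Python sketch that pulls the per-hop loss figures out of mtr --report text, so the last line (the loss measured to the destination itself) can be looked at next to the intermediate hops. It assumes the column layout pasted in this thread. The names parse_mtr_report and summarize are made up for illustration and are not part of mtr, and the sample is a subset of the lines posted above.

```python
import re

# One hop line of `mtr --report` output as pasted in this thread, e.g.
# "19.|-- gig14-1.rlghncj-ar42.nc.r 10.0% 10 405.8 128.1 55.8 405.8 124.3"
HOP_RE = re.compile(
    r"^\s*(?P<hop>\d+)\.\|--\s+(?P<host>\S+)\s+(?P<loss>[\d.]+)%?\s+(?P<sent>\d+)"
    r"\s+(?P<last>[\d.]+)\s+(?P<avg>[\d.]+)\s+(?P<best>[\d.]+)"
    r"\s+(?P<worst>[\d.]+)\s+(?P<stdev>[\d.]+)\s*$"
)

# Subset of the report posted in this thread (assumption: same layout elsewhere).
SAMPLE = """\
HOST: beast Loss% Snt Last Avg Best Wrst StDev
1.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
2.|-- 99-3-168-2.lightspeed.rlg 60.0% 10 21.6 21.8 21.4 22.8 0.7
3.|-- 99.134.77.24 40.0% 10 23.7 24.3 23.7 25.9 0.8
4.|-- 99.134.77.14 90.0% 10 26.5 26.5 26.5 26.5 0.0
5.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
6.|-- 72.157.44.114 0.0% 10 23.3 23.2 22.9 23.4 0.1
19.|-- gig14-1.rlghncj-ar42.nc.r 10.0% 10 405.8 128.1 55.8 405.8 124.3
20.|-- thehost.somewhere.com 10.0% 10 106.4 86.1 55.1 110.3 23.5
"""


def parse_mtr_report(text):
    """Return one dict per hop line; header and prompt lines are skipped."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops.append({
                "hop": int(m["hop"]),
                "host": m["host"],
                "loss": float(m["loss"]),
                "avg_ms": float(m["avg"]),
            })
    return hops


def summarize(hops):
    """Return (loss at the final hop, hops with partial loss, hops that never replied)."""
    final_loss = hops[-1]["loss"]
    partial = [h["hop"] for h in hops if 0 < h["loss"] < 100]
    silent = [h["hop"] for h in hops if h["host"] == "???"]
    return final_loss, partial, silent


if __name__ == "__main__":
    final_loss, partial, silent = summarize(parse_mtr_report(SAMPLE))
    print("loss to destination:", final_loss, "%")
    print("hops with partial loss:", partial)
    print("hops that never replied:", silent)
```

The hops shown as ??? are simply hops that returned no replies; the sketch lists them separately and makes no attempt to decide whether they are the same machine.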
khayyam
Watchman
Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Thu Jun 28, 2012 2:10 am

dirtbag wrote:
1.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
2.|-- 99-3-168-2.lightspeed.rlg 60.0% 10 21.6 21.8 21.4 22.8 0.7
3.|-- 99.134.77.24 40.0% 10 23.7 24.3 23.7 25.9 0.8
4.|-- 99.134.77.14 90.0% 10 26.5 26.5 26.5 26.5 0.0
5.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0

Actually, this doesn't look right at all ... your 5th hop seems to land you right back at the first (and notice the packet loss)

best ... khay
Hu
Moderator
Joined: 06 Mar 2007
Posts: 16043

Posted: Thu Jun 28, 2012 10:42 pm

khayyam wrote:
dirtbag wrote:
1.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
2.|-- 99-3-168-2.lightspeed.rlg 60.0% 10 21.6 21.8 21.4 22.8 0.7
3.|-- 99.134.77.24 40.0% 10 23.7 24.3 23.7 25.9 0.8
4.|-- 99.134.77.14 90.0% 10 26.5 26.5 26.5 26.5 0.0
5.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0

Actually, this doesn't look right at all ... your 5th hop seems to land you right back at the first (and notice the packet loss)
Not all unresponsive hosts are the same host. This output states that both the first and fifth hops failed to return enough information to describe them. It does not state that they are both the same unresponsive host.
khayyam
Watchman
Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Fri Jun 29, 2012 12:07 am

Hu wrote:
Not all unresponsive hosts are the same host. This output states that both the first and fifth hops failed to return enough information to describe them. It does not state that they are both the same unresponsive host.

Hu, you're absolutely correct. Nonetheless, the issue is with the network ...

best .. khay
dirtbag
Guru
Joined: 18 Feb 2003
Posts: 508
Location: NC

Posted: Sat Jul 07, 2012 6:05 pm

So I did some other tests and it seems fine to other sites, so I still think there's some funky TCP windowing problem somewhere.
If I rate limit the rsync with something like
rsync -av --bwlimit 170
it seems to work fine. If I watch a transfer (I'm sorry, I don't know the technical vernacular for what's really going on), something like the TCP window gets bigger as the transfer goes on, to better utilize the available bandwidth, but it grinds to a halt at some point for some reason. I assume it should automagically scale back, but that doesn't seem to be happening. So I do have a workaround for the moment. I have tried the same transfer of the same directory to another host on the internet and there was no stoppage of the transfer.
:?

-db
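
The rate-limit workaround described above could be wrapped in a small helper, sketched here in Python. This is only an illustration under assumptions: the function name build_rsync_command, the paths, and the remote host are hypothetical, and --partial and --timeout are extra rsync options not used in the post (they keep partially transferred files and abort after a period with no I/O, which may make a stall easier to spot and resume from). Only -av and the 170 limit come from the thread.

```python
import shlex


def build_rsync_command(src, dest, bwlimit_kbps=170, timeout_s=60, use_ssh=True):
    """Build an rsync argument list with a bandwidth cap.

    bwlimit_kbps mirrors the value from the post; timeout_s and --partial
    are additions made for this sketch, not something the poster used.
    """
    cmd = [
        "rsync",
        "-av",
        "--partial",                   # keep partially transferred files
        f"--bwlimit={bwlimit_kbps}",   # cap the transfer rate
        f"--timeout={timeout_s}",      # give up after this many seconds without I/O
    ]
    if use_ssh:
        cmd += ["-e", "ssh"]
    cmd += [src, dest]
    return cmd


if __name__ == "__main__":
    # Hypothetical source and destination, shown only to print the command line.
    cmd = build_rsync_command("pictures/", "user@remote.example:pictures/")
    print(shlex.join(cmd))
```

The list form can be handed to subprocess.run(cmd) unchanged; printing it first makes it easy to check what would be executed.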
khayyam
Watchman
Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Sat Jul 07, 2012 6:59 pm

dirtbag ...

The problem with such tests is that you can't be sure to isolate the problem, but the mtr/traceroute clearly shows that packet loss is at issue. You may be seeing this due to CRC checking at the hardware/MAC level (meaning it's localised) or due to network configuration and/or bad cabling ... however, each of these would affect the entire TCP/IP layer, though not necessarily consistently.

As for the effect of "stepping", this is how TCP works: it steps down on errors; however, "scaling back" can only go so far. If you limit bandwidth then this will have the effect of producing fewer packets, and so fewer to potentially lose, but it may also not create the conditions for the problem to arise.

So, unless you can reproduce the same with a separate NIC, cable, and router, I don't think you can isolate it to rsync ... in fact the mtr seems to rule that out. It may trigger the issue, but I think it's not the cause.

best ... khay
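
To put a rough number on how much packet loss alone can hold back a TCP transfer, here is a small Python sketch of the Mathis et al. approximation, a commonly quoted rule of thumb: throughput is at most (MSS / RTT) * (C / sqrt(p)), with C about sqrt(3/2). The inputs are assumptions for illustration only: an MSS of 1460 bytes (typical for Ethernet), and the 86.1 ms average and 10% loss from the last line of the mtr report in this thread. Ten probes is a very small sample, and the model ignores timeouts, so treat the result as an order-of-magnitude estimate and not a measurement.

```python
import math


def mathis_throughput_bytes_per_s(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Approximate upper bound on steady-state TCP throughput under random loss.

    Implements rate <= (MSS / RTT) * (C / sqrt(p)).
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return (mss_bytes / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Assumed inputs: MSS 1460 bytes; RTT and loss from the final mtr hop.
    estimate = mathis_throughput_bytes_per_s(1460, 0.0861, 0.10)
    print(f"estimated ceiling: {estimate / 1000:.1f} kB/s")
```

Because the bound scales with 1 / sqrt(p), cutting the loss rate to a quarter only doubles the estimate, which is one way to see why even moderate loss hurts a long transfer so much.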