Gentoo Forums
Gentoo Forums
Gentoo Forums
Quick Search: in
emerge --sync fails almost always
View unanswered posts
View posts from last 24 hours

 
Reply to topic    Gentoo Forums Forum Index Portage & Programming
View previous topic :: View next topic  
Author Message
roseZ
n00b
n00b


Joined: 17 Dec 2005
Posts: 31
Location: Finland

PostPosted: Wed Apr 09, 2014 9:51 am    Post subject: emerge --sync fails almost always Reply with quote

I've had a lot of problems syncing the portage tree since Oct 1, 2013. I've been using rsync.europe.gentoo.org or rsync.se.gentoo.org (I live in Finland) without problems prior to that date. Checking emerge.log between Oct 1 and the last sync before that, I found no changes to my system that appear relevant to the issue (only a rebuild of a few packages that aren't even installed any more).
A typical failed sync attempt (with banners, filelists and similar retries abridged) looks like this:
Code:

# emerge --sync
>>> Synchronization of repository 'gentoo' located in '/usr/portage'...
>>> Starting rsync with rsync://88.156.222.90/gentoo-portage...

receiving incremental file list
timestamp.chk

Number of files: 1
Number of files transferred: 1
Total file size: 32 bytes
Total transferred file size: 32 bytes
Literal data: 32 bytes
Matched data: 0 bytes
File list size: 27
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 98
Total bytes received: 135

sent 98 bytes  received 135 bytes  93.20 bytes/sec
total size is 32  speedup is 0.14

receiving incremental file list
app-i18n/ibus-chewing/ChangeLog
app-i18n/ibus-chewing/Manifest

--- SNIP ---

dev-ruby/hoe/hoe-2.13.0-r1.ebuild
dev-ruby/hoe/hoe-3.9.0.ebuild
rsync error: timeout in data send/receive (code 30) at io.c(137) [sender=3.0.9]
rsync: connection unexpectedly closed (1610442 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [receiver=3.0.9]
rsync: connection unexpectedly closed (3185 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]
>>> Retrying...


>>> Starting retry 1 of 5 with rsync://81.91.242.10/gentoo-portage

--- SNIP ---

rsync error: timeout in data send/receive (code 30) at io.c(137) [sender=3.0.8]
rsync: read error: Connection reset by peer (104)
rsync error: error in rsync protocol data stream (code 12) at io.c(764) [receiver=3.0.9]
rsync: connection unexpectedly closed (2909 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]
>>> Retrying...


>>> Starting retry 2 of 5 with rsync://137.226.34.42/gentoo-portage
*******************************************************
=======================================================
Welcome to the ftp.halifax.rwth-aachen.de rsync mirror
=======================================================

Please don't use compression (-z) or checksums (-c).
Both features are disabled!

=======================================================
=======================================================

receiving incremental file list
timestamp.chk

--- SNIP ---

receiving incremental file list
rsync error: timeout in data send/receive (code 30) at io.c(137) [sender=3.0.9]
rsync: read error: Connection reset by peer (104)
rsync error: error in rsync protocol data stream (code 12) at io.c(764) [receiver=3.0.9]
rsync: connection unexpectedly closed (3200 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]
>>> Retrying...


>>> Starting retry 3 of 5 with rsync://137.226.34.228/gentoo-portage
Welcome to rsync.informatik.rwth-aachen.de (137.226.34.228).
This server is part of the SunSITE Central Europe and is located
in Aachen, Germany (http://sunsite.informatik.rwth-aachen.de).

--- SNIP ---

receiving incremental file list
dev-ruby/listen/ChangeLog
dev-ruby/listen/Manifest

--- SNIP ---

dev-ruby/ntlm-http/ntlm-http-0.1.1-r1.ebuild
dev-ruby/ntlm-http/ntlm-http-0.1.1.ebuild
rsync error: timeout in data send/receive (code 30) at io.c(239) [sender=3.0.3]
rsync: read error: Connection reset by peer (104)
rsync error: error in rsync protocol data stream (code 12) at io.c(764) [receiver=3.0.9]
rsync: connection unexpectedly closed (2496 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]
>>> Retrying...


>>> Starting retry 4 of 5 with rsync://83.164.133.198/gentoo-portage

--- SNIP ---

receiving incremental file list
rsync error: timeout in data send/receive (code 30) at io.c(137) [sender=3.0.7]
rsync: read error: Connection reset by peer (104)
rsync error: error in rsync protocol data stream (code 12) at io.c(764) [receiver=3.0.9]
rsync: connection unexpectedly closed (3210 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]
>>> Retrying...


>>> Starting retry 5 of 5 with rsync://193.219.184.226/gentoo-portage
timed out
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(549) [Receiver=3.0.9]
>>> Exceeded PORTAGE_RSYNC_RETRIES: 5

My internet connection works fine, as far as I'm aware: no frequent timeouts or disconnects; downloads, bittorrent, etc. all work fine.

In order to troubleshoot the issue, I wrote a script that extracts some sync statistics from emerge.log. Here's what I have from the time period Jun 2010 - Oct 2013:
Code:

$ python ~/syncstats.py <(sed -n '1,/^1380592/p' /var/log/emerge.log)

    Sync server Syncs Fails Rate     Avg dur. Max dur. Min dur. Hostname           
--------------------------------------------------------------------------------
  88.156.222.90     6     0 100.0 %    96.5 s    194 s     31 s ftp.vectranet.pl   
  212.24.172.37     6     0 100.0 %    77.8 s    167 s     37 s rsync1.hu.gentoo.org
  137.226.34.42     6     0 100.0 %    56.2 s    116 s     35 s ftp.halifax.rwth-aachen.de
  193.136.164.6     7     0 100.0 %   116.0 s    482 s     42 s ftp.rnl.ist.utl.pt 
 212.89.225.155     3     0 100.0 %    45.7 s     57 s     39 s tux.rainside.sk     
 83.164.133.198    12     0 100.0 %    48.9 s     80 s     36 s gentoo.lagis.at     
 137.226.34.228     8     0 100.0 %    90.1 s    145 s     48 s rsync.informatik.rwth-aachen.de
 212.227.96.139     4     0 100.0 %   106.2 s    140 s     33 s s15736424.onlinehome-server.info
   194.62.23.99     7     0 100.0 %    59.4 s     96 s     42 s mirror.opteamax.de 
   81.91.242.10     5     0 100.0 %    43.2 s     51 s     36 s gig0.mirror.ln1.qubenet.net
  92.240.244.16     2     0 100.0 %    40.0 s     41 s     39 s phoenix.wheel.sk   
  178.63.25.207     8     0 100.0 %    52.5 s     99 s     32 s static.207.25.63.178.clients.your-server.de
212.110.161.109     7     0 100.0 %    43.3 s     55 s     36 s 212-110-161-109.no-reverse-dns-set.bytemark.co.uk
 193.219.32.205     5     0 100.0 %   163.4 s    549 s     48 s kmynas.elen.ktu.lt 
 91.193.124.136     1     0 100.0 %    37.0 s     37 s     37 s <none>             
 88.198.224.205    13     0 100.0 %    42.8 s     55 s     34 s www1.stealer.net   
 212.110.161.69     1     0 100.0 %    58.0 s     58 s     58 s mirror.bytemark.co.uk
 87.106.218.149     2     0 100.0 %    45.0 s     48 s     42 s s15423754.onlinehome-server.info
  88.198.69.178     5     0 100.0 %    66.8 s    135 s     33 s hosting1.amiconsult.de
    212.37.1.30   124     3  97.6 %   187.9 s    619 s     23 s mirror2.dataphone.se
   193.1.193.64     9     1  90.0 %  2150.7 s   6418 s    690 s ftp.heanet.ie       
   89.238.64.78     9     1  90.0 %    47.4 s    107 s     31 s banana.gentoo-ev.org
 188.165.231.20     8     1  88.9 %    43.6 s     57 s     24 s gentoo64.net       
  81.223.20.162     8     1  88.9 %    93.2 s    409 s     38 s gentoo.inode.at     
  88.191.100.68     6     1  85.7 %    68.8 s    106 s     35 s 88-191-100-68.rev.poneytelecom.eu
 194.146.132.15     4     1  80.0 %    66.0 s     84 s     52 s gentoo.iteam.ua     
 89.206.169.171     8     2  80.0 %    59.0 s    107 s     34 s rsync.star.net.uk   
147.175.167.205     2     2  50.0 %    46.5 s     52 s     41 s <none>             
  178.210.128.3     1     1  50.0 %    42.0 s     42 s     42 s www.telcom.net.ua   
 130.230.54.100     0     2   0.0 %     nan s      0 s      0 s trumpetti.ip4.atm.tut.fi
193.219.184.226     0     1   0.0 %     nan s      0 s      0 s <none>             

On the other hand, this shows what's been happening since the problems started (Oct 2013 - today):
Code:

$ python ~/syncstats.py <(sed -n '/^1379995624/,$p' /var/log/emerge.log)

    Sync server Syncs Fails Rate     Avg dur. Max dur. Min dur. Hostname           
--------------------------------------------------------------------------------
  212.24.172.37    17     1  94.4 %    64.9 s     90 s     36 s rsync1.hu.gentoo.org
    212.37.1.30     1     3  25.0 %   161.0 s    161 s    161 s mirror2.dataphone.se
 88.198.224.205     2    10  16.7 %   556.0 s    797 s    315 s www1.stealer.net   
   89.238.64.78     1     9  10.0 %    59.0 s     59 s     59 s banana.gentoo-ev.org
  88.198.69.178     1    13   7.1 %    26.0 s     26 s     26 s hosting1.amiconsult.de
 137.226.34.228     0    10   0.0 %     nan s      0 s      0 s rsync.informatik.rwth-aachen.de
 91.193.124.136     0    11   0.0 %     nan s      0 s      0 s <none>             
  178.210.128.3     0    12   0.0 %     nan s      0 s      0 s www.telcom.net.ua   
  88.191.100.68     0     6   0.0 %     nan s      0 s      0 s 88-191-100-68.rev.poneytelecom.eu
  137.226.34.42     0    12   0.0 %     nan s      0 s      0 s ftp.halifax.rwth-aachen.de
  193.136.164.6     0    11   0.0 %     nan s      0 s      0 s ftp.rnl.ist.utl.pt 
 188.165.231.20     0    14   0.0 %     nan s      0 s      0 s gentoo64.net       
 212.89.225.155     0     9   0.0 %     nan s      0 s      0 s tux.rainside.sk     
 83.164.133.198     0    13   0.0 %     nan s      0 s      0 s gentoo.lagis.at     
 89.206.169.171     0    10   0.0 %     nan s      0 s      0 s rsync.star.net.uk   
   194.62.23.99     0    11   0.0 %     nan s      0 s      0 s mirror.opteamax.de 
 130.230.54.100     0     3   0.0 %     nan s      0 s      0 s trumpetti.ip4.atm.tut.fi
  88.156.222.90     0     8   0.0 %     nan s      0 s      0 s ftp.vectranet.pl   
 194.146.132.15     0    11   0.0 %     nan s      0 s      0 s gentoo.iteam.ua     
  81.223.20.162     0    13   0.0 %     nan s      0 s      0 s gentoo.inode.at     
   144.76.18.87     0    10   0.0 %     nan s      0 s      0 s skade.schwarzvogel.de
 212.110.161.69     0     9   0.0 %     nan s      0 s      0 s mirror.bytemark.co.uk
 87.106.218.149     0    14   0.0 %     nan s      0 s      0 s s15423754.onlinehome-server.info
   81.91.242.10     0    10   0.0 %     nan s      0 s      0 s gig0.mirror.ln1.qubenet.net
193.219.184.226     0     9   0.0 %     nan s      0 s      0 s <none>             
   193.1.193.64     0     9   0.0 %     nan s      0 s      0 s ftp.heanet.ie       
  92.240.244.16     0     8   0.0 %     nan s      0 s      0 s phoenix.wheel.sk   

Looking at the success rates, you can see that it has gone from a pretty spotless record to only being able to sync reliably with one server (rsync1.hu.gentoo.org) during the past 6 months. By the way, those two listings combined contain the data for (practically) every sync I've attempted since 2010.

Now, I know that I can (and occasionally do) sync using
Code:

emerge-webrsync   # which is slow, or
SYNC="rsync1.hu.gentoo.org" emerge --sync

but I'd rather not do that since I understand that using the server pools like rsync.europe.gentoo.org is preferred to distribute the load.
Mostly, I'm writing to ask a couple of questions:

    1) What gives? Surely this situation is far from normal/expected?
    2) Has anyone else experienced similar behaviour?
    3) Did something change with the sync server infrastructure or practices around the end of September 2013?
    4) If the problem is at my end, any ideas on where to look or how to remedy the situation?

Any input is much appreciated.
Back to top
View user's profile Send private message
szatox
Advocate
Advocate


Joined: 27 Aug 2013
Posts: 3131

PostPosted: Wed Apr 09, 2014 10:51 am    Post subject: Reply with quote

perhaps switch to emerge-delta-webrsync would be a better idea then?
Back to top
View user's profile Send private message
roseZ
n00b
n00b


Joined: 17 Dec 2005
Posts: 31
Location: Finland

PostPosted: Wed Apr 09, 2014 11:24 am    Post subject: Reply with quote

Thanks, I wasn't aware of delta-webrsync script. I guess it would be an improvement over webrsync at least. I would like to stick to rsync, though.
Back to top
View user's profile Send private message
mv
Watchman
Watchman


Joined: 20 Apr 2005
Posts: 6747

PostPosted: Wed Apr 09, 2014 4:17 pm    Post subject: Reply with quote

Do you have some iptables/firewall scripts set up? Such things can happen if icmp is not properly handled. Maybe also that MTU is incorrect.
Back to top
View user's profile Send private message
roseZ
n00b
n00b


Joined: 17 Dec 2005
Posts: 31
Location: Finland

PostPosted: Thu Apr 10, 2014 5:01 am    Post subject: Reply with quote

I do have an iptables firewall up since at least 2012, and there's a built-in firewall in the router-ADSL modem combo from my ISP. Though I've made no changes to either around the time that problems started.

I'm not really privy to the inner workings of network protocols, but I do have an iptables rule (from some guide) intended to maintain "normal" net-facing operation of a host:
Code:

# iptables --line-numbers -v -L icmp-allow
Chain icmp-allow (1 references)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 ACCEPT     icmp --  any    any     anywhere             anywhere             icmp destination-unreachable
2        0     0 ACCEPT     icmp --  any    any     anywhere             anywhere             icmp source-quench
3        0     0 ACCEPT     icmp --  any    any     anywhere             anywhere             icmp time-exceeded
4        0     0 ACCEPT     icmp --  any    any     anywhere             anywhere             icmp parameter-problem
5        0     0 ACCEPT     icmp --  any    any     anywhere             anywhere             icmp echo-request
6        0     0 ACCEPT     icmp --  any    any     anywhere             anywhere             icmp echo-reply

That rule is applied at the end of the INPUT chain (policy: DROP), and OUTPUT chain is empty (policy: ALLOW). Judging by the zero counters, that rule doesn't see much action.

I tried replacing all DROP operations in the rules with LOG+DROP and then syncing, but it produced nothing out of the ordinary to the logs. Also tried with the firewall completely disabled and the router's built-in firewall bypassed as best I could (enabled NAT for ports 1:65535 to my machine), but the syncs failed just the same. I did see some new error messages, though:
Code:

# PORTAGE_RSYNC_RETRIES=2 emerge --sync
>>> Synchronization of repository 'gentoo' located in '/usr/portage'...
>>> Starting rsync with rsync://88.191.100.68/gentoo-portage...
timed out
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(549) [Receiver=3.0.9]
>>> Retrying...


>>> Starting retry 1 of 2 with rsync://88.198.224.205/gentoo-portage

--- SNIP ---

receiving incremental file list
app-pda/p3nfs/ChangeLog
app-pda/p3nfs/Manifest
app-pda/p3nfs/metadata.xml
metadata/timestamp
metadata/timestamp.chk
metadata/timestamp.x
metadata/dtd/timestamp.chk
metadata/glsa/timestamp.chk
rsync: read error: Connection timed out (110)
rsync error: error in rsync protocol data stream (code 12) at io.c(764) [receiver=3.0.9]
rsync: writefd_unbuffered failed to write 192 bytes to socket [generator]: Broken pipe (32)
rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [generator=3.0.9]
>>> Retrying...


>>> Starting retry 2 of 2 with rsync://137.226.34.228/gentoo-portage

--- SNIP ---

receiving incremental file list
metadata/timestamp.chk
[receiver] io timeout after 180 seconds -- exiting
rsync error: timeout in data send/receive (code 30) at io.c(140) [receiver=3.0.9]
rsync: connection unexpectedly closed (2572 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]
>>> Exceeded PORTAGE_RSYNC_RETRIES: 2

I can't really see a pattern here, except that often either the sender or receiver appears to gets bored of waiting and throws in the towel. But why?

I don't get how the maximum transmission unit size would affect my issue; shouldn't the packets be chopped up on the way as needed? Anyhow, the values seem pretty usual to me: the router MTU is 1460 and ifconfig reports that the MTU is set to 1500 for the network interface.
Back to top
View user's profile Send private message
Display posts from previous:   
Reply to topic    Gentoo Forums Forum Index Portage & Programming All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum