Gentoo Forums
What happened to "emerge --sync" (slooooow)?
curmudgeon
Veteran

Joined: 08 Aug 2003
Posts: 1741

Posted: Wed Jan 16, 2019 2:05 am    Post subject: What happened to "emerge --sync" (slooooow)?

I did search for this, and it seems strange that I can't find anything. Up until a few weeks ago, whenever I did "emerge --sync", the machine would quickly get started. I sync just one machine to the gentoo infrastructure, and all of the other machines to the local repository on that machine. The other machines would frequently sync completely in less than ten seconds.

Now whenever I "emerge --sync" (even on the other machines), the initial few lines pop up right away (looks like a sync of the repository timestamp to see if there is a need to sync the rest of the tree), and then there is a pause of about a minute before the list of files updated begins.

Unless there are a lot of new under-the-surface complexities, it appears that this is a deliberate pause (to combat network abuse?). Can anyone explain this, and is there some way to lessen or remove it (particularly on the local machines)?


Last edited by curmudgeon on Thu Jan 24, 2019 10:09 pm; edited 2 times in total
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Wed Jan 16, 2019 3:22 am

You can stop updating the gpg key by adding
Code:
sync-rsync-verify-metamanifest = no

in your repos.conf ... if that's what it's stopping on. This of course will no longer let portage check the authenticity of the portage tree (which should be OK if you checked your main copy).

I should submit a different topic, but I find that if I try to sync over my VPN, it will fail to ever get the updated key for gpg... and emerge --sync always fails because it can't get the key...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
curmudgeon
Veteran

Joined: 08 Aug 2003
Posts: 1741

Posted: Wed Jan 16, 2019 4:50 am

Thanks for the explanation. I guess I can leave that on the machine that syncs to the Gentoo infrastructure, but it is complete overkill for the internal network (if the repository on the internal server gets altered after the sync to the outside world, I have bigger problems than the check was designed to detect).
curmudgeon
Veteran

Joined: 08 Aug 2003
Posts: 1741

Posted: Thu Jan 24, 2019 10:09 pm

Sorry, that is not the problem. I added that line to /etc/portage/repos.conf/gentoo.conf, and it made no difference. One of the machines just paused 150 seconds between the first part of the sync ("127 bytes transferred") and the second part.

Do you have any other possible cause?
mike155
Advocate

Joined: 17 Sep 2010
Posts: 4438
Location: Frankfurt, Germany

Posted: Thu Jan 24, 2019 10:46 pm

Below is the execution of 'emerge --sync' on my NFS server as well as on my NFS client:

Code:
+--------------------------------------+-------------+-------------+
|                                      |  NFS server |  NFS client |
+--------------------------------------+-------------+-------------+
| execution time of 'emerge --sync'    |  13 seconds | 598 seconds |
+--------------------------------------+-------------+-------------+

Why is 'emerge --sync' so much slower on my NFS client than on my NFS server??? I analyzed 'emerge --sync' and saw that 'emerge --sync' calls 'rsync' to create and fill a directory '/usr/portage/.tmp-unverified-download-quarantine':

Code:
rsync -a \
    --link-dest /usr/portage \
    --exclude=/distfiles \
    --exclude=/local \
    --exclude=/lost+found \
    --exclude=/packages \
    --exclude /.tmp-unverified-download-quarantine \
    /usr/portage/ \
    /usr/portage/.tmp-unverified-download-quarantine/

That rsync call takes 3 seconds on my NFS server, but more than 5 minutes (!) on my NFS client. Below is a table with my observations / measurements:

Code:
+------------------------------------------+--------------+--------------+
|                                          |   NFS server |   NFS client |
+------------------------------------------+--------------+--------------+
| Execution time of rsync statement        |    3 seconds |  344 seconds |
| .tmp-unverified-download-quarantine      |              |              |
| - size                                   |       111 MB |       111 MB |
| - directories                            |       27,000 |       27,000 |
| - files (hard links)                     |      136,000 |      136,000 |
| Execution time of rm .tmp-unv...         |  1.5 seconds |  113 seconds |
+------------------------------------------+--------------+--------------+

'emerge --sync' wastes nearly 8 minutes (!) on my NFS client with creation and removal of .tmp-unverified-download-quarantine. That's annoying.
Furthermore, 'emerge --sync' writes more than 110MB to my SSD just to create and fill a temporary directory - every time I run it. I really don't like that.
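
For anyone who wants to check whether their own filesystem shows the same behaviour without touching the real tree, here is a minimal shell sketch. It builds a small scratch tree, makes a hard-link copy of it and removes the copy again, timing both steps. The helper names `build_tree` and `time_quarantine` are made up for this example, and `cp -al` (GNU coreutils assumed) is only a stand-in for the `rsync -a --link-dest` call shown above, so the numbers are indicative at best.

```shell
#!/bin/sh
# Sketch only: build_tree and time_quarantine are hypothetical helper names.

# Create N directories with 5 small files each under ROOT.
# Usage: build_tree ROOT N
build_tree() {
    root=$1
    n=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        mkdir -p "$root/cat$i"
        for f in 1 2 3 4 5; do
            echo "data $i $f" > "$root/cat$i/file$f"
        done
        i=$((i + 1))
    done
}

# Hard-link copy SRC to DST, then remove DST, and print how many whole
# seconds each step took. DST must not exist yet and must be on the same
# filesystem as SRC, because hard links cannot cross filesystems.
# Usage: time_quarantine SRC DST
time_quarantine() {
    src=$1
    dst=$2
    t0=$(date +%s)
    cp -al "$src" "$dst"      # stand-in for rsync -a --link-dest
    t1=$(date +%s)
    rm -rf "$dst"
    t2=$(date +%s)
    echo "copy=$((t1 - t0)) remove=$((t2 - t1))"
}
```

Running `build_tree /some/nfs/path/src 5000` followed by `time_quarantine /some/nfs/path/src /some/nfs/path/q` (both paths are placeholders) once on the server and once on a client should show whether the per-file cost over NFS is what dominates.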
C5ace
Guru

Joined: 23 Dec 2013
Posts: 472
Location: Brisbane, Australia

Posted: Fri Jan 25, 2019 10:45 am

My local server runs 'emerge --sync' once a day as a cron job.

To upgrade, my local clients mount the server's '/usr/portage' at their own '/usr/portage' over NFS, then run 'emerge -av --update --deep --with-bdeps=y --newuse --backtrack=300 --changed-deps=y --keep-going=y @world'. When that finishes without error, I run emerge --depclean and revdep-rebuild.

This works 99% of the time if there are no USE changes, blockers or kernel upgrades.
_________________
Observation after 30 years working with computers:
All software has known and unknown bugs and vulnerabilities. Especially software written in complex, unstable and object oriented languages such as perl, python, C++, C#, Rust and the likes.
dmpogo
Advocate

Joined: 02 Sep 2004
Posts: 3267
Location: Canada

Posted: Sat Jan 26, 2019 4:57 am    Post subject: Re: What happened to "emerge --sync" (slooooow)?

curmudgeon wrote:
I did search for this, and it seems strange that I can't find anything. Up until a few weeks ago, whenever I did "emerge --sync", the machine would quickly get started. I sync just one machine to the gentoo infrastructure, and all of the other machines to the local repository on that machine. The other machines would frequently sync completely in less than ten seconds.

Now whenever I "emerge --sync" (even on the other machines), the initial few lines pop up right away (looks like a sync of the repository timestamp to see if there is a need to sync the rest of the tree), and then there is a pause of about a minute before the list of files updated begins.

Unless there are a lot of new under-the-surface complexities, it appears that this is a deliberate pause (to combat network abuse?). Can anyone explain this, and is there some way to lessen or remove it (particularly on the local machines)?


I get the same symptoms on one of my three machines. In my case, though, it also hangs for a minute or more at the end of the sync process, after it has already printed the amount of data transferred. What I noticed about that second hang is that the process being waited on (without using much CPU) is 'rm'.
One difference between this machine and the other two is that it still uses a reiserfs v3 filesystem on the partition where all of portage resides, and that partition is on software RAID 1.
The hangs started a month or so ago, when I also updated the kernel (though it was a minor update within the 4.4 family). So in my head I am blaming disk operations, but I have not yet investigated it further.
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Sat Jan 26, 2019 5:01 pm

Yeah, this is not the healthiest thing to do for SSDs, but it's about the best that can be done to have consistent portage trees that are signed.

When signature checks are enabled, rsync apparently copies into a separate directory '/usr/portage/.tmp-unverified-download-quarantine' which is a complete copy of the portage tree plus the new stuff that's being rsynced in. The signature is checked on that directory before it gets copied back into the main directory. It's the typical copy then move back scheme so that things are kept consistent, except it uses a lot more disk space and needs to clean up the second copy. The second copy was kind of screwy, once there was some corruption in that quarantine and emerge --sync would constantly fail for me until I manually deleted the quarantine.

In any case, need to figure out what's stopping the sync, I don't know...
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Sat Jan 26, 2019 6:08 pm

eccerr0r wrote:
Yeah, this is not the healthiest thing to do for SSDs, but it's about the best that can be done to have consistent portage trees that are signed.

When signature checks are enabled, rsync apparently copies into a separate directory '/usr/portage/.tmp-unverified-download-quarantine' which is a complete copy of the portage tree plus the new stuff that's being rsynced in. The signature is checked on that directory before it gets copied back into the main directory. It's the typical copy then move back scheme so that things are kept consistent, except it uses a lot more disk space and needs to clean up the second copy. The second copy was kind of screwy, once there was some corruption in that quarantine and emerge --sync would constantly fail for me until I manually deleted the quarantine.

In any case, need to figure out what's stopping the sync, I don't know...


To get rid of the .tmp* stuff this is what I added
Code:
cat /etc/portage/repos.conf/gentoo.conf
[DEFAULT]
main-repo = gentoo
sync-allow-hardlinks = no


The sync-allow-hardlinks is the key.
Yes it might allow an unverified file through (usually caused by syncing while the sending end hasn't gotten a complete file yet) but that gets fixed with a later sync AND you don't wind up with a .tmp* directory and associated copying that might hang around for days/weeks.
I was getting a verification error on the metadata stuff, but I finally changed the time I synced (cron job) by 15 minutes and haven't had a problem in a while.
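
Since an option only helps if it actually ends up in the file Portage reads, it can be worth checking what a given repos.conf file sets before waiting on another sync. The following is a minimal shell sketch, not a Portage tool: `ini_get` is a hypothetical helper that only reports what is literally written in the named section and knows nothing about how Portage itself merges sections.

```shell
#!/bin/sh
# Sketch only: ini_get is a hypothetical helper, not part of Portage.

# Print the value of KEY inside [SECTION] of an INI-style FILE.
# Prints nothing if the key is not set in that section.
# Usage: ini_get FILE SECTION KEY
ini_get() {
    awk -v sec="$2" -v key="$3" '
        # Section header: remember its name without the brackets.
        /^[ \t]*\[/ {
            gsub(/^[ \t]*\[|\][ \t]*$/, "")
            cur = $0
            next
        }
        # key = value line inside the requested section.
        cur == sec {
            pos = index($0, "=")
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) print v
        }
    ' "$1"
}
```

With the file shown above, `ini_get /etc/portage/repos.conf/gentoo.conf DEFAULT sync-allow-hardlinks` would print `no`.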


Edit to add: this is what I used to see occasionally
Code:
>>> Starting rsync with rsync://91.186.30.235/gentoo-portage...
>>> Checking server timestamp ...
 * Manifest timestamp: 2018-09-19 06:08:41 UTC
 * Valid OpenPGP signature found:
 * - primary key: DCD05B71EAB94199527F44ACDB6B8C1F96D8BF6D
 * - subkey: E1D6ABB63BFCFB4BA02FDF1CEC590EEAC9189250
 * - timestamp: 2018-09-19 06:08:41 UTC
 * Verifying /usr/portage/.tmp-unverified-download-quarantine ...!!! Manifest verification failed:
Manifest mismatch for metadata/Manifest.gz
  __size__: expected: 1981, have: 1980


I do notice that even with the .tmp* stuff not activated, there's a pause during verification at the end of the sync but it lasts for less than a minute, in my case.
_________________
PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Sat Jan 26, 2019 7:10 pm

The last emerge --sync I did took roughly 45 seconds to do signature verification. This was on an SSD. I haven't checked my other boxes, specifically the HDD machines, for relative sync times...
dmpogo
Advocate

Joined: 02 Sep 2004
Posts: 3267
Location: Canada

Posted: Mon Jan 28, 2019 5:09 pm

So what is the conclusion? What is the proper (or best) way to disable the .tmp-unverified-download-quarantine game?
mike155
Advocate

Joined: 17 Sep 2010
Posts: 4438
Location: Frankfurt, Germany

Posted: Thu Jan 31, 2019 9:40 pm

Anon-E-moose wrote:
sync-allow-hardlinks = no

Thanks, Anon-E-moose!

That's exactly what I was looking for. The option above reduces execution time of 'emerge --sync' on my NFS client from 598 seconds down to 101 seconds. Pretty good!

Code:
+----------------------------------------+-------------+-------------+
|                                        |  NFS server |  NFS client |
+----------------------------------------+-------------+-------------+
| execution time of 'emerge --sync'      |  13 seconds | 598 seconds |
| - default options                      |             |             |
+----------------------------------------+-------------+-------------+
| execution time of 'emerge --sync'      |   7 seconds | 101 seconds |
| - sync-rsync-verify-metamanifest = no  |             |             |
| - sync-allow-hardlinks = no            |             |             |
+----------------------------------------+-------------+-------------+
curmudgeon
Veteran

Joined: 08 Aug 2003
Posts: 1741

Posted: Wed May 06, 2020 2:53 am

This has come back again after upgrading to portage-2.3.89-r3, and I have no idea why. I ran an strace, and the sync paused here for about 230 seconds (I also attached strace to the other process that had been started, and it showed nothing but "wait").

Code:

epoll_ctl(3, EPOLL_CTL_ADD, 8, {EPOLLIN, {u32=8, u64=140423955742728}}) = 0
close(7)                                = 0
getpid()                                = 28680
epoll_wait(3, >>> Syncing repository 'gentoo' into '/usr/portage'...
>>> Starting rsync with rsync://89.238.71.6/gentoo-portage...
>>> Checking server timestamp ...
Welcome to turnstone.gentoo.org / rsync.gentoo.org

Server Address : 89.238.71.6, 2a00:1828:a00d:ffff::6
Contact Name   : mirror-admin@gentoo.org
Hardware       : 16 x Intel(R) Xeon(R) CPU E5530 @ 2.40GHz, 24160MB RAM
Sponsor        : Manitu GmbH, St. Wendel, Germany

Please note: common gentoo-netiquette says you should not sync more
than once a day.  Users who abuse the rsync.gentoo.org rotation
may be added to a temporary ban list.

MOTD autogenerated by update-rsync-motd on Thu Apr  4 19:04:00 UTC 2019

receiving incremental file list
timestamp.chk
             32 100%   31.25kB/s    0:00:00 (xfr#1, to-chk=0/1)

Number of files: 1 (reg: 1)
Number of created files: 0
Number of deleted files: 0
Number of regular files transferred: 1
Total file size: 32 bytes
Total transferred file size: 32 bytes
Literal data: 32 bytes
Matched data: 0 bytes
File list size: 41
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 104
Total bytes received: 132

sent 104 bytes  received 132 bytes  42.91 bytes/sec
total size is 32  speedup is 0.14


The sync paused again for about a minute after outputting the statistics for the entire tree before finally exiting. This time, I don't see this on any other Gentoo machines.
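
One low-tech way to pin down exactly where pauses like these fall is to prefix every output line with the number of seconds elapsed since the start. The sketch below does that in plain shell; `stamp_lines` is a made-up name for this example, and because output can be buffered when it goes through a pipe, the timings should be read as approximate.

```shell
#!/bin/sh
# Sketch only: stamp_lines is a hypothetical helper name.

# Read lines from stdin and print each one prefixed with the number of
# whole seconds elapsed since stamp_lines started.
stamp_lines() {
    start=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%4ds  %s\n' "$((now - start))" "$line"
    done
}

# Example use (needs root, so it is left as a comment here):
#   emerge --sync 2>&1 | stamp_lines
```

A large jump in the prefix between two consecutive lines marks the step that is stalling, which narrows down what to strace.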
Juippisi
Developer

Joined: 30 Sep 2005
Posts: 724
Location: /home

Posted: Wed May 06, 2020 4:34 am

Switch to git.

Code:

$ cat /etc/portage/repos.conf/gentoo.conf
[DEFAULT]
main-repo = gentoo

[gentoo]
location = /var/db/repos/gentoo
sync-type = git
sync-uri = https://github.com/gentoo-mirror/gentoo
auto-sync = true
sync-depth=1
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Thu Feb 04, 2021 3:20 pm

Finally did an emerge --sync experiment to tell how much actually got written (with sync-allow-hardlinks = yes):
Code:
Number of files: 146,556 (reg: 120,047, dir: 26,509)
Number of created files: 3,393 (reg: 3,255, dir: 138)
Number of deleted files: 4,020 (reg: 3,878, dir: 142)
Number of regular files transferred: 32,633
Total file size: 207.61M bytes
Total transferred file size: 82.53M bytes
Literal data: 82.53M bytes
Matched data: 0 bytes
File list size: 3.83M
File list generation time: 0.057 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 664.13K
Total bytes received: 44.72M

sent 664.13K bytes  received 44.72M bytes  653.00K bytes/sec
total size is 207.61M  speedup is 4.57

ext4 reports a change in lifetime writes of 1760244K (about 1.7G) during the sync.
I'd say there's some write amplification here.
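
For anyone wanting to repeat the measurement, here is a minimal shell sketch built on the `lifetime_write_kbytes` counter that ext4 exposes under /sys/fs/ext4/. The helper names are made up for this example, and the device name `sda3` in the usage comment is a placeholder for whichever partition holds the tree.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Print the ext4 lifetime write counter (in KiB) for a block device
# name as it appears under /sys/fs/ext4/.
lifetime_kb() {
    cat "/sys/fs/ext4/$1/lifetime_write_kbytes"
}

# Print the difference between two counter readings in MiB
# (integer division, so the result is rounded down).
# Usage: delta_mib BEFORE_KB AFTER_KB
delta_mib() {
    echo $(( ($2 - $1) / 1024 ))
}

# Print the ratio of KiB written to disk over KiB received from the
# network, with one decimal place.
# Usage: amplification WRITTEN_KB RECEIVED_KB
amplification() {
    awk -v w="$1" -v r="$2" 'BEGIN { printf "%.1f\n", w / r }'
}

# Example use (sda3 is a placeholder):
#   before=$(lifetime_kb sda3)
#   emerge --sync
#   after=$(lifetime_kb sda3)
#   delta_mib "$before" "$after"
```

Taking the figures above and assuming rsync's "M" means 10^6 bytes, 1760244 KiB written against roughly 44.72 MB received works out to a ratio of about 40.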
halcon
l33t

Joined: 15 Dec 2019
Posts: 629

Posted: Thu Feb 04, 2021 3:49 pm

Hi Juippisi,

Juippisi wrote:
Code:
$ cat /etc/portage/repos.conf/gentoo.conf
[DEFAULT]
main-repo = gentoo

[gentoo]
location = /var/db/repos/gentoo
sync-type = git
sync-uri = https://github.com/gentoo-mirror/gentoo
auto-sync = true
sync-depth=1

Why not
Code:
sync-uri = https://gitweb.gentoo.org/repo/gentoo.git
? Should it work too? (I haven't switched to gentoo-git yet).

Possibly OT: I've been thinking recently about turning all my GitHub repositories into mirrors of repositories hosted on other sites, because GitHub's behaviour is sometimes unexpected. For example, I found that it blocks access to port 443 for machines with multiple IPs when they try to reach GitHub's port 443 (via command-line git) from a different IP than before - I wonder where it stores a cookie or something. It is not nice to be blocked from updating portage...

EDIT
Noticed only now that Juippisi's post is old.
_________________
A wife asks her husband, a programmer:
- Could you please go shopping for me and buy one carton of milk, and if they have eggs, get 6?
He comes back with 6 cartons of milk.
- Why did you buy 6 cartons of milk?
- They had eggs.
Juippisi
Developer

Joined: 30 Sep 2005
Posts: 724
Location: /home

Posted: Fri Feb 05, 2021 6:16 am

halcon wrote:

Why not
Code:
sync-uri = https://gitweb.gentoo.org/repo/gentoo.git
? Should it work too? (I haven't switched to gentoo-git yet).


That repository is not meant for syncing; it doesn't come with metadata, news files, etc. You can use the Gentoo-hosted Git sync mirror too, but that is:
Code:
https://gitweb.gentoo.org/repo/sync/gentoo.git

EDIT: and more specifically,
Code:
https://anongit.gentoo.org/git/repo/sync/gentoo.git


I use GitHub because it's faster than the Gentoo mirrors for me :) although with git the difference is barely noticeable.
halcon
l33t

Joined: 15 Dec 2019
Posts: 629

Posted: Fri Feb 05, 2021 2:03 pm

Juippisi wrote:
Code:
https://anongit.gentoo.org/git/repo/sync/gentoo.git

Good, thank you!
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Fri Feb 05, 2021 9:40 pm

... And with sync-allow-hardlinks = no:
Code:
Number of files: 146,639 (reg: 120,132, dir: 26,507)
Number of created files: 8,231 (reg: 7,913, dir: 318)
Number of deleted files: 9,283 (reg: 9,007, dir: 276)
Number of regular files transferred: 46,600
Total file size: 207.90M bytes
Total transferred file size: 108.35M bytes
Literal data: 108.35M bytes
Matched data: 0 bytes
File list size: 4.43M
File list generation time: 0.045 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 920.42K
Total bytes received: 57.83M

... resulted in 436MB written according to ext4's lifetime writes ...

Yeah I think I will have to disallow hardlinks on my SSDs (to reduce wear) and probably most of my computers (for increased speed as mechanical hard drives are slow to build the quarantine too). Or is the preferred solution to use git?
All times are GMT